UBC Theses and Dissertations
Meta level tracking with stochastic grammar Wang, Alex Sheng-Yuan 2009

Meta Level Tracking with Stochastic Grammar

by

Alex Sheng-Yuan Wang

B.A.Sc., University of British Columbia, 2003
M.A.Sc., University of British Columbia, 2005

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Doctor of Philosophy in The Faculty of Graduate Studies (Electrical and Computer Engineering)

The University of British Columbia (Vancouver)

August 2009

© Alex Sheng-Yuan Wang 2009

Abstract

The ability to learn about a stochastic process from noisy observations is fundamental to many applications. In order to track a dynamic process, the typical knowledge representation required is a state space model such as a linear Gauss-Markov model, for which efficient algorithms exist to perform state estimation under many different model assumptions. For meta level tracking, however, we are interested not only in state estimation of the process, but also in classification of the process into a finite set of categories. In other words, in order to extract the semantics of the sequential data, the approach taken is to define a model for each category and determine the most likely one during tracking. The models currently applied to classifying sequential data are mainly Markov models, but they are not only restrictive in the patterns they can express, they often require a state space that grows exponentially with the length of the observation. The solution presented in the thesis is to apply a more expressive and general model than Markov models to characterize the sequential process: the prior knowledge of the sequential process is encoded as a declarative language (linguistic framework) using stochastic context free grammar (SCFG) methods. The objective of the thesis is to formulate a meta level tracking framework, introduce and analyze the use of SCFG as the knowledge representation model, and discuss the properties, applications, and algorithms involved.

The research on meta level tracking presented in the thesis is the result of two main projects: electronic support measures against a multifunction radar, and ground surveillance with GMTI (ground moving target indicator) radar. In the electronic support measure problem, the algorithm developed plays the role of a target being tracked by a radar, and its aim is to estimate the operation mode of the radar and maximize ownship safety. Ground surveillance is the reverse problem, where the algorithm runs in a radar and aims to learn the geometric patterns of ground moving targets' trajectories, and implicitly infer their intent. In both cases, the sequential process involved has hierarchical structure and long range dependency (for example, an arc spatial pattern in a ground moving target's trajectory), so Markov models are not sufficient for the characterization and representation of the process. SCFG, on the other hand, can compactly encode the prior knowledge as production rules, and has demonstrated strength in modeling the branching and self-embedding dependencies that are often seen in processes with hierarchical structure and multiple time scales. In the electronic support measure problem, a novel model called the Markov modulated SCFG is developed, and efficient algorithms are derived to perform both state and parameter estimation of the model. In the GMTI problem, a stochastic parser is modified to deal with GMTI data, and a detailed formal language analysis of several common two-dimensional spatial patterns is performed, with their corresponding stochastic grammars constructed. All the developed algorithms are implemented in C++ and their performance evaluated.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Preface
Co-authorship Statement

1 Introduction
  1.1 Summary of Contributions
    1.1.1 Syntactic Modeling of Multifunction Radars for ESM
    1.1.2 Syntactic Tracking and Ground Surveillance with GMTI Radar
    1.1.3 Scope of the Research
  1.2 Survey of Tracking Algorithms and Methodologies
    1.2.1 Numeric Tracking Algorithms
      1.2.1.1 Bayesian Trackers
      1.2.1.2 Numeric Tracking with Meta Information
    1.2.2 Symbolic Tracking Algorithms
      1.2.2.1 Logic Expert System
      1.2.2.2 Fuzzy Logic
      1.2.2.3 Bayesian Network
  1.3 Background of SCFG and Its Applications
    1.3.1 Stochastic Grammar and Its Statistical Properties
    1.3.2 Survey of Applications Using Grammar
    1.3.3 SCFG and Tracking
  1.4 Meta Level Tracking Formulation in SCFG
    1.4.1 Reasoning with Stochastic Grammar
    1.4.2 Why SCFG for Meta Level Tracking

Bibliography

2 Syntactic Modeling and Signal Processing of Multifunction Radars
  2.1 Elements of Syntactic Modeling
    2.1.0.1 Formal Languages
    2.1.0.2 Grammars
    2.1.0.3 Chomsky Hierarchy of Grammars
    2.1.0.4 Regular Languages and Finite State Automata
    2.1.0.5 Context-Free Languages and Context-Free Grammars
    2.1.0.6 Non-Self-Embedding Context-Free Grammars
    2.1.0.7 Stochastic Languages and Stochastic Grammars
    2.1.0.8 Stochastic Finite-State Languages, Markov Chains, and HMM
  2.2 Electronic Warfare Application - Electronic Support and MFR
    2.2.1 MFR Signal Model and Its System Architecture
    2.2.2 Inadequacy of HMM for Modeling MFR
    2.2.3 A Syntactic Approach to MFR
    2.2.4 A Syntactic Model for a MFR Called Mercury
      2.2.4.1 Phrase Scheduler
      2.2.4.2 Radar Controller and the Stochastic Channel
    2.2.5 MFR and System Manager - Markov Modulated SCFG
  2.3 Signal Processing in CFG Domain
    2.3.1 Overview of MFR Signal Processing at Two Layers of Abstraction
    2.3.2 Bayesian Estimation of MFR's State via Viterbi and Inside Algorithms
    2.3.3 MFR Parameter Estimation Using EM Algorithm
  2.4 Signal Processing in Finite-State Domain
    2.4.0.1 Context-Free Grammars and Production Graphs
    2.4.1 CFG-Based Finite-State Model Synthesis
      2.4.1.1 Verification of Non-Self-Embedding
      2.4.1.2 Grammatical Decomposition
      2.4.1.3 Synthesis of Finite-State Components
      2.4.1.4 Composition of the Finite-State Automaton
    2.4.2 State Machine Synthesis of the Mercury Radar Controller
    2.4.3 Finite-State Signal Processing
  2.5 Conclusion

Bibliography

3 Signal Interpretation of Multifunction Radars
  3.1 Electronic Support and MFR
    3.1.1 MFR System Architecture and Its Signal Generation Mechanism
  3.2 A Syntactic Representation of MFR Domain Knowledge
    3.2.1 Formal Languages and Transformational Grammars
    3.2.2 A Syntactic Model for a MFR Called Mercury
      3.2.2.1 Radar Manager
      3.2.2.2 Command Scheduler
      3.2.2.3 Radar Controller and the Stochastic Channel
    3.2.3 Well Posedness of the Model
  3.3 Statistical Signal Interpretation of the MFR Signal and Control
    3.3.1 MLE of MFR's State via Viterbi and Inside Algorithms
    3.3.2 Model Parameter Estimation Using EM Algorithm
    3.3.3 Optimization of Target-MFR Interaction Dynamics
  3.4 Numerical Studies of the Algorithms
    3.4.1 Implementation of the Software
    3.4.2 Model Complexity and Its Modeling Power
    3.4.3 Numerical Results of State and Parameter Estimation
    3.4.4 Numerical Results of Autonomous Selection of Maneuvering Models
  3.5 Conclusion

Bibliography

4 Syntactic Tracking and Ground Surveillance with GMTI Radar
  4.1 Introduction
    4.1.1 Literature Review
  4.2 Overview of GMTI Based Syntactic Tracking
    4.2.1 SCFG for Syntactic Target Tracking
    4.2.2 Syntactic Tracking Estimation Overview
    4.2.3 GMTI Based Syntactic Tracking Framework
  4.3 Syntactic Modeling of GMTI
    4.3.1 SCFG Modulated State Space Model
    4.3.2 Dynamics of Syntactic Motion Patterns
    4.3.3 Structural Analysis of the SCFG Model
      4.3.3.1 Arc Tracklet
      4.3.3.2 m-Rectangular Tracklet
      4.3.3.3 Well Posedness of the Model
  4.4 Syntactic Tracking Algorithms
    4.4.1 Syntactic Enhanced Tracker
      4.4.1.1 Multiple Model Sequential MCMC (Particle Filter)
      4.4.1.2 Extended Kalman Filter with IMM
    4.4.2 Extended Earley Stolcke Parsing of Target Trajectory
      4.4.2.1 Prediction
      4.4.2.2 Scanning
      4.4.2.3 Completion
  4.5 Experimental Setup and Results
    4.5.1 Experimental Setup
    4.5.2 Pre-Processing of Experiment Data
    4.5.3 Numerical Studies of Syntactic Tracking
    4.5.4 Performance of Syntactic Enhanced Tracker
  4.6 Conclusion

Bibliography

5 Conclusion and Future Work
  5.1 Overview of Thesis
  5.2 Discussion of Results
  5.3 Future Work
    5.3.1 Grammatical Knowledge Representation with Ontology
    5.3.2 Feedback Control with Meta Level Information
    5.3.3 Grammatical Inference with Genetic Programming

Bibliography

Appendices

A Functional Specification of the Mercury Emitter
  A.1 General Remarks
  A.2 Radar Words
  A.3 Time-Division Multiplexing - Phrases and Clauses
  A.4 Search-while-Track Scan
  A.5 Acquisition Scan
  A.6 Track Maintenance

B Justification of Logit Model

C GMTI STAP Background

D List of Publications

Bibliography

List of Tables

1.1 Listing of numeric processing state space models
2.1 Deterministic grammars, production rules, and languages
2.2 Mercury emitter phrase structure
2.3 Semi-ring operations of sum and product
3.1 List of MERCURY radar commands and their corresponding radar words
3.2 List of target's motion models
3.3 Production rules of Mercury's command scheduler
3.4 Production rules of Mercury's radar controller
3.5 The source and estimated parameter values of the Markov modulated SCFG
4.1 Simplified example demonstrating the Earley Stolcke parsing algorithm
4.2 Radar parameters of the DRDC XWEAR system used in data collection

List of Figures

1.1 Markov vs SCFG dependency
1.2 Abstract model of a multifunction radar
1.3 Self protection control strategy for evasive maneuvering mode selection
1.4 System framework for meta level tracking with GMTI radar
1.5 JDL data fusion model
1.6 An example of hierarchical hidden Markov model
1.7 A rule based system
1.8 The Chomsky hierarchy of formal languages
1.9 System framework of the meta level tracking
2.1 The Chomsky hierarchy of formal languages
2.2 Example of a finite-state automaton
2.3 Derivation sequence for stochastic grammars
2.4 Example of a Markov chain
2.5 Example of a HMM
2.6 The electronic warfare framework
2.7 Decomposition of MFR signal as radar words
2.8 MFR system architecture
2.9 Comparison of regular and context free grammar
2.10 Radar scheduling process as a grammatical derivation process
2.11 High level functionality of the Mercury emitter
2.12 Production rules of Mercury's phrase scheduler
2.13 Weighted grammar of the Mercury emitter
2.14 Decomposition of radar signal processing
2.15 Inside and outside probabilities in SCFG
2.16 Production graph
2.17 Strongly-connected components of the production graph
2.18 Components of the finite-state automaton
2.19 Synthesis of the finite-state automaton
2.20 Production graph of the Mercury grammar
2.21 Mercury state machine components (Part 1)
2.22 Mercury state machine components (Part 2)
3.1 Electronic support framework
3.2 Decomposition of radar signal as radar words
3.3 MFR system architecture
3.4 Radar scheduling process as a grammatical derivation process
3.5 MFR states and transition probabilities
3.6 Sample realization of MFR control process
3.7 Inside and outside probabilities in SCFG
3.8 Selection of maneuvering mode with stochastic approximation
3.9 Likelihood values of parameter estimation algorithm
3.10 Trajectory optimization simulation
3.11 Simulated sample path of maneuvering modes
3.12 Empirical distribution of the occupancies in the four maneuvering models
4.1 Chomsky hierarchy of formal languages
4.2 The battalion formations
4.3 Syntactic analysis of estimated tracks
4.4 Intent inference framework with stochastic parsing
4.5 An example Earley parser state
4.6 SAR image captured by DRDC XWEAR system
4.7 Relationship between the ECEF and Target Coordinate System
4.8 The output of the IMM/Extended Kalman filter for GMTI tracking
4.9 Likelihood probabilities of geometric patterns from ground target tracking
4.10 The trajectories of a pincer operation
4.11 Numerical results of parsing a square pattern
4.12 Tracking covariance reduction by feeding back meta level description
A.1 Mercury output sequence

Preface

The thesis is presented in manuscript format, where each chapter in the body of the thesis is a stand-alone manuscript paper with its own notation and references, i.e. each chapter has its own introduction, literature review, and research results. The thesis consists of five chapters:

• Chapter 1 is the introduction of the thesis. The chapter summarizes the contributions of the thesis, provides surveys of the relevant algorithms, and discusses the system framework that is applied throughout the thesis.

• Chapter 2 discusses the modeling aspect of meta level tracking applied to electronic support measures against a multifunction radar. A detailed syntactic model of a multifunction radar is introduced, where important radar components are identified and abstractly modeled as production rules of a stochastic grammar. The content of the chapter is based on the paper "Syntactic Modeling and Signal Processing of Multifunction Radars: A Stochastic Context Free Grammar Approach" referenced below.
• Chapter 3 extends the work in Chapter 2 and considers the control aspect of meta level tracking for electronic support measures. The multifunction radar's model elements are parameterized by the aircraft's control parameters, which models the effect of the aircraft's maneuvers on the radar's ability to detect and maintain tracks. Based on the parameterization, optimal selection of the evasive maneuver strategy is studied. The research results presented in the chapter are based on the paper "Threat Estimation of Multifunction Radars: Modeling and Statistical Signal Processing of Stochastic Context Free Grammars".

• Chapter 4 applies meta level tracking to ground surveillance with GMTI (ground moving target indicator) radar. The research focuses on real time interpretation of two-dimensional spatial patterns based on Bayesian track estimates. The interpretation is implemented as a combined tracker consisting of a stochastic parser and a Bayesian tracker such as the interacting multiple model. Various geometric patterns such as squares and arcs are analyzed in formal language theory, and corresponding grammars are constructed. The chapter content is based on the paper "Syntactic Tracking and Ground Surveillance with GMTI Radar".

• Discussion of results, proposals for future research, and conclusions are presented in Chapter 5.

Co-authorship Statement

The thesis is prepared in manuscript format, and its content consists of three journal papers. The author's contribution to the research presented in the papers is described here.

"Syntactic Modeling and Signal Processing of Multifunction Radars: A Stochastic Context-Free Grammar Approach" is research based on collaboration with Dr. Visnevski from McMaster University and Dr. Krishnamurthy. Dr. Visnevski is credited with the SCFG formulation based on an unclassified radar intelligence report, and with the development of algorithms that approximate an SCFG with a finite state Markov model (Sections 2.1 and 2.4 of the paper).
The author's contribution to the paper is the extension of the SCFG model to include more general radar management operations, the identification of key radar model components, and the development of a novel Markov modulated SCFG model together with the derivation of its corresponding state and parameter estimation algorithms (Sections 2.2 and 2.3 of the paper). The author implemented the statistical signal processing algorithms in C++ and thoroughly tested them.

"Signal Interpretation of Multifunction Radars: Modeling and Statistical Signal Processing with Stochastic Context Free Grammar" is research based on collaboration with Dr. Krishnamurthy. The work builds on the model developed in the previous paper, and the novelty is the extension of the model to enable aircraft control based on the state estimates of the multifunction radar. The Markov chain in the Markov modulated SCFG is parameterized by a logit distribution, where the parameters relate the aircraft's evasive strategy to the MFR's operation mode, and the syntactic model of the MFR is extended accordingly. The stochastic approximation algorithm applied to the selection of the optimal evasive strategy is based on Dr. Krishnamurthy's previous work. The author implemented the signal processing algorithms in C++ and evaluated their performance.

"Syntactic Tracking and Ground Surveillance with GMTI Radar" is research based on collaboration with Dr. Krishnamurthy and Dr. Balaji from DRDC (Defense Research & Development Canada). The work contains new insights on a novel application of stochastic grammar to ground surveillance with GMTI (ground moving target indicator) radar. Dr. Balaji is credited with all the GMTI detection processing algorithms and the flight trials performed for data collection.
The author's contribution to the paper includes the syntactic modeling of the ground moving targets' trajectories, the structural analysis of the model, the generalization of a parsing algorithm to process GMTI data, and the formulation of tracking algorithms to filter GMTI detections. The author implemented the tracking algorithms and the stochastic parsing algorithms in C++, and tested the algorithms with real data.

Chapter 1

Introduction

Interest in tracking dates back to Kepler, when people were curious about the movement of heavenly bodies [43]. Tracking is the state estimation of stochastic dynamic processes based on noisy observations, and it is relevant in various application domains: for example, the tracking of planets in astronomy, the tracking of aircraft in defense, or the tracking of option prices in financial forecasting. A fundamental problem in the tracking formulation concerns the model representation of the unknown process. Early approaches modeled tracking as a function mapping observables to system states, and applied least squares parameter estimation to determine the mapping [70]. As tracking evolved, state estimation of nonstationary time series became important, and the Kalman filter was developed and became widely used [3]. For state space tracking in general, the representation of a dynamic process is typically expressed as

    x_{k+1} = f(x_k, w_k)
    z_k = g(x_k, v_k),                                          (1.1)

where x_k denotes the state sequence, z_k denotes the observations, and w_k and v_k are the process and observation noise respectively. The widespread use of state space system modeling is apparent in the radar tracking community, where motion models such as the constant velocity model, the constant turn model, and various other maneuvering models have been developed [50].
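To make the generic state space model (1.1) concrete, the following minimal sketch simulates a linear constant-velocity instance with Gaussian noise. The model, parameter values, and function names here are illustrative only, not taken from the thesis.

```python
import numpy as np

def simulate_state_space(n_steps, dt=1.0, q=0.1, r=0.5, seed=0):
    """Simulate x_{k+1} = f(x_k, w_k), z_k = g(x_k, v_k) for a linear
    constant-velocity model with state x = [position, velocity]."""
    rng = np.random.default_rng(seed)
    F = np.array([[1.0, dt], [0.0, 1.0]])   # linear transition f
    H = np.array([[1.0, 0.0]])              # observation map g (position only)
    x = np.zeros(2)
    xs, zs = [], []
    for _ in range(n_steps):
        x = F @ x + rng.normal(0.0, q, size=2)   # process noise w_k
        z = H @ x + rng.normal(0.0, r, size=1)   # observation noise v_k
        xs.append(x.copy())
        zs.append(z.copy())
    return np.array(xs), np.array(zs)

xs, zs = simulate_state_space(50)
print(xs.shape, zs.shape)  # (50, 2) (50, 1)
```

A Bayesian tracker such as the Kalman filter would then recover P(x_k | z_{1:k}) from the observation sequence `zs`.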
In a nutshell, the aim of tracking is to model events of interest with stochastic processes, and to derive algorithms that establish tracks (sequences of estimated state values) by recursively computing P(x_k | z_{1:k}) [3], where z_{1:k} = (z_1, z_2, ..., z_k). In meta level tracking, the observer is interested not only in state estimation, but also in the inference of more abstract descriptors summarizing the symbolic interpretation of the states. In video surveillance, for example, the observer is interested not only in the whereabouts of people, but also in their behaviour, such as running and walking [51], or their interactions, such as following and approaching [61]. More examples can be seen in speech recognition, where the observer is interested not only in recognizing the spoken words of a speaker, but also in the speaker's intention, which may be ordering airline tickets [40] or communicating operational status as in air traffic control [58]. In general, a typical meta level tracking system deals with the following switched mode state space model [4]:

    x_{k+1} = f(x_k, a_k(θ), w_k)
    z_k = g(x_k, a_k(θ), v_k).                                  (1.2)

The random variable θ takes values from a finite set of "meta level descriptors", and a_{1:k} is a sequence of discrete features summarizing the state sequence at each point in time. Examples of each of the variables will be given later. The aim of meta level tracking is to recursively compute

    θ̂ = arg max_θ P(θ | z_{1:k})
      = arg max_θ [ Σ_{a_{1:k}} P(z_{1:k} | a_{1:k}(θ)) P(a_{1:k}(θ) | θ) P(θ) ]
                  / [ Σ_θ Σ_{a_{1:k}} P(z_{1:k} | a_{1:k}(θ)) P(a_{1:k}(θ) | θ) P(θ) ],   (1.3)

where Σ_{a_{1:k}} sums over all feature sequences (a_1, a_2, ..., a_k), and Σ_θ sums over the set of possible meta descriptors. Based on this factorization, the specification of meta level knowledge involves three components:

• P(θ) is the prior probability of the meta level descriptor, and its computation is based mainly on the knowledge of the observer and the availability of data.
• P(a_{1:k}(θ) | θ) is the likelihood of the feature sequence being generated given a meta descriptor. In a conventional jump Markov process, each θ characterizes a Markov chain, and the sequence a_{1:k} is its realization [65].

• P(z_{1:k} | a_{1:k}(θ)) is the likelihood of the observations given the feature sequence. It is often a function of a sensor model and its corresponding noise distribution.

Using terminology from formal language theory, we will demonstrate that conventional techniques model a_{1:k} as a regular language (the equivalent counterpart of a Markovian state sequence), and the objective of the thesis is to generalize the formulation to include context free languages. The formal language hierarchy is shown in Fig. 1.8, where it is clear that regular languages form the smallest subset of all possible sequential data that can be generated from a finite set of symbols. The background on formal languages is given in Sec. 1.3, and it will be shown that SCFG is more general than regular grammar and can capture long range dependencies and recursively embedded structures that cannot be efficiently modeled by Markov models.

Toy Language Processing Example Motivating Meta Level Tracking

[Figure: a language processing pipeline in which feature detectors map an audio signal to a vocabulary of primitive patterns, a classifier and parser apply grammar-based meta labels, and hypothesis testing produces a parse tree for the sentence "green ideas sleep furiously".]

Meta level tracking is illustrated in a language processing setting, as it is one of the first applications utilizing context free languages [19; 20]. The objective is not to explain language processing itself, but rather to demonstrate meta level tracking by example. The figure above shows the overall process: a stream of audio signal is shown at the bottom left, and the aim is to recursively interpret the signal in terms of English grammar.
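For the conventional jump Markov case described in the bullets above, the posterior (1.3) can be computed exactly by a forward recursion over each candidate θ. A minimal sketch follows; all matrices and descriptor names are invented for illustration.

```python
import numpy as np

def meta_posterior(z, models, prior, B):
    """P(theta | z_{1:k}) where each meta descriptor theta defines a Markov
    chain over features a_k (transition matrix A, initial distribution pi),
    and B[a, z] = P(z | a) is a shared observation likelihood."""
    evidence = []
    for A, pi in models:
        # Forward recursion: alpha_k(a) = P(z_{1:k}, a_k = a | theta)
        alpha = pi * B[:, z[0]]
        for obs in z[1:]:
            alpha = (alpha @ A) * B[:, obs]
        evidence.append(alpha.sum())          # P(z_{1:k} | theta)
    post = np.array(evidence) * prior          # Bayes rule, unnormalized
    return post / post.sum()

# Two hypothetical meta descriptors over two features {0, 1}
A0 = np.array([[0.9, 0.1], [0.1, 0.9]])   # "persistent" behaviour
A1 = np.array([[0.1, 0.9], [0.9, 0.1]])   # "alternating" behaviour
pi = np.array([0.5, 0.5])
B = np.array([[0.8, 0.2], [0.2, 0.8]])    # noisy observation of the feature
z = [0, 0, 0, 1, 1, 1]                    # a persistent-looking sequence
post = meta_posterior(z, [(A0, pi), (A1, pi)], np.array([0.5, 0.5]), B)
print(post.argmax())  # → 0 (the persistent descriptor wins)
```

The thesis generalizes exactly this computation: P(a_{1:k}(θ) | θ) is defined by an SCFG per descriptor rather than a Markov chain, so the sum over feature sequences is carried out by a stochastic parser instead of the forward recursion.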
Conventionally, because the state space required to process the signal directly is too large, local summarization of the signal is performed at the lower level to map the signal to English words (analogous to the features a_{1:k} described previously), and the words are processed with techniques such as the hidden Markov model (HMM) [57]. The HMM and its variants, such as coupled Markov chains and variable length HMMs, have been applied extensively to tracking higher level processes in, for example, language processing, motion analysis, and situation awareness. However, considering the modeling of sequential data in general, the HMM approach has two main drawbacks:

Inefficient Model Representation: In order to classify a process into categories, a model has to be trained for each category, for example one HMM for each particular word sequence. In general, however, the number of categories for sequential data can be so large that it makes the application of state space representation to meta level tracking intractable.

Lack of Representative Training Data: A limitation of the data driven model can be seen from the sentence "green ideas sleep furiously" shown in the figure. Even though the sentence is grammatically valid, no English speaking individual would utter it, so no sample data of this sort will ever exist.

In order to deal with these two problems, this thesis focuses on a knowledge based approach. For example, if domain knowledge of the mechanism that generates the sentence is known, which in this case is the English grammar, correct modeling and classification of the sentence is still possible even though no sample data exists. Meta level tracking that allows domain experts to efficiently codify their knowledge is developed in this thesis, and it can track and interpret novel processes with little or no training data.
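The self-embedding structure that separates SCFGs from Markov models can be shown with a toy grammar. The sketch below samples from S → a S b (0.7) | a b (0.3), whose language is a^n b^n: a matched long-range dependency that no finite-state Markov chain can represent exactly. The rules and probabilities are invented for illustration.

```python
import random

# Toy SCFG with a self-embedding rule: S -> a S b (p=0.7) | a b (p=0.3).
RULES = {"S": [(0.7, ["a", "S", "b"]), (0.3, ["a", "b"])]}

def generate(symbol="S", rng=None):
    """Sample a terminal string from the SCFG by top-down derivation."""
    if rng is None:
        rng = random.Random(1)   # fixed seed for reproducibility
    if symbol not in RULES:      # terminal symbol: emit as-is
        return [symbol]
    probs = [p for p, _ in RULES[symbol]]
    bodies = [body for _, body in RULES[symbol]]
    body = rng.choices(bodies, weights=probs, k=1)[0]
    out = []
    for sym in body:
        out.extend(generate(sym, rng))
    return out

s = generate()
# By construction every derivation is a^n b^n: n a's then n matched b's.
print("".join(s))
```

Each outer `a` is matched to its `b` across an arbitrarily long gap, the same kind of constraint the thesis exploits when encoding, e.g., symmetric geometric patterns as production rules.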
[Figure 1.1 diagrams: left, a Markov chain over the states a1 to a4; right, a tree of nonterminals B branching down to the same sequence a1 to a4.]

Figure 1.1: The diagram illustrates the differences between the Markov and the SCFG dependency structure.

In this thesis, mathematical algorithms that recognize meta level descriptions of the underlying process from track estimates are developed. The model sequence a1:k, in addition to Markov dependency, can model long range and branching tree dependency. The dependencies are illustrated pictorially in Fig. 1.1. The graphical model on the left shows a process with a sequential chain structure signifying the local Markov dependency between the states. The graphical model on the right illustrates an arbitrary tree structure with more complicated long range and branching dependency. As described above, the chain like structure is insufficient for modeling many real world applications. The objective of the thesis is to formulate a meta level tracking framework, introduce and analyze the use of stochastic context free grammar (SCFG) as the knowledge representation model, and discuss the properties, applications, and algorithms involved. The thesis is presented in the manuscript format where each chapter is a stand-alone manuscript paper, i.e., each chapter has its own introduction, literature review, and research results. The breakdown of the current chapter is as follows: Sec. 1.1 summarizes the research contributions, and briefly describes the application domains presented in the thesis. Sec. 1.2 provides a survey of the state of the art tracking algorithms in both the numeric and symbolic signal processing communities. The background material on stochastic grammar and the resulting meta level tracking system framework are described in Sec. 1.3, and the formulation of meta level tracking with SCFG is defined in Sec. 1.4.

1.1 Summary of Contributions

A conventional tracker provides real time state estimates based on sequential observation data.
However, while the state estimates provide a true picture of the process under observation, they do not give any interpretation. This research in meta level tracking considers a two layer approach where the interpretation capability is implemented based on the output from conventional Bayesian trackers. The methodology is to codify a priori knowledge with production rules in SCFG and analyze the filtered observables within the context of the syntactic model. The syntactic patterns observed in the track estimates contain valuable information for interpreting complicated sequential data. The algorithms are developed for two application domains: electronic support measure and ground surveillance with GMTI (ground moving target indicator) radar. It has been shown in the literature that SCFG is a more powerful representation language [30; 57], and it is capable of representing systems that cannot be fully characterized by Markov models. In what follows, Sec. 1.1.1 discusses the syntactic model characterizing the functionalities of a multifunction radar for electronic support measure. A multifunction radar is an agile radar that not only makes system characterization with Markov models difficult, it also demands specification of constraints that cannot be efficiently modeled by Markov models. This part of the research is published in two journal papers, which are included in Chapter 2 and Chapter 3. Sec. 1.1.2 describes the development of a syntactic parser that estimates the geometric patterns of ground moving vehicles' trajectories using GMTI radar. In ground surveillance, the extraction of spatial geometric patterns is often performed by human operators as an indicator for intent inference. However, as the amount of GMTI data can easily overload the operator, an automatic syntactic tracking algorithm is needed. This part of the research has resulted in one published conference paper, and a journal paper manuscript that is included in Chapter 4.
In both application domains, the meta tracking algorithms developed are based on real data from real operational systems, and the data is provided to us through our collaboration with DRDC (Defence Research & Development Canada).

1.1.1 Syntactic Modeling of Multifunction Radars for Electronic Support Measure

Electronic support measure, a division of electronic warfare, involves intercepting and interpreting radiated electromagnetic energy to locate and identify radar sources, and to evaluate their potential threats. The particular type of threat considered is the multifunction radar (MFR). MFRs are sophisticated sensors with complex dynamical modes that are widely used in surveillance and tracking. They are capable of switching between multiple operation modes electronically, and adapt their radar waveforms according to the target state and the environment [8; 28; 76; 78]. The interpretation of a MFR based on the model (1.1) can be seen as follows:

• θ denotes the operation modes of a radar system,
• ak represents radar commands such as search or track maintenance,
• xk is a radar word consisting of a sequence of radar pulses of, for example, a certain pulse repetition frequency, and
• zk is a sequence of observed radar pulses.

The problem is to intercept radar pulses zk, and estimate the operation mode θ of the radar system. However, because advanced radar systems such as MFRs are typically controlled by expert systems with, for example, IF-THEN rules or fuzzy rules, the mechanism with which the radar words a1:k are generated is a series of rule applications [7; 69; 12; 53]. The Markov assumption is not sufficient in this setting as the dependency between observables has a complicated tree structure. In this context, a new solution to the interpretation of radar signals is critical to aircraft survivability and the successful completion of missions.
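The θ → a1:k layering above can be illustrated with a toy jump Markov sketch. All mode and command names and every probability below are illustrative placeholders, not the sanitized Mercury description: each candidate mode θ defines a Markov chain over radar commands, and an intercepted command sequence is classified by maximum likelihood over the candidate modes.

```python
import math
import random

# Toy two-level radar model: a fixed operation mode theta selects a
# Markov chain over radar commands (all values are placeholders).
MODES = {
    "search_policy": {"SEARCH": {"SEARCH": 0.8, "TRACK": 0.2},
                      "TRACK":  {"SEARCH": 0.6, "TRACK": 0.4}},
    "engage_policy": {"SEARCH": {"SEARCH": 0.3, "TRACK": 0.7},
                      "TRACK":  {"SEARCH": 0.1, "TRACK": 0.9}},
}

def simulate(theta, k, rng):
    """Sample a command sequence a_1:k under operation mode theta."""
    trans, cmd, a = MODES[theta], "SEARCH", []
    for _ in range(k):
        nxt, w = zip(*trans[cmd].items())
        cmd = rng.choices(nxt, weights=w)[0]
        a.append(cmd)
    return a

def loglik(theta, a):
    """log P(a_1:k | theta): the jump Markov likelihood of the commands."""
    trans, ll, cmd = MODES[theta], 0.0, "SEARCH"
    for nxt in a:
        ll += math.log(trans[cmd][nxt])
        cmd = nxt
    return ll

a = simulate("engage_policy", 50, random.Random(1))
# Mode estimation: pick the theta that maximizes P(a_1:k | theta).
best = max(MODES, key=lambda th: loglik(th, a))
```

The thesis replaces the Markov chain inside each mode with a Markov modulated SCFG precisely because rule-generated command sequences carry tree-structured dependencies this chain cannot express; the maximum likelihood selection over θ, however, has the same shape.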
The electronic support algorithms developed in this thesis consider the self protection of a target from radar threats, and their two major functionalities are: 1) interpretation of the intercepted radar pulses to determine the radar's operation modes such as "search" and "track maintenance", and 2) evaluation of the threat and selection of the optimal evasive strategy. The research is based on a real anti-aircraft defense radar called Mercury, whose sanitized description is provided by DRDC and included in Appendix A. Fig. 1.2 illustrates the detailed model components developed to address the complicated behaviour of a MFR; such a detailed model is necessary so that the first functionality, the interpretation of the intercepted radar pulses, can be implemented. The research results are published in [76] and the paper is included in Chapter 2. The three main characteristics of MFRs that make their signal interpretation challenging are i) MFRs' behaviour is mission dependent, that is, different radar tasks are selected in similar tactical environments given different policies of operation, ii) MFRs' control mechanism is hierarchical, where their top level commands are implemented by atomic radar-specific commands, and iii) MFRs are event driven, and difference and differential equations are often not adequate to describe them. Our approach to overcoming these challenges is to employ knowledge-based statistical signal processing with syntactic domain knowledge representation. In particular, we model MFRs as systems that "speak" a language that is characterized by a novel Markov modulated SCFG. We derive two statistical estimation approaches for MFR signal processing - a maximum likelihood sequence estimator to estimate the radar's policies of operation, and a maximum likelihood parameter estimator to infer the radar parameter values. Two layers of signal processing are introduced in this paper.
The first layer is concerned with the estimation of the MFR's policies of operation, and it involves signal processing in the CFG domain. The second layer is concerned with the identification of the tasks in which the radar is engaged, and it involves signal processing in the finite-state domain. Both of these signal processing techniques are important elements of a bigger radar signal processing problem that is often encountered in electronic warfare applications - the problem of estimating the threat that a radar poses to each individual target at any point in time. Fig. 1.3 demonstrates the second functionality that performs the optimal selection of evasive strategy for electronic support measure based on the developed MFR syntactic model. The research results are published in [78] and the paper is included in Chapter 3. In this work, the SCFG formulation of the MFR is based on the work presented in Chapter 2, and extensions are made to enable feedback control. The novel contributions presented in this work are as follows: 1) Parameterization of the transition probability distribution of the Markov modulated SCFG with the Logit distribution. The parameterization relates the evasive maneuvering strategies available to the aircraft to the switching probability of the MFR's operation modes. Such a formulation allows feedback control of the evasive strategy based on the estimated MFR operation mode; 2) Detailed modeling of the link between the MFR operation modes and their corresponding radar commands. In Chapter 2, the operation modes are abstractly modeled as states in a Markov

[Figure 1.2 block diagram: a Random Environment feeds Situation Assessment, which drives the System Manager; tasks flow through the Phrase Scheduler and the Planning Queue to the Command Queue and the Radar Controller.]

Figure 1.2: Multifunction radar is a complicated system that can switch operation modes electronically. The selection of the operation modes and the mapping of modes to radar commands are implemented with a complicated set of rules.
The figure shows the abstract model components that capture the rule generation mechanism. The situation assessment provides the evaluation of the tactical environment to the radar manager. The radar manager, based on the evaluation, selects a radar task on which the command scheduler/radar controller will operate. The command scheduler plans and preempts the tasks in the planning queue depending on the radar load, and moves the tasks fixed for execution to the command queue. The radar controller maps the tasks in the command queue to appropriate radar commands, which are retrieved by the radar for final execution.

chain. In Chapter 3, explicit formulation of the operation mode is provided in SCFG format; 3) Stochastic approximation algorithms are presented to select the optimal evasive strategy. Because of the complexity of the signal representation and the recursive nature of the sequential data, direct optimization of the evasive strategy is not practical.

1.1.2 Syntactic Tracking and Ground Surveillance with GMTI Radar

The aim of ground surveillance is to develop a ground situation picture with many targets over a large region [10]. The current state of the art sensor technology for ground surveillance is airborne ground moving target indicator (GMTI) radar with space time adaptive processing [45]. The details of the GMTI STAP technology are summarized in Appendix C. Motivated by GMTI's capability to track ground moving targets over a large region, the development of ground surveillance involves maintaining tracks in a two dimensional plane and deriving spatial interpretations from the tracks. The geometric interpretation of the tracks is believed to convey useful information regarding the targets' intents and capabilities.
For example, we are interested in whether a target is circling a restricted area (perimeter surveillance), or alternatively whether a vessel is loitering near the coast (a possible smuggling attempt).

[Figure 1.3 block diagram: a Control Strategy drives Target Maneuvers in a Stochastic Environment observed by the MFR, with Threat Evaluation closing the loop.]

Figure 1.3: Selection of evasive maneuvering strategy based on estimation of the MFR's operation mode. The MFR's mode switching probability is modulated by the target's maneuvering strategy. A stochastic approximation algorithm is applied to select the optimal strategy.

Moreover, in a military setting, a forward quarter intercept or a pincer is characterized by two cooperating vehicles maneuvering in arcs [7]. The dynamic state space of the ground moving targets based on the model (1.1) can be interpreted as follows:

• θ denotes a set of geometric patterns that describes the ground vehicles' trajectories,
• ak represents motion tracklets such as the ground vehicles' acceleration direction,
• xk is the kinematic state of the vehicle such as position and velocity, and
• zk is a GMTI observation such as range, angle and Doppler.

The mode sequence a1:k summarizes a sequence of maneuvers or modes that causes the target to move in a two dimensional spatial trajectory. Conventional tracking of maneuvering targets assumes that ak follows a finite state Markov chain, and aims to compute the posterior distribution P(xk, ak|z1:k) so as to compute conditional estimates of xk and ak. This is typically done by the Interacting Multiple Model (IMM), particle filters [4; 66], or VS-IMM [74; 44] (VS-IMM is a more sophisticated IMM tracker where road structure is exploited to enhance tracking quality). In this thesis, in addition to tracking the kinematic states of the ground moving targets, we are primarily interested in determining specific patterns in target trajectories from conventional track estimates, which can be used to infer the possible intent of the target and to assist situation assessment [7]. Fig.
1.4 illustrates the syntactic tracking developed in this thesis. The research results are published in [79], and the journal version is completed and included in Chapter 4. Our primary focus is on syntactic tracking for ground surveillance with GMTI radar, which has many civil and military applications. If domain knowledge exists to relate the patterns to certain application specific intents, it has the potential to be used as an intent inference tool. Because of the enormous amount of data that can be generated from GMTI trackers, there is strong motivation to develop automated algorithms that yield a high level interpretation of the tracks. The key modeling contribution of this work is the construction of a syntactic model with SCFG for the mode sequence a1:k. The main goal is to devise SCFG models and associated polynomial time Bayesian syntactic parsing algorithms to estimate θ given estimates from the conventional target tracker. The main results of the work are summarized as follows: 1) Combined Tracking and Trajectory Inference: A framework for estimating the target trajectory type from track estimates is developed by combining a tracklet estimator (conventional tracker) at a fast time scale with a syntactic pattern estimator operating at a slower time scale. This results in a novel GMTI based meta level tracking framework. 2) SCFG Modulated State Space Model: A SCFG modulated state space model that permits modeling of complex spatial trajectories is developed by deriving probabilistic production rules that characterize the target motion patterns. In addition, a detailed structural analysis of the SCFG model is also presented. More specifically, based on formal language techniques such as the pumping lemma, we show that a specific syntactic pattern such as an arc generates a context free language, and it cannot be modeled efficiently by Markov models. 3) Bayesian Syntactic Tracking: A Bayesian syntactic tracking algorithm is developed.
The interpretation of the syntactic patterns is represented by parse trees built on top of the target trajectories, which are tracked at the detection level by Bayesian filters such as the particle filter and the IMM/extended Kalman filter, and at the tracklet level by a generalized Earley Stolcke Bayesian parser [44; 54]. The Earley Stolcke algorithm is a generalization of the Forward-Backward algorithm for HMM, and it allows real time forward parsing. The complexity of the algorithm is O(l^3), where l is the length of the input string.

1.1.3 Scope of the Research

This section describes what is not within the scope of the thesis. The analysis and the interpretation of the signal are based on algorithms formulated with SCFG, and it is not straightforward to extend them to more general grammars such as context sensitive or unrestricted grammars. In terms of knowledge representation, the algorithms derived are applicable mainly to interpreting syntactic patterns in one dimensional sequential data. Even though more complicated grammars such as graph grammars and tree grammars may be applied to recognize patterns in two dimensional or even three dimensional data, they are not considered in this thesis; one dimensional sequential data is the most common data representation for tracking applications. In addition to the scope imposed by the grammar, model assumptions are also made in both of the projects to limit the scope, and they are described next. In the electronic support measure problem, the aim is to interpret the operation mode of the MFR based on the intercepted radar pulses, and the interpretation is performed under the assumption that the radar is not aware that it is being tracked.
If the radar is aware that it is being tracked, it may adapt its responses according to the actions of the aircraft, and that turns

[Figure 1.4 block diagram: a GMTI STAP Processor observing an Area of Interest feeds a Tracklet Estimator and a Syntactic Pattern Estimator, which draws on a Geometric Pattern Knowledge-base and informs a Radar Resource Allocator.]

Figure 1.4: The meta level tracking system is mounted on an aircraft equipped with GMTI radar. The aircraft can circle around an area, and maintain a consistent scan over a fixed region. The processing stages are as follows: GMTI detections provide target measurements such as range, angle and Doppler. The measurements are processed by the Tracklet Estimator to generate a sequence of acceleration vectors, which are further processed by the Syntactic Estimator to answer questions such as whether the target is circling or moving in an arc. The trajectory of a target circling around the building is shown with a yellow dashed line. The geometric knowledge is stored as a SCFG model. The derived knowledge about the target trajectory may be applied to adjust radar parameters such as the signal to noise ratio or the look direction.

the problem into a game theoretic problem. As a result, in the study of the maneuvering mode selection, the problem concerns how the aircraft can adaptively tune its behaviour in response to the MFR operation mode, but the MFR does not adaptively choose its operation mode given the aircraft's maneuvering mode; the MFR chooses its operation mode based solely on the kinematic states of the target. In the ground surveillance problem, we performed preliminary work on interpreting the geometric patterns of ground moving vehicles. The work may be considered a first step towards situation awareness because the algorithms developed are capable of annotating track estimates automatically given the supplied domain specific knowledge. It is not sufficient to be considered situation awareness yet, as situation awareness establishes a much broader and more comprehensive view.
In addition, in terms of the practicality of the system, the work is a proof of concept for the extraction of geometric patterns from trajectories, and only basic tracking capability is implemented even though a great deal of research exists in multiple target tracking. Only limited work such as nearest neighbor data association is implemented, and no effort is spent on optimizing the GMTI radar resources. Yet, the research demonstrated that syntactic tracking has the potential to establish a higher level interpretation of the tracks, and more extensive work on integrating multiple target tracking with the presented framework could enhance the quality of the interpretation.

[Figure 1.5 block diagram: Sensors feed four stages - Object refinement, Situation refinement, Threat assessment, and Process refinement.]

Figure 1.5: Joint Directors of Laboratories data fusion model proposed by the United States Department of Defense. The work presented in this thesis deals mainly with algorithms belonging to the Object refinement and the Situation refinement blocks. The algorithms relevant to Object refinement are mainly numeric based, and the algorithms relevant to Situation refinement are mainly symbolic based.

1.2 Survey of Tracking Algorithms and Methodologies

In order to put meta level tracking in context, its tracking process is interpreted as a high level data fusion process. Fig. 1.5 illustrates a generic data fusion model called the Joint Directors of Laboratories model that is proposed by the United States Department of Defense. The model decomposes the fusion process into four parts: Object refinement, Situation refinement, Threat assessment, and Process refinement. Object refinement is the stage where information is collected about each entity, i.e., processing techniques such as data alignment, data association, and state estimation belong to this stage.
Situation refinement is the stage where the attributes of the entities, and the spatial and temporal relationships among entities or within an entity's trajectory, are studied. The threat assessment stage, given the analyzed situation from the situation refinement, estimates and predicts the intents and the capabilities of the observed entities. The process refinement, lastly, fine tunes the fusion process adaptively as measurements are collected and analyzed. The description here is very concise, and more information can be found in [77]. The processing stages, proceeding from left to right, concern data types that change from numeric to symbolic; data association and tracking concern numeric data, while situation refinement concerns entity attributes that are mostly symbolic and have no natural measure of distance. A meta level tracking system that deals with the model (1.3) may be regarded as a system framework that combines object refinement and situation refinement, where contextual analysis of sensor measurements is done with an aim to construct plausible explanations in the presence of uncertainties [63]. Current studies in object refinement and situation refinement belong to two different camps. The object refinement camp focuses on algorithms that recursively update state estimates based on numeric data, and the situation refinement camp, on the other hand, aims to reason with symbolic data with techniques such as logic and Bayesian networks and to build the corresponding knowledge repository. In this section, a survey is provided that summarizes the techniques that are relevant in each of the camps. (This literature review is about the meta level methodology, and each chapter contains its own detailed literature review in its more specific context.) Sec. 1.2.1 summarizes the numeric tracking algorithms that are relevant to object refinement, and Sec.
1.2.2 summarizes the symbolic tracking algorithms that are relevant to situation refinement. It will be shown that numeric tracking algorithms are effective in dealing with sequential data, but have very limited meta level reasoning capability. Symbolic tracking, on the other hand, is strong in reasoning with complicated logic, but is not scalable to deal with sequential data.

1.2.1 Numeric Tracking Algorithms

State of the art tracking algorithms are mainly formulated in the state space model, whose knowledge representation is mainly in the form of differential or difference equations. The dominating technique in tracking has been Bayesian filtering [6], for example the Kalman filter and the hidden Markov model, which provides a mechanism to recursively update the posterior probability distribution of the state given all the available observations. The Bayesian tracking algorithms are surveyed in Sec. 1.2.1.1. Extending the capabilities of the Bayesian tracker, many techniques have been developed to explore the use of meta information. It has been shown that the incorporation of meta information can greatly enhance the accuracy of the tracker, and a survey of tracking algorithms that exploit meta information is provided in Sec. 1.2.1.2.

1.2.1.1 Bayesian Trackers

The basic operation of a tracking system consists of modules such as detection, data association, track maintenance, and gate computation, and their operations are summarized here: Sensor measurements are processed to form detection hits, whose kinematics and attributes, such as length and shape, are estimated; data association assigns the detection hits to tracks in a database, updates the tracks if assignments are found, deletes the tracks if their quality degrades below a certain threshold, and initiates new tracks if no existing track can be assigned.
For each track, the most likely position of its next measurement is predicted, and a gate measured in terms of variance is computed to facilitate the data association of the next sensor measurement. For single target tracking, the Bayesian filter has been the de facto algorithm for recursive tracking of the target's kinematic state; the dynamic state estimation is formulated as a recursive estimation of the posterior probability density function of the state, based on all available information. Let k denote discrete time; the recursive estimation consists of two steps: prediction and updating. The prediction step is

P(x_{k+1}|Z_{1:k}) = \int P(x_{k+1}, x_k|Z_{1:k}) \, dx_k = \int P(x_{k+1}|x_k) P(x_k|Z_{1:k}) \, dx_k,

and the updating step is

P(x_k|Z_{1:k}) = P(x_k|Z_{1:k-1}, z_k) = \frac{P(z_k|x_k) P(x_k|Z_{1:k-1})}{P(z_k|Z_{1:k-1})}.

The expressions of the two steps for the three most common frameworks, i.e., linear Gaussian (LG), particle filtering, and the interacting multiple model (IMM), are shown in Table 1.1.

Table 1.1: Listing of specifications of the major state space models used in numeric processing.

LG:
P(x_k|Z_{1:k-1}) = N(x_k; \hat{x}_{k|k-1}, P_{k|k-1})
P(x_k|Z_{1:k}) = N(x_k; \hat{x}_{k|k}, P_{k|k})

Particle:
P(x_k|Z_{1:k}) \approx \sum_{i=1}^{N} w_k^i \, \delta(x_k - x_k^i)
w_k^i \propto w_{k-1}^i \, \frac{p(z_k|x_k^i) p(x_k^i|x_{k-1}^i)}{q(x_k^i|x_{k-1}^i, z_k)}

IMM:
P(x_{k+1}, r_{k+1}=j|Z_{1:k}) = \sum_i \pi_{ij} \int p(x_{k+1}|x_k, r_{k+1}=j) p(x_k, r_k=i|Z_{1:k}) \, dx_k
P(x_k, r_k=j|Z_{1:k}) = \frac{p(z_k|x_k, r_k=j) p(x_k, r_k=j|Z_{1:k-1})}{P(z_k|Z_{1:k-1})}

LG is the most commonly used dynamic model with the Kalman filter, and various extensions exist; for example, the extended Kalman filter approximates the nonlinearity in the system model. It should be noted that the particle filter is suitable for problems where the dynamics are nonlinear or the noise that distorts the signal is non-Gaussian. In such cases, Kalman filter solutions may be far from optimal.
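The LG row of Table 1.1 can be made concrete with a scalar Kalman filter implementing the prediction and updating steps above. The model constants and measurement values are illustrative.

```python
# Minimal 1-D linear-Gaussian (Kalman) recursion for the LG row of
# Table 1.1: x_{k+1} = F x_k + w_k,  z_k = H x_k + v_k.
# Model constants are illustrative, not from the thesis.
F, H = 1.0, 1.0      # state transition and observation gains
Q, R = 0.01, 0.25    # process and measurement noise variances

def kalman_step(x, P, z):
    """One predict/update cycle returning the posterior mean and variance."""
    # prediction: P(x_k | Z_{1:k-1}) = N(x; x_pred, P_pred)
    x_pred = F * x
    P_pred = F * P * F + Q
    # update:     P(x_k | Z_{1:k})   = N(x; x_new, P_new)
    S = H * P_pred * H + R           # innovation variance
    K = P_pred * H / S               # Kalman gain
    x_new = x_pred + K * (z - H * x_pred)
    P_new = (1 - K * H) * P_pred
    return x_new, P_new

x, P = 0.0, 1.0                      # diffuse initial estimate
for z in [0.9, 1.1, 1.0, 0.95, 1.05]:   # noisy measurements of a constant
    x, P = kalman_step(x, P, z)
```

After a few measurements the posterior mean settles near the true value and the variance P shrinks, which is exactly the gate-size quantity the data association step relies on.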
Instead of approximating the functional form of the state transition or the observation model, the particle filter approximates the probability distribution function itself using a set of simulated samples and weights. The IMM, on the other hand, assumes a set of possible dynamic models M = {M_j}, j = 1, ..., r, to handle maneuvering targets where a single dynamic model is not sufficient, and the prior probability is P(M_j|Z_0) = u_j(0), where Z_0 is the prior information and the u_j(0) sum to 1. Each possible model is assumed to be linear-Gaussian, and the approach is applicable to nonlinear systems as well since they can be linearized. When tracking multiple targets, on the other hand, the tracking problem is often modeled as a two stage problem consisting of a data association problem and multiple single target state estimation problems. The data association methods can be categorized into three types: heuristic, maximum a posteriori (MAP), and Bayesian approaches. A heuristic approach such as the nearest neighbor (NN) filter processes the new measurements in a certain order, and associates each with the track whose predicted position is closest to it. NN basically assumes a Gaussian distribution, and hence the assignment with the maximum likelihood is the one giving the shortest Euclidean distance. The MAP approach, on the other hand, associates new measurements with the most probable track. Exact Bayesian data association is not tractable, so a pseudo-Bayesian method called the joint probabilistic data association (JPDA) filter is often used. JPDA uses Bayesian inference to perform data association by assuming a priori distributions for the background clutter processes.
More specifically, let x be the state of the process and y the measurement; it is assumed that there is an association matrix Φ, whose elements are defined as

Φ(t, i) = 1 if measurement y_i originated from target t, and 0 otherwise.

By maximizing the conditional probability P(Y, Φ|X) = P(Y|X, Φ) P(Φ|X), the association matrix can be obtained. That is,

\hat{Φ} = arg max_Φ { log p(Y|X, Φ) + log P(Φ|X) }.

The JPDA filter enumerates all measurement-target associations at each scan and then combines them into a single component. Multiple hypothesis tracking also belongs to the MAP approach, and it allows sophisticated management of all possible association hypotheses for data association. Other sequential Monte Carlo methods for multiple target tracking and data fusion can be found in [34].

1.2.1.2 Numeric Tracking with Meta Information

The previous section discussed mainly algorithms in conventional tracking where a single model is applied to perform kinematic tracking. However, because of the complexity of the process under observation, the single model approach may be insufficient. Researchers have been developing the idea of using meta information to enhance the accuracy of the tracker by allowing multiple models for different aspects of the target, and two main directions can be identified: 1) incorporation of meta information about the target, and 2) incorporation of meta information about the state sequence. In the first case, information about the target, either static information such as target class and target platform attributes or dynamic information such as target maneuvering modes and terrain information, can be utilized to enhance the kinematic model for tracking. In the second case, information about the target's state sequence may include specific areas of interest that the target visited or specific maneuvering moves, such as repeated accelerations and decelerations, that may indicate move patterns intended to avoid being tracked by GMTI.
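Under the Gaussian assumption described in Sec. 1.2.1.1, maximizing the association likelihood reduces to minimizing Euclidean distance; the nearest neighbor filter can be sketched as a greedy assignment (the gating threshold and positions are illustrative):

```python
import math

def nearest_neighbor_associate(predictions, measurements, gate=3.0):
    """Greedy NN data association: each measurement is assigned to the
    closest unassigned track prediction if it falls inside the gate;
    otherwise it is left unassigned (a candidate new track). 2-D points.
    """
    assignments = {}              # measurement index -> track index or None
    used = set()
    for i, z in enumerate(measurements):
        best_t, best_d = None, float("inf")
        for t, x in enumerate(predictions):
            if t in used:
                continue
            d = math.dist(z, x)   # Euclidean distance to the prediction
            if d < best_d:
                best_t, best_d = t, d
        if best_t is not None and best_d <= gate:
            assignments[i] = best_t
            used.add(best_t)
        else:
            assignments[i] = None
    return assignments

tracks = [(0.0, 0.0), (10.0, 10.0)]
meas   = [(0.4, -0.2), (9.5, 10.3), (50.0, 50.0)]
print(nearest_neighbor_associate(tracks, meas))
# → {0: 0, 1: 1, 2: None}: the far measurement starts a new track
```

JPDA and multiple hypothesis tracking replace this greedy hard assignment with a probability-weighted or hypothesis-tree treatment of the same Φ matrix.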
This section provides a literature survey of the tracking techniques that exploit the two categories of meta information. The incorporation of the target's meta information to enhance tracking is found mainly in the radar tracking community. In [7], attribute tracking is discussed, where target class information such as wing span and jet engine modulation is utilized to facilitate data association. The meta information reduces the variance in the estimation of the target's capability and thus is of great value in establishing tracks and inferring intent. In addition, a different type of meta information concerns the knowledge of the target's maneuverability. Multiple model estimation [4] has been applied extensively to tracking maneuvering targets, where a fixed set of kinematic models is assumed to be able to fully characterize the target's maneuverability. However, initial research in applying a fixed set of kinematic models has reached its limitation, and work has been done to extend the idea to include a set of adaptively chosen kinematic models to track maneuvering targets [49]. The meta information concerns which model set is responsible for the target's kinematics, and the model is called VS-IMM (variable structure interacting multiple model). In [44], VS-IMM is applied to airborne tracking where the variable structure is modulated by the road structure at the target's position. The incorporation of meta information about the target's state sequence, on the other hand, is found mainly in the artificial intelligence community. Initially, the hidden Markov model was the main computational tool for analyzing the target's state sequence [9]. However, it was found that in order to model the complex meta information of a target's intents and actions, hidden Markov models are insufficient, as they are incapable of modeling the complex multi-scale processes that appear in many natural processes.
[29] generalizes hidden Markov models by defining an embedded transition structure in each state, thus allowing each state to generate a nested state sequence. The model is named the hierarchical hidden Markov model (HHMM), and it is illustrated in Fig. 1.6. The model classifies states into root states, internal states, production states, and end states. Each internal state is an HHMM itself, and the nesting mechanism is characterized by vertical transitions from an internal state to its child states. Production states are equivalent to the states in an HMM, and only they can generate outputs; the end state triggers the return of control back to the internal state. The importance of this model lies in its attempt to capture the hierarchical structure of the process, which is characterized by multiple length scales and recursive embedding. As shown in [29], the HHMM is a special case of SCFG. A similar idea is developed in [13], where the transitions of the hidden Markov model are re-formulated to be modulated by a higher level Markov process, and this higher level Markov process may be modulated by yet another higher level process. The developed model is named the abstract hidden Markov model (AHMM).

1.2.2 Symbolic Tracking Algorithms

High level information fusion requires human-like reasoning to integrate separate pieces of evidence to assess the likelihood of different situations using prior and emergent knowledge. One of the oldest knowledge representations is deductive logic, and it has been the mathematical framework used for inference since Aristotle. The basic format of a logic rule is A → B, and its semantics is interpreted either as 1) if A is true, then B is true, or 2) if B is false, then A is false. Inference systems have been built to perform automatic reasoning with this knowledge representation, and they are discussed in Sec. 1.2.2.1.
Figure 1.6: An example hierarchical hidden Markov model. The label R denotes the root state, I denotes the internal states, P denotes the production states, and E denotes the end states.

However, one practical issue regarding this knowledge representation is that the evidence collected may not be sufficient for the evaluation of the logic statement. The evidence corresponds to derivations from sensor measurements, and it is only sufficient to claim the truth of the statement B. However, if B is true, the most one can claim is that A is plausible [37]. Information fusion relies heavily on appropriate strategies for knowledge representation and reasoning, and must deal with uncertain and incomplete information. Two other high level information processing techniques are based on fuzzy logic and probability theory, and they are summarized in Sec. 1.2.2.2 and Sec. 1.2.2.3 respectively.

1.2.2.1 Logic Expert System

Based on the logic knowledge representation, many automatic inference systems have been developed [33]. Practical situation assessment systems can be implemented based on logical rules [32], where the rules provide the necessary inferential chaining and facilitate uncertainty management. Some examples of rule based system frameworks include the production rule based system, the frame based system, and the case based system [7; 42], and they are summarized in this section. The production system provides a framework that separates the knowledge base from the inference mechanism, and it allows implementation of parallel computation to increase the performance of the system. Fig. 1.7 illustrates the basic components making up a rule based expert system. The production system is a formalism for problem-solving knowledge representation [72, Chapter 5.3], which was used by Chomsky to specify rewrite rules in linguistics, and it consists of three main components: knowledge base, working memory, and inference engine.
A knowledge base is a database of domain declarations and production rules, which are typically expressed in IF-THEN format; a working memory is a memory store that contains data on the current state of the problem; and an inference engine is a finite state machine with a cycle consisting of three action states: match, select, and execute. The match step compares the data patterns in working memory against the conditions of the rule set, and the execute step performs the action part of a rule. If several rules are in conflict, the select step picks a rule from the conflict set for execution. The cycle repeats until a solution is found.

Figure 1.7: A rule based system.

Corkill applies a blackboard system for collaborative situation assessment benefiting from distributed information sources [23]. Another class of inference systems is template based systems. The expert knowledge is organized as schemata consisting of a set of slots, constraints on these slots, and relationships with other schemata [60]. One example of the knowledge representation used in intelligent transportation systems is automatic ticket ordering. The essential information about the arrival and departure of trains is collected in a frame data structure organized as Departure(train(360), location(abdn), time(1000)). The data structure indicates that the information "train 360 is leaving Aberdeen at 10:00 am" can be associated with a template such as [train] is leaving [station] at [time], and the slots in square brackets can be filled by looking up a table. Other similar techniques, such as case based reasoning applied in decision support systems [52], can also be found. However, such knowledge representations are more suitable for reasoning with static data. The specification of frames or cases for sequential data requires a lot of data, and the matching process can be computationally complicated.
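The match-select-execute cycle described above can be sketched as a minimal forward-chaining engine. This is our own illustration under assumed rules and facts; the rule names and fact symbols are hypothetical, not from the thesis.

```python
# Minimal forward-chaining production system: match, select, execute.
RULES = [
    # (name, condition on working memory, facts the action adds)
    ("classify", lambda wm: "fast" in wm and "low_altitude" in wm, {"threat"}),
    ("alert",    lambda wm: "threat" in wm, {"raise_alarm"}),
]

def run_production_system(working_memory):
    working_memory = set(working_memory)
    while True:
        # Match: collect all rules whose conditions hold and that add new facts.
        conflict_set = [(name, adds) for name, cond, adds in RULES
                        if cond(working_memory) and not adds <= working_memory]
        if not conflict_set:
            return working_memory  # no applicable rule: stop
        # Select: simple strategy, pick the first rule in the conflict set.
        _, adds = conflict_set[0]
        # Execute: perform the action part of the selected rule.
        working_memory |= adds

result = run_production_system({"fast", "low_altitude"})
```

Real production systems use more elaborate conflict-resolution strategies (recency, specificity), but the separation of knowledge base and inference engine is the same.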
1.2.2.2 Fuzzy Logic

Fuzzy logic can be interpreted as a logic system with its predicate assumption relaxed. It involves the extension of Boolean set theory and Boolean logic to a continuous-valued logic via the concept of membership functions. Membership functions are application specific continuous functions taking values in the interval [0, 1], and they are used to quantify imprecise concepts such as the threat of a target or the height of average males. Fuzzy logic allows imprecise specification of an attribute, whereas probability allows imprecise specification of an attribute value [71]. [7] applies fuzzy logic to situation assessment and sensor management; [2] uses fuzzy clustering for image segmentation; [11] approaches intrusion detection with fuzzy logic; and [26] developed the Fuzzy Intrusion Recognition Engine (FIRE) using fuzzy sets and fuzzy rules. In FIRE, simple data mining techniques are applied to process the network input data and generate fuzzy sets for the observed features, and application specific fuzzy rules are used to detect intrusions. The objections against fuzzy logic that prevent us from using it for meta level tracking are twofold: the generation of the fuzzy rules can be labor intensive and is not straightforward for non-experts, and the application of fuzzy logic to temporal sequential data is difficult. Two approaches can be taken for fuzzy logic to deal with temporal sequential data: one is to combine fuzzy logic with the template technique described in the previous subsection, and the other is the fuzzy state machine [7]. Both approaches have their drawbacks. The template based system requires extensive expert knowledge to build, and it is not efficient in the sense that each new instance may require a new template. The fuzzy state machine, on the other hand, suffers from the same problem as the state space model.
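As a brief illustration of the membership-function idea, a triangular membership function for an imprecise concept such as "tall" can be written as follows; the breakpoints are invented for illustration, not taken from the thesis.

```python
def triangular_membership(x, a, b, c):
    """Degree of membership in [0, 1] for a triangular fuzzy set
    rising from a to a peak at b and falling back to zero at c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy set "tall" for male height in centimetres.
tall = lambda height: triangular_membership(height, 165.0, 185.0, 205.0)
```

A height of 175 cm is then "tall" to degree 0.5, in contrast to the crisp Boolean predicate height > 180.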
1.2.2.3 Bayesian Network

A Bayesian network is a graphical probabilistic model consisting of a finite set of nodes and edges, and it represents knowledge as a joint probability distribution over a set of propositional variables; the nodes denote the propositional variables, the edges denote causal relationships between the variables, and the topology of the network encodes the qualitative knowledge of the domain of interest [62]. The construction of a Bayesian network often requires extensive expert knowledge, especially in the military domain where data is typically lacking [39]. Bayesian networks have been applied in many diverse fields, and a few references in high level information fusion and defense related fields are summarized here: [25] applies a Bayesian belief network to battlefield situation awareness; Kruegel et al. [47] proposed a multi-sensor fusion approach where the outputs of different intrusion detection system sensors are aggregated to produce a single alarm; Bayesian inference is employed to determine if traffic bursts contain cyber-attacks [75]; and in [73], a graphical belief network representation is adopted to deal with uncertainty and entity specific behavioural specifications for high level information fusion. A process for obtaining the topology of a Bayesian network from a domain expert's knowledge can be found in [24], and the learning of the network is described in [59]. However, even though the Bayesian network is a general mechanism for modeling complex systems, some issues that make it inapplicable here are as follows: the performance of the system is sensitive to the choice of the prior distribution, the complexity of the network topology often grows exponentially with the number of targets, and new Bayesian networks have to be constructed in real time as the scenario changes.
Even though dynamic construction of Bayesian networks is possible [16], the complexity of constructing plan hypotheses (the key feature of dynamic Bayesian network construction) is prohibitive for the setting of our interest.

1.3 Background of SCFG and Its Applications

In the previous section, many techniques were surveyed and their advantages and disadvantages discussed. The observation is that research in interpreting track estimates involves both numeric and symbolic data. However, while numeric algorithms are efficient at processing sequential data, their capability to deal with symbolic inference is limited; the attempt to model meta descriptions usually comes with increased computational complexity. The situation is similar for the symbolic algorithms: while they are expressive in representing many complex model formulations, their capability to deal with sequential data is limited. In this section, SCFG is introduced, and it is believed that SCFG serves as a practical balance between numeric and symbolic processing.

1.3.1 Stochastic Grammar and Its Statistical Properties

Let us define a formal language as any set of strings consisting of concatenations of symbols, and the set of features as a complete set of distinguishable symbols in the language. For example, a feature set might be T = {a, b}, and one language over this set might consist of all finite (or null) repetitions of the combination 'ab' followed by either 'b' or 'aa'; in this language, the strings 'b', 'aa', 'ababaa' and 'ababb' are valid strings but 'aba' is not. In order to compactly represent the language generation mechanism, a grammar is used. A probabilistic grammar is a 4-tuple G = {N, T, P, S}, where

• N is a finite set of nonterminals

• T is a finite set of terminals

• P is a finite set of production rules

• S ∈ N is the start symbol

The terminals represent the features, and the nonterminals denote the meta level information.
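The example language above happens to be regular, so its membership test can be sketched directly with a regular expression, here (ab)*(b|aa) under the description given in the text:

```python
import re

# The language: finitely many (possibly zero) repetitions of 'ab',
# followed by either 'b' or 'aa'.
LANGUAGE = re.compile(r"(ab)*(b|aa)")

def in_language(s):
    # fullmatch requires the whole string to be generated by the pattern.
    return LANGUAGE.fullmatch(s) is not None
```

The point of the grammar formalism introduced next is that it extends this kind of compact description beyond what regular expressions can capture.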
Production rules specify the generation of the language, and they model the effects of specialization and decomposition. Grammars are divided into four different types according to the forms of their production rules [19; 20; 27], and they are organized in a hierarchy that is known in the literature as the Chomsky hierarchy of grammars. Fig. 1.8 illustrates the four types of grammar: regular grammar, context free grammar, context sensitive grammar, and unrestricted grammar.

Figure 1.8: The Chomsky hierarchy of formal languages.

Let A be a nonterminal and a a terminal; the properties of the grammar types are summarized below:

• Regular has production rules of the form A → aA or A → a. The left-hand side of the production rule must contain one nonterminal only, and the right-hand side may be either one terminal or one terminal followed by one nonterminal.

• Context-Free has production rules of the form A → η, where η ∈ (N ∪ T)+. The left-hand side must contain one nonterminal only, whereas the right-hand side can be a sequence containing an arbitrary combination of terminals and nonterminals.

• Context-Sensitive has production rules of the form α1 A α2 → α1 η α2, where α1, α2 ∈ (N ∪ T)*, and η ≠ ε. The allowed transformations of nonterminal A depend on its context α1 and α2.

• Unrestricted has production rules of the form α1 A α2 → γ, where α1, α2, γ ∈ (N ∪ T)*. No restrictions are placed on the production rules.

Regular grammar is analogous to the state space model because of its left to right dependency, and context free grammar is a more general model that includes regular grammar. More specifically, a stochastic regular grammar is equivalent to a hidden Markov model with terminal states (more detail can be found in [14]).
The thesis focuses on the exploitation of context free rules in signal processing, and the more general grammars are not considered for the following two reasons: 1) context free rules are intuitive for human knowledge codification, and 2) polynomial time algorithms are available to optimally parse strings generated by regular and context free grammars, whereas parsing of context sensitive languages is NP-complete, and there is no parser for unrestricted languages that is guaranteed to stop [67]. More discussion of languages and grammars is presented in Chapter 2. A context free grammar has production rules P of the form A → η, where A ∈ N and η ∈ (N ∪ T)+; the notation Σ+ indicates the set of all finite length strings of symbols from a finite set of symbols Σ, excluding the string of length 0. The rule A → η indicates the replacement of the nonterminal A by η. In addition, as shown in [20], any context free grammar may be reduced to Chomsky Normal Form, which has production rules of the form Ai → Aj Ak and Ai → w, where Ai, Aj, Ak ∈ N and w ∈ T. An example of a context free grammar in Chomsky Normal Form consists of the following elements:

T = {a, b}, N = {A0, A1}, S = {A0}, P = {A0 → A0 A1 | b, A1 → a},

where the bar | separates two production rules, meaning that the nonterminal A0 may be mapped to either A0 A1 or b. Starting from the nonterminal A0, strings can be derived by applying production rules to iteratively replace nonterminal symbols with substrings. The preceding example admits the following derivations:

A0 ⇒ b
A0 ⇒ A0 A1 ⇒ bA1 ⇒ ba, etc.

In order to make the model more robust to uncertainties in the process, and also to capture random effects in the model, each production rule is associated with a probability value.
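The derivations above can be carried out mechanically. The following minimal sketch applies leftmost rewriting to the example grammar; the rule choices are supplied explicitly, since the grammar leaves the choice of production open at each step (the function and variable names are ours).

```python
# Example CNF grammar from the text: A0 -> A0 A1 | b,  A1 -> a.
RULES = {"A0": [["A0", "A1"], ["b"]], "A1": [["a"]]}
TERMINALS = {"a", "b"}

def derive(choices):
    """Run a leftmost derivation from A0, where choices[i] selects which
    production to apply at the i-th rewriting step."""
    sentential = ["A0"]
    for choice in choices:
        # Find the leftmost nonterminal and replace it by the chosen rule.
        i = next(k for k, sym in enumerate(sentential) if sym not in TERMINALS)
        sentential[i:i + 1] = RULES[sentential[i]][choice]
    return "".join(sentential)

# A0 => b
short = derive([1])
# A0 => A0 A1 => b A1 => b a
longer = derive([0, 1, 0])
```

Each list of choices corresponds to one derivation, i.e. one parse tree, of the resulting string.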
Let A be a nonterminal in N; the probability of its production rule A → η in P is denoted P(A → η), and the probabilities must satisfy

∑_{η ∈ Θ} P(A → η) = 1,

where Θ is the set of all right hand sides for A in P. For example, a stochastic version of the grammar given above may be developed by assigning the following probabilities to the production rules:

A0 → A0 A1 with probability 0.8, A0 → b with probability 0.2,
A1 → aA1 with probability 0.1, A1 → a with probability 0.9.

In addition to probabilities, many other techniques exist to extend the modeling power of the grammar. For example, a discriminative CFG is a 4-tuple {N, T, P, S} whose model elements are identical to SCFG, except that instead of a probability, a feature based score S(Pi) is defined on each production rule. Denoting a parse by T, a parse tree then has score ∑_{Pi ∈ T} S(Pi). To link it to SCFG, the probability of the production rule Pi can be expressed as

(1/Z) exp ∑_{k=1}^{F} λk(Pi) fk(w1, w2, . . . , wN, Pi),

where {fk, λk}_{k=1}^{F} is a set of features and their associated weights. Another variation is the semantic SCFG. A semantic grammar G is also a four-tuple (N, T, P, S), where T = (T, IT) and P = (P, IP) [46]. The sets N, T, P and S are the same as for the SCFG defined above. IT is a terminal dependent function IT : R → R, and IP is the function associated with the production rules. The functions allow the computation of an attribute value for each derivable parse tree of the grammar.
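A minimal sketch of sampling strings from the stochastic grammar above, using the rule probabilities just assigned (the helper names are ours):

```python
import random

# Stochastic grammar: A0 -> A0 A1 (0.8) | b (0.2);  A1 -> a A1 (0.1) | a (0.9).
SCFG = {
    "A0": [(["A0", "A1"], 0.8), (["b"], 0.2)],
    "A1": [(["a", "A1"], 0.1), (["a"], 0.9)],
}

def sample(symbol="A0", rng=random):
    """Recursively expand a nonterminal by drawing production rules
    according to their probabilities; terminals are returned as-is."""
    if symbol not in SCFG:
        return symbol
    rules, weights = zip(*SCFG[symbol])
    rhs = rng.choices(rules, weights=weights)[0]
    return "".join(sample(s, rng) for s in rhs)

rng = random.Random(0)
strings = [sample(rng=rng) for _ in range(5)]
```

Every string this grammar generates consists of a single b followed by zero or more a's; termination of the recursion with probability one is exactly the finiteness condition on the stochastic mean matrix discussed later in the chapter.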
The survey is also purposely kept brief, as more comprehensive and application specific surveys are provided in each of the chapters. One of the earliest developments of grammar started in the field of language processing. Initial developments in language processing focused on acoustic and speech modeling; the relevant techniques were spectral analysis, linear predictive coding [22], and hidden Markov models [38]. However, as the research reached the language level, the complexity of language demanded more sophisticated models. Stochastic grammar found its place in language modeling, where its syntax naturally models the structure of language [57]. The semantic meaning of a string of words is provided by its syntactic structure, which is constructed based on a context free grammar [15]. The knowledge of English grammar is coded into the production rules for human-machine interaction and automatic translation. An example grammar is provided next to show the structure of a grammar and its relation to language processing.

N = {SENTENCE, NOUN PHRASE, VERB PHRASE, ADJECTIVE, ADVERB, NOUN, VERB}
T = {green, ideas, sleep, furiously}
S = {SENTENCE}
P = {SENTENCE → NOUN PHRASE VERB PHRASE,
    NOUN PHRASE → ADJECTIVE NOUN PHRASE,
    VERB PHRASE → VERB PHRASE ADVERB,
    ADJECTIVE → green,
    NOUN → ideas,
    VERB → sleep,
    ADVERB → furiously}

Another field closely related to language processing is the construction of compilers [1]. Compilers have to parse the syntactic structure of code, induce its semantic meaning, and generate the corresponding machine assembly code. As an example, a snippet of grammar that identifies the standard control flow construct, IF-THEN, in many programming languages is expressed as

STMT → if COND then MATCHSTMT | MATCHSTMT
MATCHSTMT → if COND then MATCHSTMT else STMT | SIMPLESTMT
COND → C1 | C2 | C3
SIMPLESTMT → A1 | A2 | A3

where C refers to a condition, and A to an action.
STMT is an IF-THEN statement, and SIMPLESTMT is a simple statement that can be executed directly. MATCHSTMT, on the other hand, allows recursive embedding of IF statements. An arbitrary number of embedded statements can be modeled with this compact set of rules. It should be noted that this is one of the dependencies that Markov models such as the HMM have difficulty representing. Moreover, for compiler related grammars, because preciseness is one of the inherent properties of code, ambiguity of the parse trees plays an important role in the construction of grammars in this field. In addition to language processing, grammar has also found its use in biology for DNA and RNA sequencing. DNA and RNA consist of strings of nucleotides or amino acids, and the strings often manifest complicated structure as they fold into certain spatial arrangements. The approach to the analysis of biological sequences is similar to that of speech processing. Initially the biological sequences were treated as one-dimensional strings of symbols, and hidden Markov models were applied to perform detection and classification. However, when the one-dimensional assumption is removed and the three-dimensional folding of the proteins and nucleic acids has to be modeled, the HMM becomes insufficient, and more sophisticated models such as context free grammars become essential because of the long range dependencies that result from the complicated spatial folding [27]. The terminals are, for example, the types of amino acids, and the nonterminals are the categorized spatial structures. An example of a sequencing structure that specifies the various relationships among the amino acids is given below [67]:

S → aSb (base pairs)
S → aS | Sa | a (unpaired bases)
S → S (deletions)
S → SS (branching secondary bases)

1.3.3 SCFG and Tracking

From the literature survey in Sec. 1.2.1 and Sec.
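The base-pairing rule S → aSb enforces a long-range, nested dependency: the k-th leftmost symbol must match the k-th rightmost one, which no Markov chain over the symbols can capture. The following sketch (our own illustration, not from the cited work) derives and checks such nested pairings:

```python
def derive_pairs(n):
    """Apply S -> aSb n times and then S -> a once, yielding a string
    whose k-th leftmost 'a' pairs with the k-th rightmost 'b'."""
    s = "S"
    for _ in range(n):
        s = s.replace("S", "aSb")
    return s.replace("S", "a")  # terminate with a single unpaired base

def well_nested(s):
    """Check the nesting: strip matched outer a...b pairs until none remain;
    any leftover may only consist of unpaired 'a' symbols."""
    while len(s) >= 2 and s[0] == "a" and s[-1] == "b":
        s = s[1:-1]
    return "b" not in s

word = derive_pairs(3)  # "aaaabbb": three matched pairs around an unpaired a
```

In RNA terms, the matched a...b pairs play the role of complementary bases brought together by spatial folding.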
1.2.2, the inclusion of domain knowledge as meta level information in tracking can greatly enhance the tracking quality, both for the kinematic estimation in numeric tracking and the intent estimation in symbolic tracking. In this thesis, we focus on the application of SCFG to characterize and represent meta information. SCFG has been applied extensively in modeling the complex structure of mechanisms that generate sequential observations, where the common theme has been to exploit the expressivity of the grammatical production rules to characterize the generative mechanism of the signal source. This concept is directly applicable to tracking temporal processes, and a survey of SCFG based tracking methodologies is provided in this section. The applications of grammar most closely related to the work presented in the thesis originate mostly from the plan recognition and video surveillance communities. Even though many tracking related applications have been developed based on SCFG, research in combining Bayesian tracking and SCFG is limited. It has been shown that grammar is particularly suited to tracking state sequences with complex multi-scale structure and recursive nature. For example, in plan recognition, the plans of an agent, typically his actions, have to be inferred from observations. [16] approached the problem with a Bayesian network formulation, but due to the complex structure generating the actions, it is too computationally intensive. An extension of the work is found in [64], where grammar is applied to model the underlying plan. The actions are modeled as a pattern observed in noise, and stochastic grammar is applied to recognize the patterns.
In addition, in the video surveillance domain, the hierarchical hidden Markov model has been applied to track sequences of human actions such as "Preparing for dinner" and "Getting food from the fridge" [55], and it can be shown that the hierarchical hidden Markov model is a special case of SCFG [29]. Moreover, in [56], movements of targets such as U-turns are inferred based on measurements collected from a sensor network, and the hierarchical structure of the fusion process is relevant in its formulation. In [35], SCFG is applied to video surveillance problems where meta models such as dropping a person off or picking a person up in a parking lot are detected. A subset of the parking lot monitoring grammar is shown below to illustrate the use of grammar.

CAR_TRACK → CAR_THROUGH | CAR_PICKUP | CAR_OUT | CAR_DROP
CAR_PICKUP → ENTER_CAR CAR_STOP PERSON_LOST CAR_EXIT
CAR_EXIT → car_exit | SKIP car_lost

The generation of the terminals depends on the detected location of the vehicle, and the events of interest are cars passing through and cars picking up a person. The sequence of detected events is parsed, and the resulting parse tree yields the interpretation. The drawback, however, is that the coupling between the grammatical level and the primitive feature generation level is very loose, as the track derived modes, i.e. a1:k, are independently generated from the sensor measurements, and the temporal constraints are imposed only at the higher inference level.

1.4 Meta Level Tracking Formulation in SCFG

From the previous sections, it is observed that the limitation of the numeric processing techniques is that they do not naturally encompass attributes and are not efficient in performing high
The solution developed in this thesis combines the numeric and symbolic processing in a two layer approach. Meta level tracking builds on top of conventional Bayesian tracking, and aims to infer meta level description from the track estimates. The goal of meta level tracking is to compute the meta level information with the highest posterior probability, i.e.  θˆ = arg max P(θ |a˜1:k ; Gθ ), where a˜1:k = arg max P(a1:k |z1:k ; GRG θ ) θ ∈Θ  a  (1.4)  where Θ is a set of meta level descriptors of interest, and a˜1:k is the estimated feature sequence that can be computed via conventional Bayesian tracking algorithms. Equation (1.4) demonstrates the two fundamental components in a meta level tracking system: • Meta Level Tracker. The term ) P(θ |a1:k ; GCFG θ  (1.5)  is the posterior probability of the meta level descriptor given the feature sequence and it allows for the classification of the feature sequence to a meta level descriptor. The specification of this component requires codification of the domain knowledge into SCFG production rules. • Numeric-Symbol Bayesian Tracker. The estimates of the feature sequence a˜1:k are computed via conventional Markov models. Throughout this thesis, we will approxi) by P(a1:k |z1:k ; GRG mate P(a1:k |z1:k ; GCFG θ θ ), i.e. approximate a SCFG model to its regular grammar counterpart. This approximation is described in detail in Chapter2, and it is important for the following reasons: Tracking feature sequence with long range dependency assumption is computationally very intensive, as exponentially increasing number of filters is required. In addition, the approximation not only makes the computation trackable, it also facilitates tracking with existing legacy Bayesian trackers. The details of the two terms are application specific, and they will be discussed in each of the chapters. 
According to the discussion presented, the system framework of the meta level tracking system that implements the two layer approach is summarized in Fig. 1.9. It consists of five components: an observation system, a numeric-symbol Bayesian tracker, a meta level tracker, a sensor manager, and a knowledge-base. The observation system may be an ESM (electronic support measure) radar, or an acoustic sensor in a sensor network. The numeric-symbol Bayesian tracker is a component that maps the raw sensor measurements to primitive features that the meta level tracker understands; the meta level tracker imposes domain expert knowledge on the estimated track to interpret its higher level meaning.

Figure 1.9: System framework of the meta level tracking.

The sensor manager tunes the parameters of the observation system and the numeric-symbol Bayesian tracker based on the inferred knowledge of the track. Lastly, the knowledge-base is a codification of the expert's domain specific knowledge in SCFG production rules. As an illustrative example, the functionalities of the components can be imagined as an image understanding system. Pixels are grouped by the numeric-symbol Bayesian tracker into higher-level constructs such as lines and edges, and the meta level tracker studies the relations among the constructs and labels the image accordingly. The sensor manager would focus the tracker on the informative part of the image based on the prior knowledge, where the knowledge may include known objects and their structural information. It should be noted that the maintainability of the system is greatly enhanced by separating the knowledge base from the reasoning process.

1.4.1 Reasoning with Stochastic Grammar

This section discusses the procedure for formulating a meta level tracker with SCFG.
A number of issues in developing a syntactic pattern recognition framework for the meta level tracker are summarized below [5]:

• Selection of grammar type - deterministic, stochastic, or a hybrid grammar,

• Selection of description complexity - string, tree, or graph grammar,

• Selection of an optimal set of primitives - the number of primitives is a tradeoff between complexity and description power; too many may lead to an unmanageably large set of rules, and too few may result in a lack of description power,

• Grammatical inference - learning the rules of the grammar and their associated probabilities for a pattern class from sample data, and

• Parsing - determining if a given sentence is generated by a certain grammar.

The first three steps concern the definition and construction of the grammar for a particular pattern recognition problem, and they were discussed in Sec. 1.3.1. The last two steps, on the other hand, are about model identification and estimation, and they are discussed in this section. The reasons supporting the formulation of meta level tracking with SCFG are summarized in the next section. Grammatical inference consists of two basic problems: 1) identification of the model structure, and 2) estimation of the model parameters. Model structure identification is still an open research area; an early result states that a context free grammar cannot be learned from positive samples alone [31], and the problem is especially difficult because the number of possible grammatical structures for a given positive sample is exponential in its length. [68] attempted the problem by adding structure to the training data, i.e., strings with parentheses inserted to indicate the shape of the derivation tree of the grammar. Moreover, another attempt at the problem applies genetic programming, trying to infer directly both the rules and the production rules [41].
However, it is impractical to assume the availability of a large set of training data, especially structured examples, when the problem domain is in the defense setting. Given the constraints and the theoretical limitations, the grammatical inference problem addressed in the thesis concentrates on the estimation of model parameters, and the construction of the model structure relies on the expertise of the domain experts. Manual construction of the grammar relies on three basic techniques: matching, recursive relations, and the introduction of nonterminals for specific languages. The matching technique captures the syntactic constraints of the grammar. For example, for a palindrome, the kth leftmost element of the input string must be equal to the kth rightmost symbol, and such a matching constraint is enforced by rules such as S → aSa and S → bSb. The recursive relation aims to express a string as a concatenation of other strings, which implies the structure of the production rules. The introduction of nonterminals, on the other hand, allows the designer to express relations among different languages, and the relations are codified in terms of production rules. Bayesian estimation is the chosen technique for the estimation of the model parameters. The optimal parameter values are those maximizing the likelihood of the parse trees. Let τ be a parse tree, and f(A → α; τ) the number of times the rule A → α is used in generating τ. The probability of the tree is computed with the following expression [18]:

p(τ) = ∏_{(A→α)∈R} p(A → α)^{f(A→α;τ)} (1.6)

Even though such a probability assignment can be improper, i.e., the total probability of parses may be less than one, it is proven that the production rule probabilities estimated by maximum likelihood estimation always impose proper distributions [18].
The EM iteration to com-  27  pute the maximum likelihood estimate of the production rule probability is p(A ˆ → η) =  ∑ni=1 E{ f (A → η ; w)} , ∑η s.t.(A→η ∈P) ∑ni=1 E{ f (A → η ; w)}  where w is string of terminals given to train the grammar. The trained model can be evaluated by computing the entropy of the resulting parse [48], and the empirical entropy can be expressed as ∑K log p(wk ) . H(τ ) = − k=1K ∑k=1 |wk | One practical issue of modeling with SCFG is that the signal generated has finite length, and this finiteness constraint must be satisfied if the model is to be stable. In addition, the finiteness criteria provides a constraint on the SCFG model parameters, which may be used as a bound on the parameter values. We discuss this point by first defining the stochastic mean matrix. Definition Let A, B ∈ N, the stochastic mean matrix MN is a |N| × |N| square matrix with its (A, B)th entry being the expected number of variables B resulting from rewriting A: MN (A, B) =  ∑  η ∈(N∪T )∗ s.t.(A→η )∈P  P(A → η )n(B; η )  where P(A → η ) is the probability of applying the production rule A → η , and n(B; η ) is the number of instances of B in η [17]. The finiteness constraint is satisfied if the grammar in each state satisfies the following theorem. Theorem If the spectral radius of MN is less than one, the generation process of the stochastic context free grammar will terminate, and the derived sentence is finite. Proof The proof can be found in [17]. Given the model specification, the interpretation of the observations can be formulated as a parsing problem. Parsing is the process of finding the derivation in the grammar that generates the sensor measurements or the terminals. Parsers can be roughly categorized into two types: top down and bottom up. The top down parser is goal-oriented; it starts with the start symbol and successively applies the production rules until the parse tree matches the input terminal string. 
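To make the tree probability of Eq. (1.6) and the stochastic mean matrix concrete, the sketch below works them out for a small toy SCFG. The grammar, its rules, and its probabilities are illustrative assumptions chosen for this example, not taken from the thesis.

```python
import numpy as np

# Hypothetical toy SCFG over nonterminals N = {S, A} and terminals {a, b}.
# (lhs, rhs): probability, where rhs is a tuple of symbols.
rules = {
    ("S", ("a", "S", "A")): 0.4,
    ("S", ("b",)):          0.6,
    ("A", ("a", "A")):      0.3,
    ("A", ("b",)):          0.7,
}
nonterminals = ["S", "A"]

# Eq. (1.6): p(tau) is the product over all rules of
# p(A -> alpha) raised to f(A -> alpha; tau), the usage count in tree tau.
def tree_probability(rule_counts):
    p = 1.0
    for rule, count in rule_counts.items():
        p *= rules[rule] ** count
    return p

# Rule counts of the parse tree for "abb":  S => aSA => abA => abb
counts = {("S", ("a", "S", "A")): 1, ("S", ("b",)): 1, ("A", ("b",)): 1}
print(tree_probability(counts))  # 0.4 * 0.6 * 0.7 ≈ 0.168

# Stochastic mean matrix M_N(A, B): expected number of B's produced by one
# rewrite of A, i.e. sum over rules A -> eta of P(A -> eta) * n(B; eta).
M = np.zeros((len(nonterminals), len(nonterminals)))
for (lhs, rhs), prob in rules.items():
    i = nonterminals.index(lhs)
    for sym in rhs:
        if sym in nonterminals:
            M[i, nonterminals.index(sym)] += prob

# The derivation process terminates (finite sentences) if the
# spectral radius of M_N is less than one.
rho = max(abs(np.linalg.eigvals(M)))
print(rho < 1)
```

Here M_N = [[0.4, 0.4], [0, 0.3]], so the spectral radius is 0.4 < 1 and the toy grammar satisfies the finiteness criterion; raising the probability of the self-embedding rules would eventually push the radius past one.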
The bottom-up parser, on the other hand, is data-oriented; it starts with the input terminal string and applies the production rules backward until the start symbol is reached [30; 57]. The parsing algorithms are described in detail in Chapter 4.

1.4.2 Why SCFG for Meta Level Tracking

In this section, the objective is to explicitly state the advantages of the SCFG and list the model structures that support the formulation of meta level tracking. The underlying knowledge representation of SCFG is based on two principles: specialization and decomposition. Decomposition follows the principle of compositionality [36]: complex patterns can be decomposed into subpatterns, and the level of abstraction stops at primitive features where no further decomposition is possible [30; 57]. Specialization, on the other hand, allows abstraction of concepts to provide interpretation of the sensor measurements. In this representation, not only can a large set of complex patterns be described compactly by a finite number of primitive features and grammatical rules, but abstract interpretations of these patterns can also be introduced. Direct consequences of the SCFG model structure are as follows:
• Intuitive modeling of processes with complicated hierarchical structure and multiple length scales. With a state space model, the modeling of sequential data with different modes requires a different state space or a statistical model for each time interval spanned by that mode. When multiple modes are present, competing hypotheses need to absorb different lengths of the input stream, raising the need for naturally supported temporal segmentation. The decomposition of processes into subpatterns is naturally supported by the SCFG formulation.
• Intuitive modeling of processes with semantically equivalent meanings but radically different statistical properties.
The abstraction principle supported by SCFG production rules naturally supports this modeling capability, as each nonterminal can be mapped to different terminals.
• Intuitive and efficient codification of human knowledge. Most often, the structure of the process is difficult to learn but is explicit and a priori known by the domain experts. Decomposition and abstraction are logical mechanisms for domain experts to codify their knowledge. An extra benefit of being able to decompose the process structure is that SCFG is applicable to processes with limited training data. In many real applications, even though training data for the complete process is scarce, training data for the different subpatterns are often readily available.
• Increased maintainability of the tracking system with separated knowledge base and reasoning mechanism [72]. The SCFG formulation separates knowledge representation (production rules) from reasoning mechanisms such as parsing. This separation is valuable to meta level tracking as the meta knowledge is application specific and often maintained by experts other than the human operators using it.
• SCFGs are more general models than Markov models, and their trained models have lower entropies [48]. From formal language theory, it is proven that the Markov model is the most restrictive grammar, and it is incapable of expressing sequential data with branching and recursively embedded dependencies [20; 21].

Bibliography

[1] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, 2007.
[2] N. Aifanti and A. Delopoulos. Fuzzy-logic based information fusion for image segmentation. In IEEE International Conference on Image Processing, pages 1210–1213, 2005.
[3] B. D. O. Anderson. Optimal Filtering. Prentice-Hall, 1979.
[4] Y. Bar-Shalom and X. Li. Estimation and Tracking: Principles, Techniques, and Software. Artech House, 1993.
[5] M. Basu, H. Bunke, and A. Del Bimbo.
Guest editors’ introduction to the special section on syntactic and structural pattern recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 1009–1012, 2005.
[6] S. S. Blackman. Multiple hypothesis tracking for multiple target tracking. IEEE Aerospace and Electronic Systems Magazine, 19(1):5–18, 2004.
[7] S. S. Blackman and R. Popoli. Design and Analysis of Modern Tracking Systems. Artech House, 1999.
[8] Y. Boers, H. Driessen, and J. Zwaga. Adaptive MFR parameter control: fixed against variable probabilities of detection. IEE Proceedings Radar, Sonar and Navigation, pages 2–6, 2006.
[9] M. Brand. Coupled hidden Markov models for modeling interacting processes. Technical report, MIT Media Lab, June 1997.
[10] A. R. Brenner and J. H. G. Ender. Demonstration of advanced reconnaissance techniques with the airborne SAR/GMTI sensor PAMIR. In IEE Proceedings of Radar, Sonar and Navigation, volume 153, pages 152–162, 2006.
[11] S. M. Bridges and R. B. Vaughn. Fuzzy data mining and genetic algorithms applied to intrusion detection. In Proceedings of the National Information Systems Security Conference, 2000.
[12] R. A. Brooks. A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, RA-2(1):14–23, March 1986.
[13] H. H. Bui, S. Venkatesh, and G. West. Tracking and surveillance in wide-area spatial environments using the abstract hidden Markov model. International Journal of Pattern Recognition and Artificial Intelligence, 15, 2001.
[14] F. Casacuberta. Some relations among stochastic finite state networks used in automatic speech recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(7):691–695, 1990.
[15] E. Charniak. Statistical Language Learning. MIT Press, 1993.
[16] E. Charniak and R. Goldman. A Bayesian model of plan recognition. Artificial Intelligence, 64:53–79, 1993.
[17] Z. Chi. Statistical properties of probabilistic context-free grammars.
Computational Linguistics, 25:131–160, 1999.
[18] Z. Chi and S. Geman. Estimation of probabilistic context-free grammars. Computational Linguistics, 24(2):299–305, 1998.
[19] N. Chomsky. Three models for the description of language. IRE Transactions on Information Theory, 2(3):113–124, 1956.
[20] N. Chomsky. On certain formal properties of grammars. Information and Control, 2(2):137–167, June 1959.
[21] N. Chomsky and G. A. Miller. Finite state languages. Information and Control, 1(2):91–112, May 1958.
[22] J. Coleman. Introducing Speech and Language Processing. Cambridge University Press, 2005.
[23] D. D. Corkill. Representation and contribution - integration challenges in collaborative situation assessment. In Fusion, volume 1, 2005.
[24] B. Das. Representing uncertainties using Bayesian networks. Technical report, Information Technology Division, Electronics and Surveillance Research Laboratory, 1999.
[25] S. Das, R. Grey, and P. Gonsalves. Situation assessment via Bayesian belief networks. In Proceedings of the 5th International Conference on Information Fusion, pages 664–671, 2002.
[26] J. E. Dickerson and J. A. Dickerson. Fuzzy network profiling for intrusion detection. In Proceedings of the 19th International Conference of the North American Fuzzy Information Processing Society, 2000.
[27] R. Durbin, S. Eddy, A. Krogh, and G. Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.
[28] C. Duron and J. M. Proth. Multifunction radar: Task scheduling. Journal of Mathematical Modelling and Algorithms, 1(2):105–116, June 2002.
[29] S. Fine, Y. Singer, and N. Tishby. The hierarchical hidden Markov model: Analysis and applications. Machine Learning, 32:41–62, 1998.
[30] K. S. Fu. Syntactic Pattern Recognition and Applications. Prentice-Hall, Englewood Cliffs, NJ, 1982.
[31] E. M. Gold. Language identification in the limit. Information and Control, 10:447–474, 1967.
[32] K. K. R. G. K.
Hewawasam, K. Premaratne, and M. L. Shyu. Rule mining and classification in a situation assessment application. IEEE Transactions on Systems, Man and Cybernetics, 37(6):1446–1459, 2007.
[33] R. R. Hoffman and J. F. Yates. Decision making. IEEE Intelligent Systems Magazine, 20:76–83, 2007.
[34] C. Hue, J. L. Cadre, and P. Perez. Sequential Monte Carlo methods for multiple target tracking and data fusion. IEEE Transactions on Signal Processing, 50(2):309–325, 2002.
[35] Y. A. Ivanov and A. F. Bobick. Recognition of visual activities and interactions by stochastic parsing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:852–872, 2000.
[36] T. M. V. Janssen. Compositionality. In Handbook of Logic and Language, pages 417–473. MIT Press, 1997.
[37] E. T. Jaynes. Probability Theory: The Logic of Science. Cambridge University Press, 2004.
[38] F. Jelinek. Statistical Methods for Speech Recognition. MIT Press, 1997.
[39] F. Johansson and G. Falkman. Implementation and integration of a Bayesian network for prediction of tactical intention into a ground target simulator. In International Conference on Information Fusion, pages 1–7, 2006.
[40] B. H. Juang and S. Furui. Automatic recognition and understanding of spoken language - a first step toward natural human-machine communication. Proceedings of the IEEE, 88(8):1142–1165, 2000.
[41] T. E. Kammeyer and R. K. Belew. Stochastic context-free grammar induction with a genetic algorithm using local search. In Foundations of Genetic Algorithms IV, pages 3–5, 1996.
[42] J. Kelemen and G. Paun. Robustness of decentralized knowledge systems: A grammar-theoretic view. Journal of Experimental & Theoretical Artificial Intelligence, 12:91–100, 2000.
[43] J. Kepler. New Astronomy. Cambridge University Press, 1992.
[44] T. Kirubarajan, Y. Bar-Shalom, K. R. Pattipati, and I. Kadar. Ground target tracking with variable structure IMM estimator. IEEE Transactions on Aerospace and Electronic Systems, 36:26–46, 2000.
[45] R. Klemm. Space-Time Adaptive Processing.
IEE Press, Stevenage, UK, 1998.
[46] D. E. Knuth. Semantics of context-free languages. Theory of Computing Systems, 2(2):127–145, June 1968.
[47] C. Kruegel, D. Mutz, W. Robertson, and F. Valeur. Bayesian event classification for intrusion detection. In Proceedings of the 19th Annual Computer Security Applications Conference, 2003.
[48] K. Lari and S. J. Young. The estimation of stochastic context free grammars using the Inside-Outside algorithm. Computer Speech and Language, 4:35–56, 1990.
[49] X. R. Li. Multiple-model estimation with variable structure - Part II: Model-set adaptation. IEEE Transactions on Automatic Control, 45(11):2047–2060, 2000.
[50] X. R. Li and V. P. Jilkov. Survey of maneuvering target tracking. Part I: Dynamic models. IEEE Transactions on Aerospace and Electronic Systems, 39:1333–1364, 2003.
[51] Y. M. Liang, S. W. Shih, A. Shih, M. Liao, and C. C. Lin. Learning atomic human actions using variable-length Markov models. IEEE Transactions on Systems, Man, and Cybernetics - Part B, 39(1):268–280, 2009.
[52] S. H. Liao. Case-based decision support system: architecture for simulating military command and control. European Journal of Operational Research, pages 558–567, 2000.
[53] P. U. Lima and G. N. Saridis. Intelligent controller as hierarchical stochastic automata. IEEE Transactions on Systems, Man, and Cybernetics - Part B, 29:151–163, 1999.
[54] L. Lin, Y. Bar-Shalom, and T. Kirubarajan. New assignment-based data association for tracking move-stop-move targets. IEEE Transactions on Aerospace and Electronic Systems, 40(2):714–725, 2004.
[55] S. Luhr, H. H. Bui, S. Venkatesh, and G. A. W. West. Recognition of human activity through hierarchical stochastic learning. In Proceedings of the First IEEE International Conference on Pervasive Computing and Communications, 2003.
[56] D. Lymberopoulos, A. S. Ogale, A. Savvides, and Y. Aloimonos. A sensory grammar for inferring behaviors in sensor networks.
In International Conference on Information Processing in Sensor Networks, pages 251–259, 2006.
[57] C. D. Manning and H. Schütze. Foundations of Statistical Natural Language Processing. The MIT Press, 1999.
[58] S. Miller, D. Stallard, R. Bobrow, and R. Schwartz. A fully statistical approach to natural language interfaces. In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, pages 55–61, 1996.
[59] R. E. Neapolitan. Learning Bayesian Networks. Prentice Hall, 2003.
[60] D. F. Noble. Template based data fusion for situation assessment. In 1987 Tri-Service Data Fusion Symposium, pages 156–161, 1987.
[61] N. M. Oliver, B. Rosario, and A. P. Pentland. A Bayesian computer vision system for modeling human interactions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):831–843, 2000.
[62] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers, 1988.
[63] G. M. Powell. Tactical situation assessment challenges and implications for computational support. In Fusion, volume 1, 2005.
[64] D. V. Pynadath and M. P. Wellman. Probabilistic state-dependent grammars for plan recognition. In Proceedings of the 16th Annual Conference on Uncertainty in Artificial Intelligence, pages 507–514, 2000.
[65] L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, February 1989.
[66] B. Ristic, S. Arulampalam, and N. Gordon. Beyond the Kalman Filter: Particle Filters for Tracking Applications. Artech House, 2004.
[67] E. Rivas and S. R. Eddy. The language of RNA: a formal grammar that includes pseudoknots. Bioinformatics, 16:334–340, 2000.
[68] Y. Sakakibara. Efficient learning of context-free grammars from positive structural examples. Information and Computation, 97:23–60, 1992.
[69] G. N. Saridis and J. H. Graham. Linguistic decision schemata for intelligent robots.
Automatica, 20:121–126, 1984.
[70] A. Sayed and T. Kailath. A state space approach to adaptive RLS filtering. IEEE Signal Processing Magazine, 1994.
[71] J. A. Stover, D. L. Hall, and R. E. Gibson. A fuzzy-logic architecture for autonomous multisensor data fusion. IEEE Transactions on Industrial Electronics, 43(3):403–410, 1996.
[72] I. S. Torsun. Foundations of Intelligent Knowledge-Based Systems. Academic Press, 1995.
[73] H. Tu, J. Allanach, S. Singh, K. R. Pattipati, and P. Willet. Information integration via hierarchical and hybrid Bayesian networks. IEEE Transactions on Systems, Man and Cybernetics, 36(1):19–33, 2006.
[74] M. Ulmke and W. Koch. Road-map assisted ground moving target tracking. IEEE Transactions on Aerospace and Electronic Systems, 42(4):1264–1274, 2006.
[75] A. Valdes and K. Skinner. Adaptive, model-based monitoring for cyber attack detection. In Recent Advances in Intrusion Detection, pages 80–92, 2000.
[76] N. A. Visnevski, V. Krishnamurthy, A. Wang, and S. Haykin. Syntactic modeling and signal processing of multifunction radars: A stochastic context free grammar approach. Proceedings of the IEEE, 95:1000–1025, 2007.
[77] E. Waltz and J. Llinas. Multisensor Data Fusion. Artech House, 1990.
[78] A. Wang and V. Krishnamurthy. Threat estimation of multifunction radars: modeling and statistical signal processing of stochastic context free grammars. IEEE Transactions on Signal Processing, 56:1106–1119, 2008.
[79] A. Wang, V. Krishnamurthy, and B. Balaji. Meta level tracking with multimode space-time adaptive processing of GMTI data. In International Conference on Information Fusion, 2009.

Chapter 2

Syntactic Modeling and Signal Processing of Multifunction Radars: A Stochastic Context Free Grammar Approach1

Statistical pattern recognition has been a major tool used in building Electronic Warfare (EW) systems to analyze radar signals.
Conventional radars have been historically characterized by fixed parameters such as radio frequency, pulse-width, and peak amplitude [54; 53]. For these radar characterizations, parametric models are sufficient for solving signal processing problems such as emitter identification and threat evaluation. [55; 38] discuss template matching of the intercepted radar signal against an EW library for both emitter type and emitter mode identification. Histogram techniques are described in [39] to study the temporal periodicities in radar signals such as pulse repetition intervals. With the advent of modern radars, especially multi-function radars (MFRs), the statistical pattern recognition approaches described above became inadequate. MFRs are radio-frequency sensors that are widely used in modern surveillance and tracking systems, and they have the capability to perform a multitude of different tasks simultaneously. The list of these tasks often includes such activities as search, acquisition, multiple target tracking and weapon guidance [40]. MFRs use electronic beam-steering antennas to perform multiple tasks simultaneously by multiplexing them in time using short time slices [46]. At the same time they have to maintain a low probability of being detected and jammed. Indeed, MFRs are an excellent example of highly complex man-made large-scale dynamical systems. MFRs’ ability to adaptively and actively switch modes and change system parameters greatly limits the applicability of the parametric statistical pattern recognition approaches; the dimensionality of different emitters’ representation spaces may be too large, and the spaces too heavily overlapped, for the statistical approach to be viable. This paper proposes a different approach to radar modeling and radar signal processing, one based on syntactic pattern recognition. The origins of syntactic modeling can be traced to the classic works of Noam Chomsky on formal languages and transformational grammars [14; 17; 16; 15]. The central elements of this work are the concepts of a formal language and its grammar. Languages are typically infinite sets of strings drawn from a finite alphabet of symbols. Grammars, on the other hand, are viewed as finite-dimensional models of languages that completely characterize them. Many different kinds of grammars and languages have been identified and investigated for practical applications. Among them, the Finite-State Grammars (FSG) and the Context-Free Grammars (CFG), as well as their stochastic counterparts, are currently the most widely used classes of grammars. Stochastic Finite-State Grammars (SFSGs), also known as hidden Markov models, have achieved great success in the speech community [18; 35]. They were used in modern tracking systems [9] and in machine vision [1]. On the other hand, Stochastic Context-Free Grammars (SCFGs) are studied in [27] for gesture recognition and the implementation of an online parking lot monitoring task. In [59; 60] they were used in modeling the dynamics of a bursty wireless communications channel. [7; 41] describe syntactic modeling applied to bioinformatics, and [21; 37] apply these models to the study of biological sequence analysis and RNA. Finally, application of syntactic modeling to pattern recognition is covered in depth in [25]. In this paper, based on an anti-aircraft defence radar called Mercury, we construct a Markov-modulated SCFG to model MFRs.

1 A version of this chapter has been published. Visnevski, N. A., Krishnamurthy, V., Wang, A. and Haykin, S. (2007) Syntactic Modeling and Signal Processing of Multifunction Radars: A Stochastic Context Free Grammar Approach. Proceedings of the IEEE. 95:1000-1025.
The more traditional approaches such as hidden Markov and state space models are suitable for target modeling [9; 8] but not for radar modeling, because MFRs are large scale dynamical systems whose scheduling involves planning and preempting, which makes the state space approach difficult. In addition to radar modeling, we also consider statistical radar signal processing. The proposed linguistic model of MFRs is naturally divided into two levels of abstraction: the task scheduling level and the radar control level. We show that the MFRs’ SCFG representation at the task scheduling level is self-embedding and cannot be reduced to a finite state form, and thus signal processing has to be performed in the CFG domain. The MFRs’ SCFG representation at the radar control level, on the other hand, is non-self-embedding; a finite-state model is thus obtainable, and a systematic approach is introduced to convert the SCFG representation to its finite-state counterpart. The advantage of the finite-state radar representation is that it enables direct application of well-known signal processing techniques defined in the finite-state domain. The reason for the two level approach to radar modeling is computational cost. Although SCFGs provide a compact representation for a complex system such as MFRs, they are associated with computationally intensive signal processing algorithms [2; 21; 25; 26]. By contrast, finite-state representations are not nearly as compact (the number of states in the finite-state automaton representing an MFR can be very large [49]), but the associated signal processing techniques are much less computationally demanding (see the discussion on the complexity of syntactic parsing algorithms in [21]). It is therefore advantageous to model MFRs as SCFGs, and to perform signal processing on their finite-state equivalents as much as possible.
Traditionally, MFRs’ signal modes were represented by volumes of parameterized data records known as Electronic Intelligence (ELINT) [54]. The data records are annotated by lines of text explaining when, why and how a signal may change from one mode to another. This makes radar mode estimation and threat evaluation fairly difficult. In [48], SCFG is introduced as a framework to model MFRs’ signals, and it is shown that MFRs’ dynamic behavior can be explicitly described using a finite set of rules corresponding to the production rules of the SCFG. SCFG has several potential advantages: (i) SCFG is a compact formal representation that can form a homogeneous basis for modeling complex system dynamics [2; 25; 26]; it allows model designers to express different aspects of MFR control rules in a single framework [48], and it automates the threat estimation process by encoding human knowledge in the grammar [52; 50]. (ii) The recursive embedding structure of MFRs’ control rules is more naturally modeled in SCFG. As we show later, the Markovian type model has dependencies of variable length, and the growing state space is difficult to handle since the maximum range dependency must be considered. (iii) SCFGs are more efficient in modeling hidden branching processes when compared to stochastic regular grammars or hidden Markov models with the same number of parameters. The predictive power of an SCFG measured in terms of entropy is greater than that of the stochastic regular grammar [30]. SCFG is equivalent to a multi-type Galton-Watson branching process with a finite number of rewrite rules, and its entropy calculation is discussed in [32]. In summary, the main results of the paper are:
1. A careful detailed model of the dynamics of an MFR using formal language production rules.
By modeling the MFR dynamics using a linguistic formalism such as a SCFG, an MFR can be viewed as a discrete event system that “speaks” some known, or partially known, formal language [11]. Observations of radar emissions can be viewed as strings from this language, corrupted by the noise in the observation environment.
2. A formal procedure for the synthesis of stochastic automaton models from the compact syntactic rules of a CFG. Under the condition that the CFG is non-self-embedding, the CFG representation can be converted to its finite state counterpart, where the signal processing is computationally inexpensive.
3. A novel use of Markov-modulated SCFGs to model radar emissions generated by an MFR. The complex embedding structure of the radar signal is captured by the linguistic model, the SCFG, and the MFR’s internal state, its policies of operation, is modeled by a Markov chain. This modeling approach enables the combination of the grammar’s syntactic modeling power with the rich theory of Markov decision processes.
4. Statistical signal processing of SCFGs. The threat evaluation problem is reduced to a state estimation problem of an HMM. The maximum likelihood estimator is derived based on a hybrid of the forward-backward and the inside-outside algorithms. (The inside-outside algorithm is an extension of the HMM’s forward-backward algorithm [6].)
The rest of the paper is organized as follows. Sec. 2.1 provides a self-contained theoretical background of the syntactic modeling methodology. Sec. 2.2 describes the multifunction radar in detail and its role in electronic warfare. Sec. 2.3 and Sec. 2.4 present the threat estimation algorithms and a detailed description of the synthesis procedure of stochastic automaton models from the syntactic rules of a CFG. Finally, Sec. 2.5 concludes the paper.

2.1 Elements of Syntactic Modeling

This section presents important elements from the theory of syntactic modeling, syntactic pattern recognition and syntax analysis.
The aim is to provide the reader with a brief overview of the concepts that will be used in the rest of the paper; a more comprehensive discussion can be found in [25]. We use the definitions and notations common to the theory of formal languages and computational linguistics [2; 26]. We will start by introducing the concept of formal languages. These languages are most accurately defined in set-theoretic terms as collections of strings having a certain predefined structure. In practice, a finite-dimensional model of the language is required, and it should help answer the two fundamental questions of the theory of formal languages:
• Given a language, how can we derive any string from this language? (The problem of string generation.)
• Given a certain string and a language, how can we tell if this string is part of this language? (The problem of string parsing or syntax analysis.)
The finite-dimensional models of languages that help answer these fundamental questions are called grammars. If we focus on the problem of string generation, such grammars are typically called generative grammars. If, on the other hand, we are interested in string parsing, it is customary to refer to the language grammars as transformational grammars.

2.1.0.1 Formal Languages

Let A be an arbitrary set of symbols that we will call an alphabet. In general, an alphabet does not have to be finite, but from the practical standpoint we will assume that A is a finite set of symbols. Using symbols from A, one can construct an infinite number of strings by concatenating them together. We call ε the empty string – a string consisting of no symbols. Let us denote by A+ the infinite set of all finite strings formed by concatenation of symbols from A, and let us denote A∗ = A+ ∪ {ε}. For example, if A = {a, b, c}, then

A+ = {a, b, c, aa, ab, ac, ba, bb, bc, ca, cb, cc, aaa, . . .}   (2.1)
A∗ = {ε, a, b, c, aa, ab, ac, ba, bb, bc, ca, cb, cc, aaa, . . .}   (2.2)

The A+ operation is called the positive (transitive) closure of A, and the A∗ operation is called the Kleene (reflexive and transitive) closure.

Definition The language L defined over an alphabet A is a set of some finite-length strings formed by concatenating symbols from A.

Evidently, L ⊆ A∗, and in particular, ∅, A, and A∗ are also languages.

2.1.0.2 Grammars

The definition of the formal language (Def. 2.1.0.1) is extremely broad and therefore has very limited practical application. A more useful way of defining formal languages is through the use of grammars [14; 17; 16; 15].

Definition A deterministic grammar G is a four-tuple

G = (A, E, Γ, S0)   (2.3)

where: A is the alphabet (the set of terminal symbols of the grammar); E is the set of non-terminal symbols of the grammar; Γ is the finite set of grammatical production rules (syntactic rules); S0 is the starting non-terminal. In general, Γ is a partially defined function of type

Γ : (A ∪ E)∗ → (A ∪ E)∗.   (2.4)

However, as we will see later, certain restrictions applied to the production rules Γ allow us to define some very useful types of grammars. In the rest of this paper, unless specified otherwise, we will write non-terminal symbols as capital letters, and symbols of the alphabet using lower case letters. This follows the default convention of the theory of formal languages. Def. 2.1.0.1 provides a set-theoretic definition of a formal language. Now, using Def. 2.1.0.2 we can redefine the language in terms of its grammar, L = L(G). To illustrate the use of grammars, consider a simple language L = L(G) whose grammar G = (A, E, Γ, S0) is defined as follows:

A = {a, b},  E = {S0, S1},
S0 → aS1 | b
S1 → bS0 | a   (2.5)

These are some of the valid strings in this language, and examples of how they can be derived by repeated application of the production rules of (2.5):
1. S0 ⇒ b
2. S0 ⇒ aS1 ⇒ aa
3. S0 ⇒ aS1 ⇒ abS0 ⇒ abb
4. S0 ⇒ aS1 ⇒ abS0 ⇒ abaS1 ⇒ abaa
5.
S0 ⇒ aS1 ⇒ abS0 ⇒ abaS1 ⇒ ababS0 ⇒ ababb 6. S0 ⇒ aS1 ⇒ abS0 ⇒ abaS1 ⇒ ababS0 ⇒ . . . ⇒ ababab . . . abb 7. S0 ⇒ aS1 ⇒ abS0 ⇒ abaS1 ⇒ ababS0 ⇒ . . . ⇒ ababab . . . abaa This language contains an infinite number of strings that can be of arbitrary length. The strings start with either a or b. If a string starts with b, then it only contains one symbol. Strings terminate with either aa or bb, and consist of a distinct repeating pattern ab. This simple example illustrates the power of the grammatical representation of languages. Very simple grammars can define rather sophisticated languages. 2.1.0.3  Chomsky Hierarchy of Grammars  In Def. 2.1.0.2, the production rules of the grammar are given in a very general form. [16] used the properties of the production rules of grammars to develop a very useful hierarchy that is known in the literature as the Chomsky hierarchy of grammars: • Regular Grammars (RG): Only production rules of the form S → aS or S → a are allowed. This means that the left-hand side of the production must contain one non-terminal only, and the right-hand side could be either one terminal or one terminal followed by one non-terminal. The grammar of the language in the last example of this section is a regular grammar. Regular grammars are sometimes referred to as Finite-State Grammars (FSG). • Context-Free Grammars (CFG): Any production rule of the form S → β is allowed. This means that the left-hand side of the production rule must contain one non-terminal only, whereas the right-hand side can be any string. • Context-Sensitive Grammars (CSG): Production rules of the form α1 Sα2 → α1 β α2 are allowed. Here α1 , α2 ∈ (A ∪ E )∗ , and β = ε . In other words, the allowed transformations of non-terminal S are dependent on its context α1 and α2 .  41  regular context-free context-sensitive unrestricted  Figure 2.1: The Chomsky hierarchy of formal languages.  
Grammar   Production rule structure    Language
FSG       S → aS, S → a                Finite State (Regular) Language (RL)
CFG       S → β                        Context-Free Language (CFL)
CSG       α1 Sα2 → α1 β α2             Context-Sensitive Language (CSL)
UG        α1 Sα2 → γ                   Unrestricted (type-0) Language (UL)

Table 2.1: Deterministic grammars, production rules, and languages.

• Unrestricted Grammars (UG): Any production rules of the form α1 Sα2 → γ are allowed. Here α1 , α2 , γ ∈ (A ∪ E )∗ . The unrestricted grammars are also often referred to as type-0 grammars due to Chomsky [16].

Chomsky also classified languages in terms of the grammars that can be used to define them. Figure 2.1 illustrates this hierarchy of languages. Each inner circle of this diagram is a subset of the outer circle. Thus, Context-Sensitive Language (CSL) is a special (more restricted) form of Unrestricted Language (UL), Context-Free Language (CFL) is a special case of CSL, and Regular Language (RL) is a special case of CFL. Table 2.1 provides a condensed summary of the classes of grammars, their production rule structures, and classes of languages that they define. More detailed treatment of the Chomsky hierarchy is given by [21]. Syntactic modeling of DES (discrete event systems) developed in this paper will make extensive use of FSG and CFG. CSG and UG will not be used in our modeling approach.

2.1.0.4  Regular Languages and Finite State Automata

Figure 2.2: FSA equivalent to the grammar example (2.5). State S0 is the starting state, and T is an accepting state, as indicated by the double circle.

Definition A Finite State Automaton (FSA) Λ is a five-tuple

Λ = (Q, Σ, δ , q0 , F),  (2.6)

where: Q is the set of states of the FSA; Σ is the set of input symbols of the FSA; δ is the transition function of the FSA; q0 is the initial state of the FSA; F is the set of final (accepting) states of the FSA (F ⊂ Q).
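To make Def. 2.1.0.4 concrete, the FSA of Fig. 2.2 can be encoded and run directly. A minimal sketch (illustrative names, not from the thesis), with the transition function δ stored as a dictionary:

```python
# FSA of Fig. 2.2:  Q = {S0, S1, T}, Sigma = {a, b}, q0 = S0, F = {T}.
DELTA = {
    ("S0", "a"): "S1", ("S0", "b"): "T",
    ("S1", "b"): "S0", ("S1", "a"): "T",
}

def accepts(string):
    """Run the automaton on `string`; accept iff it halts in a final state."""
    state = "S0"
    for sym in string:
        state = DELTA.get((state, sym))
        if state is None:                # undefined transition: reject
            return False
    return state == "T"
```

Strings derived earlier, such as b, aa, and abaa, are accepted, while ab (which leaves the automaton in S0) is rejected, matching L(Λ) = L(G).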
FSA were shown to be equivalent to RG and RL (see [4; 2; 26; 42; 43; 44]). In fact, using Def. 2.1.0.2 and Def. 2.1.0.4, we can observe that if Q = E , Σ = A , and q0 = S0 , we can relate δ and Γ in such a way that L(Λ) = L(G). L(Λ) is also called the language accepted by the FSA Λ. The set of final (accepting) states F is the set of states such that any input string from L(Λ) causes Λ to transition into one of these states. An FSA equivalent of the grammar (2.5) is shown in Fig. 2.2.

2.1.0.5  Context-Free Languages and Context-Free Grammars

The next, less restricted member of the Chomsky hierarchy of grammars is the Context-Free Grammar (CFG). Languages that can be accepted by FSA are limited in terms of the strings that they can contain. The most famous example of a language that cannot be accepted by an FSA is the language of palindromes (a palindrome is a string that reads the same way both left-to-right and right-to-left). It was shown to be a CFL [26]. A simple language of palindromes can, for example, be defined by the following set of production rules:

P → bPb | aPa | b | a | ε,  (2.7)

and an example string from this language is bababaaababab. According to Table 2.1, the grammar in (2.7) is a CFG. CFGs are often associated with tree-like graphs instead of FSA, since the dependencies between the elements of the strings of a CFL are nested [4; 2; 3; 26]. Due to this fact, the task of processing the strings from a CFL is a more computationally complex procedure than that of an RL. On the other hand, [5] have shown that CFGs can be more compact descriptions of RLs than RGs. It is often convenient to describe complex finite-state systems in the context-free form, but it is less computationally intensive to perform analysis of these systems using FSA. This fact is at the center of the large scale DES modeling methodology that we are going to develop in the rest of this section. As Fig. 2.1 clearly demonstrates, RLs are a proper subset of the class of CFLs.
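Although no FSA accepts the palindrome language, membership under grammar (2.7) is easy to test by mirroring the productions in a recursion. A small illustrative sketch (assuming the alphabet {a, b}):

```python
def in_palindrome_language(s):
    """Membership in the CFL of (2.7): P -> bPb | aPa | b | a | eps.
    The recursion mirrors the productions directly: peel a matching
    outer pair (aPa / bPb) or bottom out at a, b, or the empty string."""
    if len(s) <= 1:                          # P -> a | b | eps
        return all(c in "ab" for c in s)
    return s[0] == s[-1] and s[0] in "ab" and in_palindrome_language(s[1:-1])
```

The example string bababaaababab from the text is accepted, while a non-palindrome such as ab is rejected. The nested (outside-in) structure of the recursion is exactly the nested dependency that rules out a finite-state recognizer.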
However, given a general CFG, one cannot tell if this grammar describes an RL or a CFL (this task was shown to be undecidable [26]). We will now look at the property of self-embedding of the CFG and see how this property helps in determining the class of the languages described by such a CFG.

2.1.0.6  Non-Self-Embedding Context-Free Grammars

Definition A CFG G = (A , E , Γ, S0 ) is self-embedding if there exists a nonterminal symbol A ∈ E such that a string α Aβ can be derived from it in a finite number of derivation steps, with α , β ≠ ε being any strings of terminal and nonterminal symbols.

For example, the nonterminal symbol P in the palindrome grammar (2.7) is such a self-embedding nonterminal, and the CFG of palindromes is self-embedding.

Definition A CFG G = (A , E , Γ, S0 ) is Non-Self-Embedding (NSE) if there exist no nonterminal symbols for which the condition of Def. 2.1.0.6 can be satisfied.

[15] has demonstrated that if a Context-Free Grammar is Non-Self-Embedding, it generates a Finite-State Language. In Section 2.4.1, we will describe an algorithm to verify the Non-Self-Embedding property of Context-Free Grammars, and show how to obtain Finite-State Automata for these grammars.

2.1.0.7  Stochastic Languages and Stochastic Grammars

A number of practical applications contain certain amounts of uncertainty that are often represented by probabilistic distributions. These factors require extension of the concepts described above into the domain of stochastic languages.

Definition A weighted grammar Gw is a five-tuple

Gw = (A , E , Γ, Pw , S0 )  (2.8)

where: A is the alphabet (the set of terminal symbols of the grammar); E is the set of non-terminal symbols of the grammar; Γ is the finite set of grammatical production rules (syntactic rules); Pw is the set of weighting coefficients defined over the production rules Γ; S0 is the starting non-terminal.
Here is a simple example of a weighted grammar:

S0 → aS1 (weight 9),   S0 → b (weight 1),
S1 → bS0 (weight 1),   S1 → a (weight 9).   (2.9)

This grammar has been obtained from grammar (2.5) by associating with its productions the set of weights Pw = {(9, 1), (1, 9)}. Note that the set of weights Pw does not have to be normalized.

Definition A stochastic grammar Gs is a five-tuple

Gs = (A , E , Γ, Ps , S0 )  (2.10)

where A , E , Γ, and S0 are the same as in Def. 2.1.0.7, and Ps is the set of probability distributions over the set of production rules Γ. Clearly, stochastic grammars are simply a more restricted case of the weighted grammars. Here is a simple example of a stochastic grammar:

S0 → aS1 (prob. 0.9),   S0 → b (prob. 0.1),
S1 → bS0 (prob. 0.1),   S1 → a (prob. 0.9).   (2.11)

This grammar has been obtained from grammar (2.5) by applying to its productions the probability distributions Ps = {(0.9, 0.1), (0.1, 0.9)}. Stochastic and weighted grammars are classified and analyzed on the basis of their underlying characteristic grammars [24; 25]. A characteristic grammar Gc is obtained from the stochastic grammar Gs (weighted grammar Gw ) by removing the probability distribution Ps (set of weights Pw ) from the grammar definition.

Figure 2.3: Derivation procedure for the stochastic grammars. First, a deterministic grammar for the system is constructed. Then, after considerations of possible sources of uncertainties, the deterministic grammar is modified into a characteristic grammar that accommodates for these uncertainties. Finally, a probability distribution is assigned to the characteristic grammar, yielding a stochastic grammar of the system.

If the resulting characteristic grammar is a Finite-State Grammar, the stochastic grammar is called Stochastic Finite-State Grammar (SFSG). If the characteristic grammar is a Context-Free Grammar, the stochastic grammar is referred to as Stochastic Context-Free Grammar (SCFG).
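Because the characteristic grammar of (2.11) is right-linear and unambiguous, each terminal string has at most one derivation, so its probability is simply the product of the probabilities of the rules used. A small sketch (illustrative, not from the thesis):

```python
# Stochastic grammar (2.11): each production carries its probability.
SCFG = {
    "S0": [("aS1", 0.9), ("b", 0.1)],
    "S1": [("bS0", 0.1), ("a", 0.9)],
}

def string_probability(s):
    """P(s) under (2.11).  The grammar is right-linear and unambiguous,
    so we follow the unique derivation and multiply rule probabilities;
    strings with no derivation get probability 0."""
    state, prob, i = "S0", 1.0, 0
    while i < len(s):
        remaining = len(s) - i
        for rhs, p in SCFG[state]:
            if rhs[0] != s[i]:
                continue
            if len(rhs) > 1 and remaining > 1:    # emit terminal, continue derivation
                state, prob, i = rhs[1:], prob * p, i + 1
                break
            if len(rhs) == 1 and remaining == 1:  # terminating production
                return prob * p
        else:
            return 0.0                            # no applicable production
    return 0.0                                    # empty string is not derivable
```

For instance, the derivation S0 ⇒ aS1 ⇒ abS0 ⇒ abaS1 ⇒ abaa has probability 0.9 · 0.1 · 0.9 · 0.9 = 0.0729.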
For example, grammar (2.11) is a SFSG, and grammar (2.5) is its characteristic grammar. Characteristic grammars play important roles in deriving syntactic models of real-life systems. The typical procedure is illustrated in Fig. 2.3. The characteristic grammar is a bridge between the internal deterministic rules of the system, and the stochastic environment in which this system is operating or observed.

2.1.0.8  Stochastic Finite-State Languages, Markov Chains, and Hidden Markov Models

Just as FSA constitute one of the representation forms of FSLs, discrete-state discrete-time Markov chains are naturally considered the equivalent representations of the SFSLs [11]. This representation has been successfully utilized in bioinformatics and computational genomics [7; 21] as well as in natural language and speech processing [18]. A discrete-state discrete-time Markov chain can be defined as a stochastic timed automaton [11]:

Definition A discrete-time Markov chain defined over a discrete state space is a tuple

γ = (A, π ),  (2.12)

where: A is the N × N state transition probability matrix, π is the N × 1 vector of initial state probability distribution, and N is the number of states in the Markov chain.

We will illustrate the relationship between Stochastic Finite-State Grammars and Markov chains by looking at the transition structure of the grammar (2.11). We can construct a Markov chain that will reflect the transitions within Γ of (2.11) as

A = [0 0.9 0.1; 0.1 0 0.9; 0 0 1],   π = [1 0 0]ᵀ,   (2.13)

where A and π are defined with respect to the state ordering {S0 , S1 , T } as shown in Fig. 2.4.

Figure 2.4: Example of a Markov chain for the SFSG (2.11). Note that Markov chains only capture the transition dynamics of the grammar since the terminal symbols of the grammar do not feature in the Markov chain structure.
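The chain (2.13) can be propagated numerically to see the derivation dynamics. A minimal pure-Python sketch (illustrative names), using the state ordering {S0, S1, T}:

```python
# Markov chain (2.13) for the transition structure of grammar (2.11),
# with the state ordering (S0, S1, T); T is the absorbing terminal state.
A = [[0.0, 0.9, 0.1],
     [0.1, 0.0, 0.9],
     [0.0, 0.0, 1.0]]
PI = [1.0, 0.0, 0.0]

def propagate(dist, n=1):
    """Return the state distribution after n steps: dist times A^n."""
    for _ in range(n):
        dist = [sum(dist[i] * A[i][j] for i in range(3)) for j in range(3)]
    return dist

print(propagate(PI, 1))   # [0.0, 0.9, 0.1]
```

One step from π gives [0, 0.9, 0.1], and the mass in the absorbing state T tends to 1 as n grows, reflecting the fact that a derivation of (2.11) terminates with probability one.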
The example above illustrates a strong parallel between Finite-State Automata (FSA) in the case of deterministic grammars, and Markov chains in the case of stochastic ones. However, Markov chains defined by Def. 2.1.0.8 do not accommodate for the alphabet A of the grammar. Therefore, Markov chains can only capture the transition dynamics of the grammar, but do not address the generation and transformation aspects of the SFSGs discussed earlier. HMMs address this issue. Hidden Markov Models (HMMs) [35; 7; 18] are particularly suitable for representing stochastic languages of the finite-state discrete-event systems observed in noisy environments. They separate the uncertainty in the model attributed to the observation process from the uncertainties associated with the system's functionality. Generally speaking, HMMs are Markov chains indirectly observed through a noisy process [11; 22; 23; 36; 35].

Definition A HMM λ is a three-tuple

λ = (A, B, π ),  (2.14)

where: A is the N × N state transition probability matrix of the underlying Markov chain, B is the N × M observation probability matrix that establishes probability distributions of observing certain discrete symbols associated with a certain state of the chain, π is the N × 1 vector of initial state probability distribution of the underlying Markov chain, N is the number of states of the underlying Markov chain, and M is the number of possible discrete observation symbols.

To illustrate how HMMs relate to SFSGs, we would like to revisit the grammar (2.11). The Markov chain for this grammar is defined by (2.13). Now we can extend this chain by bringing in the alphabet A of the grammar (2.11) through the structure of the observation probability matrix B. However, this extension requires a transformation of the structure of the Markov chain in Fig. 2.4. Def.
2.1.0.8 associates the observation probability matrix B with the states of the chain, whereas SFSGs associate the generation of nonterminals with transitions of the state machine. The former case is known in the literature as the Moore machine, and the latter is referred to as the Mealy machine [26]. Therefore, to accommodate for the structural constraints of the HMM, the Markov chain in Fig. 2.4 has to be converted to the Moore machine form as described in detail in [26]. The resulting HMM has the following structure:

A = [0 0.1 0.9 0; 0.9 0 0 0.1; 0 0 1 0; 0 0 0 1],
B = [1 0; 0 1; 1 0; 0 1],   π = [0.9 0 0 0.1]ᵀ,   (2.15)

where A as well as the rows of π and B are defined with respect to the state ordering {S1 , S2 , T1 , T2 } as shown in Fig. 2.5, and the columns of B are defined with respect to the ordering {a, b}.

2.2  Electronic Warfare Application - Electronic Support and MFR

With the above background in syntactic modeling, we are now ready to study MFRs, and devise electronic support algorithms that deal with the ever increasing sophistication of their remote sensing capabilities. Electronic Warfare (EW) can be broadly defined as any military action with the objective of controlling the electromagnetic spectrum [45]. An important aspect of EW is the radar-target interaction. In general, this interaction can be examined from two entirely different points of view - the viewpoint of the radar and the viewpoint of the target. From the radar's point of view, its primary goal is to detect targets and to identify their critical parameters. From the target's point of view, the goal is to protect itself from a radar-equipped threat by collecting radar emissions and evaluating the threat in real time (electronic support). In this paper, the target's viewpoint is the focus, and MFRs are the specific threat considered.

Figure 2.5: Example of a HMM for the SFSG (2.11).
Each state is labeled by two symbols separated by a slash. The first symbol identifies the state of the system, and the second determines the output produced by the system in this state. To accommodate for the terminal symbols of the grammar (2.11) through the use of the observation probability matrix B, the structure of the Markov chain in Fig. 2.4 had to be transformed to the Moore machine. Consequently, the underlying Markov chain of this HMM has a different set of discrete states {S1 , S2 , T1 , T2 }.

The framework of EW considered in this paper consists of three layers: Receiver/Deinterleaver, Pulse train analyzer and Syntactic processor [47]. The layers are depicted in Fig. 2.6 and a brief description is given here: The receiver processes the radar pulses intercepted by the antenna, and outputs a sequence of pulse descriptor words, which is a data structure containing parameters such as carrier frequency, pulse amplitude or pulse width. The deinterleaver processes the pulse descriptor words, groups them according to their possible originating radar emitters and stores them in their corresponding track files. The pulse train analyzer processes the track file, and further groups the pulse descriptor words into radar words. (See Sec. 2.2.1 for definitions.) Finally, the syntactic processor analyzes the syntactic structure of the radar words, estimates the state of the radar system and its threat level, and outputs the results on a pilot instrumentation panel. Because the receiver, deinterleaver and pulse train analyzer have been well studied, the syntactic processor is the focus of this paper. The syntactic processor captures the knowledge of the “language” that MFRs speak. It is a complex system of rules and constraints that allows radar analysts to distinguish a “grammatical” radar signal from an “ungrammatical” one.
In other words, an analogy is drawn between the structural description of the radar signal and the syntax of a language, and the structural description could, therefore, be specified by the establishment of a grammar [26]. As far as EW is concerned, the optimal approach is to collect a corpus of radar samples, and induce the grammar directly without human intervention. However, because of the degree of complexity and the potential lack of data on the MFR signal, the grammatical induction approach is impractical. As a result, in this paper, the grammar is constrained to be a stochastic context free grammar, and its context-free backbone is specified by radar analysts from studying MFRs' signal generation mechanism. Sec. 2.2.1 describes MFRs' system architecture and the building blocks making up the radar signal, and Sec. 2.2.2 discusses the shortcomings of HMM and explains why SCFG is preferred.

Figure 2.6: The Electronic Warfare (EW) framework considered in this paper. The radar signal emitted by the MFR is captured at the EW system on board the target after being corrupted by the stochastic environment. The EW system consists of an antenna, a receiver/deinterleaver, a pulse train analyzer, and a syntactic processor.

2.2.1  MFR Signal Model and Its System Architecture

In order to describe MFRs' system architecture, we begin with the building blocks making up MFRs' signal generation process, and they are defined as follows:

• Radar word: A fixed arrangement of a finite number of pulses that is optimized for extracting particular target information, for example, pulses with a fixed pulse repetition frequency.

• Radar phrase (radar task): A concatenation of a finite number of radar words. Each phrase may be implemented by more than one concatenation of radar words. Examples are search and target acquisition.

• Radar policy: Pre-optimized schemes that allocate resources to radar phrases. An example is rules of engagement or policies of operation.

Fig.
2.7 illustrates how a radar phrase and radar words are related. Fig. 2.7(a) shows two radar words that are represented by the symbols 'a' and 'b', where vertical bars represent radar pulses. Fig. 2.7(b) illustrates a sequence of radar words for a radar phrase, which is constructed by concatenating a and b into the sequence 'abaa'. The generation process of radar words is governed according to MFRs' system architecture as illustrated in Fig. 2.8. (The system architecture does not include multiple target tracking functionalities such as data association; the paper focuses on a single target's self protection and threat estimation, and thus models only the radar signal that a single target can observe.) A MFR consists of three main components: Situation assessment, System manager and Phrase scheduler/Radar controller. The situation assessment module provides feedback of the tactic environment, and the system manager, based on the feedback, selects a
The scheduling and generation of radar words, on the other hand, is dictated by two controllers, phrase scheduler and radar controller, and their corresponding queues, planning queue and command queue, respectively. The reason for having the queues is driven by the need for MFR to be both adaptive and fast [10]. The planning queue stores scheduled radar phrases that are ordered by time and priority, and it allows the scheduling to be modified by phrase scheduler. Due to the system’s finite response time, radar phrases in the planning queue are retrieved sequentially and entered to the command queue where no further planning or adaptation is allowed. Radar controller maps the radar phrases in the command queue to radar words and which are fixed for execution. More specifically, the phrase scheduler models MFRs’ ability to plan ahead its course of action and to pro-actively monitor the feasibility of its scheduled tasks [13]. Such an ability is essential because MFR switches between radar phrases, and conflicts such as execution order and system loading must be resolved ahead of time based on the predicted system performance and the tactic environment. In addition, planning is also necessary if MFR is interfaced with an external device, where the execution of certain phrases needs to meet a fixed time line. Radar controller, on the other hand, models MFR’s ability to convert radar phrases to a multitude of different radar words depending on the tactic environment. Such an arrangement follows the macro/micro architecture as described in Blackman and Popoli [9]; the phrase scheduler determines which phrase the MFR is to perform that best utilize the system resources to achieve the 51  Random Environment  Situation Assessment  Radar Controller  Command Queue  System Manager  Phrase Scheduler  Planning Queue  Figure 2.8: The figure illustrates MFRs’ system architecture. 
The situation assessment module evaluates the tactic environment and provides feedback to the system manager. The system manager, based on the feedback, selects the radar policy in which the phrase scheduler/radar controller will operate. The phrase scheduler initiates and schedules radar phrases in the planning queue, and the phrases fixed for execution are moved to the command queue. The phrases in the command queue are mapped to appropriate radar words by the radar controller and are sent to the MFR for execution.

mission goal, and the radar controller determines how the particular phrase is to be performed. The MFR's operational details that are to be modeled are described here. The phrase scheduler processes the radar phrases in the planning queue sequentially from left to right. (If the queue is empty, an appropriate radar phrase is inserted.) To process different types of radar phrases, the phrase scheduler calls their corresponding control rules; a rule takes the radar phrase being processed as input, and responds by appending appropriate radar phrases to the command queue and/or the planning queue. The selection of the control rules is a function of the radar policy, and is expressed by how probable each rule is to be selected. Similar to the phrase scheduler, the radar controller processes the radar phrases in the command queue sequentially and maps the radar phrases to radar words according to a set of control rules.

2.2.2  Inadequacy of HMM for modeling MFR

The distinguishing features of MFRs compared to conventional radars are their ability to switch between radar tasks, and their use of schedulers to plan ahead the course of action [10]. In order to model such features, as will be shown in the next section, partial production rules of the form i) B → b B and ii) B → A B | B C | b B
The significance of the rules is that since HMM is equivalent to stochastic regular grammar [31] (rules of the form i), and MFRs follow rules that strictly contain regular grammar (rules of the form ii cannot be reduced to i), HMM is not sufficient to model MFRs’ signal. Furthermore, for sources with hidden branching processes (MFRs), stochastic context free grammar is shown to be more efficient than HMM; the estimated SCFG has lower entropies than that of HMM [30]. Remark: The set of production rules presented above is a self-embedding context free grammar and thus its language is not regular, and cannot be represented by a Markov chain [16]. For the rules presented, self-embedding property can be shown by a simple derivation B → A B → A B C. In addition to the self-embedding property derived from the scheduling process, the generation of words by the radar controller poses another problem. For each radar phrase scheduled, a variable number of radar words may be generated. If HMM is applied to study the sequence of radar words, the Markovian dependency may be of variable length. In this case, maximum length dependency needs to be used to define the state space, and the exponential growing state space might be an issue.  2.2.3 A Syntactic Approach to MFR In terms of natural language processing, we model the MFR as a system that “speaks” according to a stochastic context free grammar. Based on the discussion in Sec. 2.2.1, the syntactic approach is suitable in modeling the phrase scheduler because the scheduler operates according to a set of formal rules. On the other hand, the radar controller is suitably modeled because of two reasons: i) each sequence of radar words is semantically ambiguous, i.e., many statistically distinct patterns (radar word sequence) possess the same semantic meanings (radar phrase), and ii) each radar phrase consists of finite number of radar words, and radar words are relatively easy to discriminate. 
A Simple Example of MFR

As an illustrative example showing the correspondence between the grammar and the MFR, consider production rules of the form i) A → a A and ii) A → B A, where A and B are considered as radar phrases in the planning queue and a as a radar phrase in the command queue. The rule A → a A is interpreted as directing the phrase scheduler to append a to the queue list in the command queue, and A in the planning queue. Similarly, A → B A is interpreted as delaying the execution of A in the scheduling queue and inserting B in front of it. Suppose the planning queue contains the radar phrase A; a possible realization of the radar words' generation process is illustrated in Fig. 2.10. (The figure also illustrates the operation of the radar controller; the triangle represents the mapping of the radar phrase to the radar words, which are denoted by y and w.) It can be seen that as long as the command queue phrases appear only to the left of planning queue phrases in the rule, the command queue and the planning queue are well represented.

Figure 2.9: The figure illustrates the derivation process with two different types of production rules. a) The derivation of the rule of regular grammar type. b) The derivation of the rule of context free grammar type.

Figure 2.10: A possible realization of the scheduling process represented by a grammatical derivation process. A and B are radar phrases in the planning queue, a and b are radar phrases in the command queue and w and y are radar words.

2.2.4  A Syntactic Model For a MFR called Mercury

The syntactic modeling technique is discussed in this subsection, and the discussion is based on an anti-aircraft defence radar called Mercury. The radar is classified and its intelligence report is sanitized and provided in Appendix A.
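The simple example above can be simulated directly as queue rewriting. In the sketch below, the rule probabilities and the terminating rules A → a and B → b are hypothetical additions (the example leaves termination implicit); uppercase symbols stand for phrases still in the planning queue, lowercase for phrases already committed to the command queue:

```python
import random

# Scheduler rules from the simple example, with hypothetical probabilities
# and added terminating rules (A -> a, B -> b) so every derivation halts.
SCHED_RULES = {
    "A": [("aA", 0.5), ("BA", 0.2), ("a", 0.3)],
    "B": [("bB", 0.4), ("b", 0.6)],
}

def realize(seed=0):
    """One realization of the scheduling process as leftmost rewriting.
    Uppercase symbols are phrases still in the planning queue; lowercase
    symbols have been committed to the command queue."""
    rng = random.Random(seed)
    form = "A"                       # planning queue initially holds A
    while any(c.isupper() for c in form):
        i = next(k for k, c in enumerate(form) if c.isupper())
        options, weights = zip(*SCHED_RULES[form[i]])
        rhs = rng.choices(options, weights=weights)[0]
        form = form[:i] + rhs + form[i + 1:]
    return form
```

Note that every realization ends with a, produced by the terminating rule applied to the single A, which always remains the rightmost symbol; this mirrors the observation that command queue phrases appear only to the left of planning queue phrases.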
Table 2.2 provides an exhaustive list of all possible Mercury phrases, and associates them with the functional states of the radar. This table was obtained from specifications in the sanitized intelligence report and is central to the grammatical derivations that follow.

Four-Word Search:        [w1 w2 w4 w5], [w2 w4 w5 w1], [w4 w5 w1 w2], [w5 w1 w2 w4]
Three-Word Search:       [w1 w3 w5 w1], [w3 w5 w1 w3], [w5 w1 w3 w5]
Acquisition:             [w1 w1 w1 w1], [w2 w2 w2 w2], [w3 w3 w3 w3], [w4 w4 w4 w4], [w5 w5 w5 w5]
NAT or TM:               [w1 w6 w6 w6], [w2 w6 w6 w6], [w3 w6 w6 w6], [w4 w6 w6 w6], [w5 w6 w6 w6]
Range Resolution:        [w7 w6 w6 w6], [w8 w6 w6 w6], [w9 w6 w6 w6]
Acq., NAT, TM:           [w6 w6 w6 w6]
Track Maintenance (TM):  [w1 w7 w7 w7], [w2 w7 w7 w7], [w3 w7 w7 w7], [w4 w7 w7 w7], [w5 w7 w7 w7], [w6 w7 w7 w7],
                         [w1 w8 w8 w8], [w2 w8 w8 w8], [w3 w8 w8 w8], [w4 w8 w8 w8], [w5 w8 w8 w8], [w6 w8 w8 w8],
                         [w1 w9 w9 w9], [w2 w9 w9 w9], [w3 w9 w9 w9], [w4 w9 w9 w9], [w5 w9 w9 w9], [w6 w9 w9 w9],
                         [w7 w7 w7 w7], [w8 w8 w8 w8], [w9 w9 w9 w9]

Table 2.2: List of all Mercury emitter phrase combinations according to the functional state of the radar.

The stochastic context free grammar modeling the MFR is G = {Np ∪ Nc , Tc , Pp ∪ Pc , S}, where Np and Nc are the sets of nonterminals representing radar phrases in the planning queue and the command queue respectively, Tc is the set of terminals representing radar words, Pp is the set of production rules mapping Np to (Nc ∪ Np )+ , Pc is the set of production rules mapping Nc to Tc+ , and S is the starting symbol. It should be noted that the selection of production rules is probabilistic because the MFR's inner workings cannot be known completely. In this subsection, each component of the MFR as illustrated in Fig. 2.8 will be discussed in turn.

2.2.4.1  Phrase Scheduler

The phrase scheduler models the MFR's ability to plan and to preempt radar phrases based on the radar command and the dynamic tactic environment. Its operational rules for the scheduling and rescheduling of phrases are modeled by the production rules Pp , and it is found that Pp could
Its operational rules for the scheduling and rescheduling of phrases are modeled by the production rule Pp , and it is found that Pp could 55  be constructed from a small set of basic rules. Suppose Np = {A, B,C} and Nc = {a, b, c}, The basic control rules that are available to the phrase scheduler are listed below. Markov Adaptive Terminating  B → bB|bC B → AB | BC B → b  The interpretation of the rules follows the example given at the end of the previous subsection. A rule is Markov if it sent a radar phrase to the command queue, and re-scheduled either the same or different radar phrase in the planning queue. A rule is Adaptive if it either preempted a radar phrase for another radar phrase or if it scheduled a radar phrase ahead of time in the radar’s time line after the current phrase. A rule is Terminating if it sent a radar phrase to the command queue without scheduling any new phrases. The significance of the Markov rule is obvious, as it represents the dynamics of finite state automaton. A simple example of the Markov rule is illustrated in Fig. 2.11 based on Mercury’s functional states. According to the specifications in Appendix A, relative to each individual target, the Mercury emitter can be in one of the seven functional states – Search, Acquisition, Non-Adaptive Track (NAT), three stages of Range Resolution, and Track Maintenance (TM). The transitions between these functional states can be captured by the state machine illustrated in Fig. 2.11. The Mercury’s state machine is generalized by including the adaptive and terminating rules. The inclusion of the adaptive rule models MFRs’ ability to reschedule radar phrases when the system loading or the tactic environment changes. The two adaptive rules model after the MFRs’ ability to i) Preempt and ii) Plan the radar phrases. The preempt ability is demonstrated by the rule B → A B where the radar phrase B is preempted when a higher priority task A enters the queue. 
The ability to plan, on the other hand, is captured in the rule B → B C, where the phrase C is scheduled ahead of time if its predicted performance exceeds a threshold. Finally, the terminating rule reflects the fact that the queues have finite length, and the grammatical derivation process must terminate and yield a terminal string of finite length. All the control rules, except the adaptive rule, can be applied to any radar phrase available. The adaptive rule schedules phrases ahead of time and thus requires a reasonable prediction of the target's kinematics; it would not be applicable to phrases where such a prediction is lacking. Applying the rules to Mercury's radar phrases, the production rules Pp can be constructed; they are listed in Fig. 2.12.

2.2.4.2  Radar Controller and the Stochastic Channel

In this section, we develop deterministic, characteristic, and stochastic phrase structure grammars of Mercury's radar controller as described in Appendix A. The grammar is derived as a word-level syntactic model of the emitter. We consider how the dynamics of the radar words that make up one of the individual vertical slots in Fig. A.1 capture internal processes occurring
Using the Mercury specification of Appendix A, we can define the alphabet as:

A = {w1, w2, w3, w4, w5, w6, w7, w8, w9}  (2.16)

where w1, . . . , w9 are the words of the Mercury emitter. At every functional state, the radar emits a phrase consisting of four words drawn from Table 2.2. Some phrases are unique and directly identify the functional state of the radar (e.g. [w1 w2 w4 w5] can only be encountered during search operations). Other phrases are characteristic of several radar states (e.g. [w6 w6 w6 w6] can be utilized in Acquisition, Non-Adaptive Track (NAT), and Track Maintenance (TM)). These phrases form strings in the radar language that we are interested in modeling syntactically.

<S> → <FourWSearch> | <ThreeWSearch> | <ACQ> | <NAT> | <RR1> | <RR2> | <RR3> | <TM>
<ThreeWSearch> → <ThreeWSearchPhrase><ThreeWSearch> | <ThreeWSearchPhrase><ACQ> | <ThreeWSearchPhrase>
<FourWSearch> → <FourWSearchPhrase><FourWSearch> | <FourWSearchPhrase><ACQ> | <FourWSearchPhrase>
<ACQ> → <AcqPhrase><ACQ> | <AcqPhrase><NAT> | <AcqPhrase><ThreeWSearch> | <AcqPhrase><FourWSearch> | <AcqPhrase>
<NAT> → <NATPhrase><NAT> | <NATPhrase> | <NATPhrase><RR1> | <NATPhrase><ThreeWSearch> | <NATPhrase><FourWSearch>
<RR1> → <RR1Phrase><RR1> | <RR1Phrase> | <RR1Phrase><RR2> | <RR1Phrase><ThreeWSearch> | <RR1Phrase><FourWSearch>
<RR2> → <RR2Phrase><RR2> | <RR2Phrase> | <RR2Phrase><RR3> | <RR2Phrase><ThreeWSearch> | <RR2Phrase><FourWSearch>
<RR3> → <RR3Phrase><RR3> | <RR3Phrase> | <RR3Phrase><TM> | <RR3Phrase><ThreeWSearch> | <RR3Phrase><FourWSearch>
<TM> → <TMPhrase><TM> | <TMPhrase> | <RR3><TM> | <TMPhrase><ThreeWSearch> | <TMPhrase><FourWSearch>

Figure 2.12: Production rules of Mercury's phrase scheduler.
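The recursive structure of these rules is easy to exercise in code. The sketch below encodes a small illustrative fragment (not the full rule set of Fig. 2.12; the nonterminal and phrase names are invented) as a Python dict and draws a random derivation, forcing the shortest alternative past a depth cap so the derivation terminates:

```python
import random

# Hypothetical fragment of a phrase-level grammar, written as a dict:
# nonterminal -> list of alternative right-hand sides.
RULES = {
    "S": [["SEARCH"], ["ACQ"]],
    "SEARCH": [["searchPhrase", "SEARCH"], ["searchPhrase", "ACQ"], ["searchPhrase"]],
    "ACQ": [["acqPhrase", "ACQ"], ["acqPhrase"]],
}

def derive(symbol, rng, depth=0, max_depth=20):
    """Expand `symbol` by repeatedly applying a randomly chosen rule."""
    if symbol not in RULES:          # terminal: emit as-is
        return [symbol]
    alts = RULES[symbol]
    if depth >= max_depth:           # force termination past the depth cap
        alts = [min(alts, key=len)]
    out = []
    for s in rng.choice(alts):
        out.extend(derive(s, rng, depth + 1, max_depth))
    return out

rng = random.Random(0)
sentence = derive("S", rng)          # a random string of phrase terminals
```

The depth cap plays the role of the terminating rules: without it, the self-referential alternatives could recurse indefinitely.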
To complete the derivation of the grammar, we must define the rules for the <XPhrase> nonterminals, where X stands for the corresponding name of the emitter state in which the phrase is emitted. Using data from Table 2.2, we define the triplets:

T6 → w6 w6 w6,  T7 → w7 w7 w7,  T8 → w8 w8 w8,  T9 → w9 w9 w9

and the blocks of four words:

Q1 → w1 w1 w1 w1,  Q2 → w2 w2 w2 w2,  Q3 → w3 w3 w3 w3,  Q4 → w4 w4 w4 w4,  Q5 → w5 w5 w5 w5,  Q6 → w6 w6 w6 w6,  Q7 → w7 w7 w7 w7,  Q8 → w8 w8 w8 w8,  Q9 → w9 w9 w9 w9

The <ThreeWSearchPhrase> and <FourWSearchPhrase> rules are:

<FourWSearchPhrase> → w1 w2 w4 w5 | w2 w4 w5 w1 | w4 w5 w1 w2 | w5 w1 w2 w4
<ThreeWSearchPhrase> → w1 w3 w5 w1 | w3 w5 w1 w3 | w5 w1 w3 w5

The <AcqPhrase> rules are:

<AcqPhrase> → Q1 | Q2 | Q3 | Q4 | Q5 | Q6

The <NATPhrase> rules are:

<NATPhrase> → S1 T6 | Q6
S1 → w1 | w2 | w3 | w4 | w5

The Range Resolution rules are:

<RR1Phrase> → w7 T6
<RR2Phrase> → w8 T6
<RR3Phrase> → w9 T6

Finally, the Track Maintenance rules are:

<TMPhrase> → <FourWTrack> | <ThreeWTrack>
<FourWTrack> → Q6 | Q7 | Q8 | Q9
<ThreeWTrack> → S1 T6 | S2 T7 | S2 T8 | S2 T9
S2 → S1 | w6
S1 → w1 | w2 | w3 | w4 | w5

According to the Chomsky hierarchy discussed in Section 2.1, this grammar is a Context-Free Grammar.

Characteristic grammar The characteristic grammar of the Mercury emitter must extend the deterministic grammar to accommodate the possible uncertainties of the real-life environment. These uncertainties are due to errors in the reception and identification of the radar words. Conceptually, this process can be described by the model of a stochastic erasure channel with propagation errors. To accommodate the channel impairment model, we have to make two modifications to the deterministic grammar of the Mercury emitter.
First, the alphabet (2.16) has to be expanded to include the ∅ character – the character indicating that no reliable radar signal detection was possible:

Ac = {∅} ∪ A = {∅, w1, w2, w3, w4, w5, w6, w7, w8, w9}.  (2.17)

Second, we introduce an additional level of indirection into the grammar by adding nine new nonterminals W1, . . . , W9 and nine new production rules:

Wi → ∅ | w1 | w2 | w3 | w4 | w5 | w6 | w7 | w8 | w9,  (2.18)

where i = 1, . . . , 9. The reason for this modification will become apparent when we associate probabilities with the production rules of the grammar.

Stochastic grammar The channel impairment model has the following transition probability matrix:

Po = [p1 p2 p3 p4 p5 p6 p7 p8 p9]ᵀ  (2.19)

where pi = [p_{i,j}], 0 ≤ j ≤ 9, and p_{i,j} is the probability that the transmitted radar word i is inferred by the pulse train analysis layer of the EW receiver as radar word j from the noisy and corrupted observations. The stochastic radar grammar can be obtained from the characteristic grammar by associating the probability vectors of (2.19) with the corresponding productions of (2.18):

Wi → ∅ | w1 | w2 | w3 | w4 | w5 | w6 | w7 | w8 | w9, with probability vector pi,  (2.20)

where i = 1, . . . , 9. The complete stochastic grammar of the Mercury emitter is shown in Fig. 2.13. Strictly speaking, this grammar should be called a weighted grammar rather than a stochastic one. As shown by [25], a stochastic CFG must satisfy the limiting stochastic consistency criterion. However, [25] demonstrates that useful syntactic pattern recognition techniques apply equally well to both stochastic and weighted CFGs. Therefore, we are not concerned with satisfying the stochastic consistency criterion at this point.
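The erasure channel of (2.17)–(2.19) can be simulated directly. The sketch below builds a toy 9×10 confusion matrix Po (the erasure and confusion probabilities are invented, not Mercury's) and passes a four-word search phrase through it:

```python
import random

WORDS = [f"w{i}" for i in range(1, 10)]
SYMBOLS = ["∅"] + WORDS  # observation alphabet A_c of (2.17)

def make_channel(p_erase=0.1, p_confuse=0.05):
    """Toy Po: row i is the distribution over observed symbols
    (erasure first, then w1..w9) when word i is transmitted."""
    rows = []
    for i in range(9):
        row = [p_erase] + [p_confuse] * 9
        row[1 + i] = 1.0 - p_erase - 8 * p_confuse  # mass on the correct word
        rows.append(row)
    return rows

def transmit(word, Po, rng):
    """Sample the receiver's inferred symbol for one transmitted word."""
    i = WORDS.index(word)
    return rng.choices(SYMBOLS, weights=Po[i], k=1)[0]

rng = random.Random(1)
Po = make_channel()
obs = [transmit(w, Po, rng) for w in ["w1", "w2", "w4", "w5"]]
```

Each row of Po sums to one, matching the interpretation of pi as a probability vector over the erasure and the nine words.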
<AcqPhrase> → Q1 | Q2 | Q3 | Q4 | Q5 | Q6    (prob. 1)
<NATPhrase> → S1 T6 | Q6    (prob. 1)
<RR1Phrase> → W7 T6    (prob. 1)
<RR2Phrase> → W8 T6    (prob. 1)
<RR3Phrase> → W9 T6    (prob. 1)
<TMPhrase> → <FourWTrack> | <ThreeWTrack>    (prob. 1)
<FourWSearch> → W1 W2 W4 W5 | W2 W4 W5 W1 | W4 W5 W1 W2 | W5 W1 W2 W4    (prob. 1)
<ThreeWSearch> → W1 W3 W5 W1 | W3 W5 W1 W3 | W5 W1 W3 W5    (prob. 1)
<FourWTrack> → Q6 | Q7 | Q8 | Q9    (prob. 1)
<ThreeWTrack> → S1 T6 | S2 T7 | S2 T8 | S2 T9    (prob. 1)
S2 → S1 | W6    (prob. 1)
S1 → W1 | W2 | W3 | W4 | W5    (prob. 1)
T6 → W6 W6 W6,  T7 → W7 W7 W7,  T8 → W8 W8 W8,  T9 → W9 W9 W9    (prob. 1)
Qi → Wi Wi Wi Wi    (prob. 1)
Wi → ∅ | w1 | w2 | w3 | w4 | w5 | w6 | w7 | w8 | w9    (prob. vector pi),  i = 1, . . . , 9

Figure 2.13: Weighted grammar of the Mercury emitter. This grammar, like its deterministic counterpart, is a CFG.

2.2.5 MFR and System Manager - Markov Modulated SCFG

In the previous section, the phrase scheduler and the radar controller were modeled by constructing the context-free backbone of the MFR grammar. The third component of the MFR as described in Sec. 2.2.1 is the system manager. The operation of the system manager is to decide on the optimal policy of operation for each time period, and this decision is modeled by assigning production rule probabilities to the context-free backbone. The evolution of the policies of operation is driven by the interaction between the targets and the radar, and the production rules' probabilities conveniently represent the deployed resource allocation scheme. The state space of the MFR is constructed based on different tactical scenarios. Let k = 0, 1, . . . denote discrete time. The policy of operation, xk, is modeled as an M-state Markov chain. Define the transition probability matrix as A = [a_{ji}]_{M×M}, where a_{ji} = P(xk = ei | xk−1 = ej) for i, j ∈ {1, 2, . . . , M}. For example, depending on the loading condition of the MFR, two states may be defined according to the amount of resources allocated to the MFR's target searching and target acquisition functions.
More specifically, one state may represent the scenario where the number of detected targets is low, and thus a higher proportion of radar resources is allocated to search functions. The other state may represent the scenario where the number of detected targets is high, and resources are allocated to target acquisition and tracking functions. In each state, the MFR "speaks" a different "language", defined by its state-dependent grammar. Denote the grammar at state ei as Gi = {Np ∪ Nc, Tc, Pp^i ∪ Pc^i, S}; the grammars' context-free backbone is identical for all i, and only the probability distributions defined over their production rules differ. Each state of the MFR is characterized by its policy of operation, which determines the resource allocation to the targets. Obviously, the more resources allocated to a target, the higher the threat the MFR poses to that target. From this perspective, threat estimation of the MFR is reduced to a state estimation problem. One practical issue is that the signal generated by radar systems has finite length, and this finiteness constraint must be satisfied by the SCFG if the model is to be stable. We discuss this point by first defining the stochastic mean matrix.

Definition Let A, B ∈ N. The stochastic mean matrix MN is a |N| × |N| square matrix whose (A, B)th entry is the expected number of variables B resulting from rewriting A:

MN(A, B) = ∑_{η ∈ (N∪T)* s.t. (A→η) ∈ P} P(A → η) n(B; η)

where P(A → η) is the probability of applying the production rule A → η, and n(B; η) is the number of instances of B in η [12]. The finiteness constraint is satisfied if the grammar in each state satisfies the following theorem.

Theorem If the spectral radius of MN is less than one, the generation process of the stochastic context free grammar terminates, and the derived sentence is finite.

Proof The proof can be found in [12].
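The termination test above is straightforward to mechanize. The following sketch computes the stochastic mean matrix MN and its spectral radius for a toy two-nonterminal SCFG (the grammar and its probabilities are illustrative only):

```python
import numpy as np

# Toy SCFG over nonterminals {S, A}:
#   S -> A S (0.4) | A (0.6);   A -> a (1.0)
# Each rule is stored as (probability, counts of nonterminals on the RHS).
nonterms = ["S", "A"]
rules = {
    "S": [(0.4, {"S": 1, "A": 1}), (0.6, {"A": 1})],
    "A": [(1.0, {})],
}

def mean_matrix(nonterms, rules):
    """Stochastic mean matrix MN: entry (A, B) is the expected number of
    B's produced by one rewrite of A."""
    n = len(nonterms)
    M = np.zeros((n, n))
    for i, X in enumerate(nonterms):
        for p, counts in rules[X]:
            for j, Y in enumerate(nonterms):
                M[i, j] += p * counts.get(Y, 0)
    return M

M = mean_matrix(nonterms, rules)
rho = max(abs(np.linalg.eigvals(M)))  # spectral radius
terminates = rho < 1.0                # finiteness condition of the theorem
```

Here the spectral radius is 0.4, so a derivation started from S is finite with probability one, consistent with the theorem.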
2.3 Signal Processing in CFG Domain

2.3.1 Overview of MFR Signal Processing at Two Layers of Abstractions

In the previous section, we considered the representation problem, where the MFR is specified as a Markov modulated SCFG whose production rules are derived from the radar words' syntactic structure.

Figure 2.14: Signal processing in two levels of abstractions, the word level and the phrase level.

In this and the next section, we deal with the signal processing of the MFR signal and present algorithms for state and parameter estimation. The signal processing problem is illustrated in Fig. 2.14, where it is decomposed into two layers of abstraction: the radar controller layer and the phrase scheduler. The higher layer of processing is discussed in Sec. 2.3, where state and parameter estimation are both processed in the CFG framework. Based on the estimated radar phrases passed from the radar controller layer (a hard decision is applied in the passing of radar phrases), the MFR's policies of operation are estimated by a hybrid of the Viterbi and the inside algorithms. In addition, a maximum likelihood estimator of the unknown system parameters is derived based on the Expectation Maximization algorithm. In Sec. 2.4, a principled approach is discussed for the signal processing of non-self-embedding CFGs. A synthesis procedure that converts a non-self-embedding CFG to its finite state counterpart in polynomial time is introduced. The radar controller will be shown to be non-self-embedding, and its finite state representation will be derived. Once the radar controller is in its finite state form, its signal processing can be performed with a finite state automaton. Following the state space notation introduced in the previous section, let x0:n = (x0 , x1 , . . .
, xn) be the (unknown) state sequence, and γ1:n = (γ1, γ2, . . . , γn) be the corresponding intercepted radar signal stored in the track file. Each γk = (w1, w2, . . . , wmk) is a string of concatenated terminal symbols (radar words), and mk is the length of γk. In order to facilitate the discussion of the estimation algorithms, it is convenient to introduce the following variables:

• Forward variable: fi(k) = P(γ1, γ2, . . . , γk, xk = ei)
• Backward variable: bi(k) = P(γk+1, γk+2, . . . , γn | xk = ei)
• Inside variable: β_j^i(k, p, q) = P(w_pq | A^j_pq, xk = ei)
• Outside variable: α_j^i(k, p, q) = P(w_1(p−1), A^j_pq, w_(q+1)m | xk = ei)

where w_pq is the subsequence of terminals from the pth to the qth position of γk, and A^j_pq is the nonterminal A^j ∈ Np which derives w_pq, i.e., A^j ⇒* w_pq.

Figure 2.15: Inside and outside probabilities in SCFG.

2.3.2 Bayesian Estimation of MFR's State via Viterbi and Inside Algorithms

The estimator of the MFR's state at time k is

x̂k = arg max_i P(xk = ei | γ1:n),

which can be computed using the Viterbi algorithm. Define

δi(k) = max_{x0, x1, . . . , xk−1} P(x0, x1, . . . , xk = i, γ1, γ2, . . . , γk);

the Viterbi algorithm computes the best state sequence inductively as follows:
An efficient way to calculate the probability is by the inside algorithm, a dynamic programming algorithm that inductively calculates the probability. The inside algorithm computes the probability, oi (γk ), inductively as follows: 1. Initialization: β j (k, k) = P(A j → wk |Gi ). 2. Induction:  q−1  β j (p, q) = ∑ ∑ P(A j → Ar As )βr (p, d)βs (d + 1, q), r,s d=p  for ∀ j, 1 ≤ p < q ≤ mk . 3. Termination: oi (γk ) = β1 (1, mk ). Running both the Viterbi and the inside algorithms, the posterior distribution of the states given the observation could be computed.  2.3.3 MFR Parameter Estimation using EM Algorithm The Expectation Maximization (EM) algorithm is a widely used iterative numerical algorithm for computing maximum likelihood parameter estimates of partially observed models. Suppose we have observations Γn = {γ1 , . . . , γn } available, where n is a fixed positive integer. Let 65  {Pφ , φ ∈ Φ} be a family of probability measures on (Ω, F ) all absolutely continuous with respect to a fixed probability measure Po . The likelihood function for computing an estimate of the parameter φ based on the information available in Γn is Ln (φ ) = Eo [  dPφ |Γn ], dPo  and the maximum likelihood estimate (MLE) is defined by  φˆ ∈ arg max Ln (φ ). φ ∈Φ  The EM algorithm is an iterative numerical method for computing the MLE. Let φˆo be the initial parameter estimate. The EM algorithm generates a sequence of parameter estimates {φj }, j ∈ Z+ , as follows: Each iteration of the EM algorithm consists of two steps: Step 1. (Expectation-step) Set φ˜ = φˆ j and compute Q(·, φ˜ ), where dPφ |Γn ]. Q(φ , φ˜ ) = Eφ˜ [log dPφ˜ Step 2. (Maximization-step) Find  φˆ j+1 ∈ arg max Q(φ , φ j ). φ ∈Φ  Using Jensen’s inequality, it can be shown (see Theorem 1 in [19]) that sequence of model estimates {φˆ j }, j ∈ Z+ from the EM algorithm are such that the sequence of likelihoods {Ln (φˆ j )}, j ∈ Z+ is monotonically increasing with equality if and only if φˆ j+1 = φˆ j . In Sec. 
2.3.2, the MFR's state estimation problem was discussed. However, the algorithm introduced assumes complete knowledge of the system parameters, i.e., the Markov chain's transition matrix and the SCFG's production rule probabilities. In reality, such parameters are often unknown. In this subsection, the EM algorithm is applied to a batch of noisy radar signal in the track file, and the system parameters are estimated iteratively. In EM terminology, the intercepted radar signal (radar words), γ1:n, is the incomplete observation sequence, and we have complete data if it is augmented with {x0:n, C1:n}. Here x0:n is the state sequence of the MFR system, and C1:n = (C1, . . . , Cn) records the number of times each production rule is used to derive γ1:n; in particular, Ck = (C1(A → η; γk), C2(A → η; γk), . . . , CM(A → η; γk)) for k = 1, 2, . . . , n, where Ci(A → η; γk), i = 1, 2, . . . , M, is the number of times grammar Gi applies the production rule A → η in deriving γk. Denote the parameters of interest as Φ = {a_{ji}, P1(A → η), P2(A → η), . . . , PM(A → η)}, where Pi(A → η) is the set of probabilities of all the production rules defined in grammar i. The complete-data likelihood is written as

Ln(φ) = ∏_{k=1}^{n} P(γk, Ck | xk, φ) P(xk | xk−1, φ) P(x0 | φ).

In order to facilitate the discussion of the EM algorithm, the following two variables are introduced:

χi(k) = P(xk = ei | γ1:n) = fi(k) bi(k) / ∑_{i=1}^{3} fi(k) bi(k)

ξ_{ji}(k) = P(xk = ej, xk+1 = ei | γ1:n) = fj(k) a_{ji} oi(γk+1) bi(k + 1) / ∑_{j=1}^{3} ∑_{i=1}^{3} fj(k) a_{ji} oi(γk+1) bi(k + 1)

The Expectation step of the EM algorithm yields the following equation:

Eφ̃(log Ln(φ)) = ∑_{k=1}^{n} ∑_{xk} ∑_{A→η} Eφ̃(C_{xk}(A → η; γk)) log P_{xk}(A → η) χ_{xk}(k)
  + ∑_{k=1}^{n} ∑_{xk} ∑_{xk−1} log(a_{xk−1 xk}) ξ_{xk−1 xk}(k − 1)
  + ∑_{k=1}^{n} ∑_{x0} log P(x0) χ_{x0}(k)

where Eφ̃(C_{xk}(A → η; γk)) can be computed using the inside and outside variables [31]. The Maximization step of the EM algorithm can be computed by applying Lagrange multipliers.
Since the parameters we wish to optimize are separated into three independent terms in the sum, we can optimize them term by term. The estimates of the probabilities of the production rules are derived from the first term, and the updating equation is

P_{xk}(A → η) = ∑_{k=1}^{n} Eφ̃(C_{xk}(A → η; γk)) χ_{xk}(k) / ∑_{η} ∑_{k=1}^{n} Eφ̃(C_{xk}(A → η; γk)) χ_{xk}(k).

Similarly, the updating equation of the transition matrix is

a_{ji} = ∑_{k=1}^{n−1} ξ_{ji}(k) / ∑_{k=1}^{n−1} χ_j(k).

Under the conditions in [57], iterative computation of the expectation and maximization steps above produces a sequence of parameter estimates with monotonically nondecreasing likelihood. For details of the numerical examples, the parameterization of the Markov chain's transition matrix by a logistic model, and the study of the predictive power (entropy) of SCFGs, please see [51].

2.4 Signal Processing in Finite-State Domain

In this section, we deal with the lower layer of signal processing as described in Sec. 2.3.1. Before we discuss finite-state modeling and signal processing of syntactic MFR models, we need to provide some additional definitions.

2.4.0.1 Context-Free Grammars and Production Graphs

The property of self-embedding of CFGs introduced in Section 2.1 is not easy to determine through a simple visual inspection of grammatical production rules. More precise and formal techniques of self-embedding analysis rely on the concept of CFG production graphs.

Definition A production graph P(G) for a CFG G = (A, E, Γ, S0) is a directed graph whose vertices correspond to the nonterminal symbols from E, and there exists an edge from vertex A to vertex B if and only if there is a production in Γ such that A → αBβ.
Definition A labeled production graph Pl(G) for a Context-Free Grammar (CFG) G = (A, E, Γ, S0) is a production graph P(G) with the set of labels lab(Γ) defined over the set of edges of P(G) in the following way:

lab(A → B) =
  l if for every A → αBβ ∈ Γ, α ≠ ε, β = ε,
  r if for every A → αBβ ∈ Γ, α = ε, β ≠ ε,
  b if for every A → αBβ ∈ Γ, α ≠ ε, β ≠ ε,
  u if for every A → αBβ ∈ Γ, α = ε, β = ε.  (2.21)

Note that the production graphs of the two definitions above are not related to the Finite-State Automata or Finite-State Languages described earlier. They are simply useful graphical representations of CFGs.

Figure 2.16: Production graph for the grammar in (2.22).

Let us consider an example grammar (reproduced from [20] with modifications):

A = {a, b, c, d},  E = {S, A, B, C, D, E, F},
Γ = { S → DA,  A → bEaB,  B → aE | S,  C → bD,  D → daC | a,  E → D | Cc | aF | Fc,  F → bd }.  (2.22)

The labeled production graph for this grammar is illustrated in Fig. 2.16.

Definition A transition matrix M(G) for the labeled production graph Pl(G) of a CFG G = (A, E, Γ, S0) is an N × N matrix whose dimensions are equal to the number of nonterminal symbols in E (the number of vertices in the production graph), and whose elements are defined as follows:

m_{i,j}(G) = 0 if Ai → αAjβ ∉ Γ;  lab(Ai → Aj) if Ai → αAjβ ∈ Γ.  (2.23)

The transition matrix of the labeled production graph in Fig. 2.16 with respect to the vertex ordering {S, A, B, C, D, E, F} has the following structure:

M(G) =
[ 0  l  0  0  r  0  0 ]
[ 0  0  l  0  0  b  0 ]
[ u  0  0  0  0  l  0 ]
[ 0  0  0  0  l  0  0 ]
[ 0  0  0  l  0  0  0 ]
[ 0  0  0  r  u  0  b ]
[ 0  0  0  0  0  0  0 ]   (2.24)

In Section 2.4.1, we will use production graphs and transition matrices in the analysis of the self-embedding property of CFGs.
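Constructing the labeled production graph is mechanical. The sketch below builds the transition matrix M(G) of (2.23) for the example grammar (2.22); following the treatment of the edge E → F in (2.24), occurrences of a nonterminal with different context shapes are merged into the label 'b' (this merging convention is an assumption consistent with the printed matrix):

```python
# Rules of grammar (2.22): nonterminal -> list of right-hand sides given
# as symbol tuples; lowercase symbols are terminals.
RULES = {
    "S": [("D", "A")],
    "A": [("b", "E", "a", "B")],
    "B": [("a", "E"), ("S",)],
    "C": [("b", "D")],
    "D": [("d", "a", "C"), ("a",)],
    "E": [("D",), ("C", "c"), ("a", "F"), ("F", "c")],
    "F": [("b", "d")],
}
NT = list(RULES)

def label(alpha, beta):
    """Edge label from the left/right context of one occurrence of B."""
    if alpha and beta: return "b"
    if alpha: return "l"   # nonempty left context only
    if beta: return "r"    # nonempty right context only
    return "u"             # unit occurrence

def combine(x, y):
    """Merge labels from several occurrences; mixed shapes give 'b'."""
    if x == "0": return y
    return x if x == y else "b"

M = {(i, j): "0" for i in NT for j in NT}
for lhs, rhss in RULES.items():
    for rhs in rhss:
        for k, sym in enumerate(rhs):
            if sym in RULES:  # a nonterminal occurrence -> an edge lhs -> sym
                M[(lhs, sym)] = combine(M[(lhs, sym)], label(rhs[:k], rhs[k + 1:]))
```

For instance, M[("E", "F")] comes out as 'b' because F occurs both as aF and as Fc.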
2.4.1 CFG-based Finite-State Model Synthesis

This subsection is devoted to the development of a procedure that automatically synthesizes a finite-state model of a DES from its CFG model. We introduce a theoretical framework for determining whether a specified CFG of the system actually represents an FSL, and provide an automated polynomial-time algorithm for generating the corresponding finite-state automaton. This synthesis procedure consists of four basic steps:

1. Test of self-embedding. A CFG that is determined to be Non-Self-Embedding (NSE) describes an FSL (see Section 2.1). Therefore, a finite-state automaton can be synthesized from this grammar.
2. Grammatical decomposition. The NSE CFG is broken down into a set of simpler FSGs.
3. Component synthesis. Once the grammar has been decomposed into a set of simpler grammars, a finite-state automaton can be synthesized for every one of these FSGs.
4. Composition. Finally, the components from the previous step are combined together to form a single Finite-State Automaton that is equivalent to the original NSE CFG of the MFR.

This procedure is based on combined results published in [5; 33].

2.4.1.1 Verification of Non-Self-Embedding

As mentioned earlier, if a CFG of a system is in the NSE form, the CFG has an equivalent finite-state representation. However, given an arbitrarily complex CFG, it is not possible to verify the NSE property of the grammar by simple visual inspection. In this section, we present a formal verification procedure for the NSE property of an arbitrary CFG. The procedure is based on the one described in [5], but it has been modified to suit the needs of the DES grammars. Let us start by defining the concept of a semi-ring [29]:

Definition A semi-ring is a set S together with addition "+" and multiplication "×" operations defined over the elements of this set in such a way that they satisfy the following properties:

1.
additive associativity: (∀e, g, f ∈ S) (e + g) + f = e + (g + f),
2. additive commutativity: (∀e, g ∈ S) e + g = g + e,
3. multiplicative associativity: (∀e, g, f ∈ S) (e × g) × f = e × (g × f), and
4. left and right distributivity: (∀e, g, f ∈ S) e × (g + f) = (e × g) + (e × f) and (e + g) × f = (e × f) + (g × f).

Let us now define a semi-ring over the set of labels of production graphs of CFGs. This set of labels is introduced by the two definitions of labeled production graphs above. The sum and product operations of this semi-ring are listed in Table 2.3.

Table 2.3: Semi-ring operations of sum and product.
Sum: l + l = l, r + r = r, x + 0 = x, u + x = x for x ≠ 0, and all remaining sums of distinct labels from {l, r, b} give b.
Product: x × 0 = 0, u × x = x × u = x, l × l = l, r × r = r, and all remaining products of labels from {l, r, b} give b.

If M(G) is a transition matrix of the CFG G (see the definition above), then, using the semi-ring operations of Table 2.3, we define the steady-state matrix of the production graph of this grammar as

M≤N(G) = ∑_{i=1}^{N} [M(G)]^i,  (2.25)

where N is the dimension of the transition matrix M(G). [5] have proven that if diag M≤N(G) does not contain the label 'b', the corresponding grammar G is NSE. This demonstrates that the NSE property of a CFG can be verified in polynomial time. To illustrate the application of this algorithm, let us revisit the example CFG (2.22). The labeled production graph for this grammar is shown in Figure 2.16, and the transition matrix of this graph is given by (2.24). Here diag M≤N(G) = [l, l, l, l, l, 0, 0]; therefore, CFG (2.22) is NSE. In the remainder of this section, we will describe a three-step procedure that accepts an NSE CFG and automatically synthesizes a Finite-State Automaton (FSA) that is equivalent to this grammar.

2.4.1.2 Grammatical Decomposition

We start by introducing the concept of the grammatical ⊕-composition.
Definition If G1 = (A1, E1, Γ1, S1) and G2 = (A2, E2, Γ2, S2) are two CFGs with E1 ∩ E2 = ∅ and E1 ∩ A2 = ∅, then the ⊕-composition of these grammars is defined as

G = G1 ⊕ G2 = (A, E, Γ, S)

where A = (A1 \ E2) ∪ A2, E = E1 ∪ E2, Γ = Γ1 ∪ Γ2, and S = S1.

[5] have demonstrated that for any NSE CFG G there exist n FSGs G1, G2, . . . , Gn such that G = G1 ⊕ G2 ⊕ . . . ⊕ Gn. They have also shown that every FSG Gi of this decomposition is equivalent to some strongly-connected component of the production graph P(G). The grammatical decomposition procedure consists of the following steps.

1. Let P1(G), P2(G), . . . , Pn(G) be the n strongly-connected components of the production graph P(G). Then Ei of the FSG Gi is the same as the set of vertices in Pi(G).
2. The set of terminal symbols of the FSG Gi is found through the following recursive relationship:
An = A
An−1 = A ∪ En
. . .
A2 = A ∪ E3 ∪ . . . ∪ En
A1 = A ∪ E2 ∪ . . . ∪ En
3. The set of grammatical production rules Γi ⊆ Γ is defined as Γi = {A → α | A ∈ Ei}.
4. Finally, the start symbol S1 for the FSG G1 is chosen as S0 of the original NSE CFG G, and Si for i = 2, . . . , n is chosen to be an arbitrary nonterminal from the corresponding set Ei.

One of the most efficient procedures to decompose a directed graph into a set of strongly-connected components involves the Dulmage-Mendelsohn decomposition [34] of the production graph's adjacency matrix. This decomposition finds a permutation of the vertex ordering that renders the adjacency matrix into upper block triangular form. Each block triangular component of the transformed adjacency matrix corresponds to a strongly-connected component of the production graph. Now consider an example of the decomposition procedure applied to grammar (2.22). Its production graph includes four strongly-connected components, shown in Figure 2.17.
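The strongly-connected components can be computed with any standard algorithm; the chapter uses the Dulmage-Mendelsohn decomposition, while the dependency-free sketch below uses Kosaraju's algorithm instead, applied to the production graph of (2.22):

```python
# Production-graph edges of grammar (2.22): each nonterminal points to the
# nonterminals appearing on its right-hand sides.
EDGES = {
    "S": ["D", "A"], "A": ["E", "B"], "B": ["E", "S"],
    "C": ["D"], "D": ["C"], "E": ["D", "C", "F"], "F": [],
}

def sccs(edges):
    """Strongly-connected components via Kosaraju's two-pass algorithm."""
    order, seen = [], set()
    def dfs(v):                      # first pass: record finish order
        seen.add(v)
        for w in edges[v]:
            if w not in seen:
                dfs(w)
        order.append(v)
    for v in edges:
        if v not in seen:
            dfs(v)
    rev = {v: [] for v in edges}     # transpose graph
    for v, ws in edges.items():
        for w in ws:
            rev[w].append(v)
    comps, assigned = [], set()
    for v in reversed(order):        # second pass on the transpose
        if v in assigned:
            continue
        comp, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u in assigned:
                continue
            assigned.add(u)
            comp.add(u)
            stack.extend(w for w in rev[u] if w not in assigned)
        comps.append(comp)
    return comps

components = sccs(EDGES)
```

The result contains the four components {S, A, B}, {E}, {C, D}, and {F} of Figure 2.17.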
Figure 2.17: Strongly-connected components of the production graph shown in Figure 2.16.

The four FSG components of this CFG are:

G1 = { A1 = {a, b, c, d, C, D, E},  E1 = {S, A, B},  Γ1 = {S → DA, A → bEaB, B → aE | S},  S1 = S },  (2.26a)
G2 = { A2 = {a, b, c, d, C, D, F},  E2 = {E},  Γ2 = {E → D | Cc | aF | Fc},  S2 = E },  (2.26b)
G3 = { A3 = {a, b, c, d},  E3 = {C, D},  Γ3 = {C → bD, D → daC | a},  S3 = {X | X ∈ E3} },  (2.26c)
G4 = { A4 = {a, b, c, d},  E4 = {F},  Γ4 = {F → bd},  S4 = F },  (2.26d)

where, in (2.26c), X may represent either of the nonterminals contained in E3.

Figure 2.18: Components of the finite-state automaton for the grammar (2.22). States that are not labeled in these graphs are considered intermediate and have no direct representation in the set of nonterminals of the grammars. (a) corresponds to the grammar (2.26a), (b) corresponds to the grammar (2.26b), (c) corresponds to the grammar (2.26c), and (d) corresponds to the grammar (2.26d).

2.4.1.3 Synthesis of Finite-State Components

The next step involves synthesis of an individual FSA for each of the FSGs G1, G2, . . . , Gn obtained at the step of grammatical decomposition. This is a straightforward mechanical procedure, well described in the literature [4; 2; 24; 25; 26]. The FSA for the example grammars (2.26) are shown in Figure 2.18.
They have the following structure:

Λ1 = { Σ1 = A1,  Q1 = E1 ∪ Q1,  δ1 = δ(Γ1),  q0 = {S},  F1 = H ∈ Q1 },  (2.27a)
Λ2 = { Σ2 = A2,  Q2 = E2 ∪ Q2,  δ2 = δ(Γ2),  q0 = {E},  F2 = H ∈ Q2 },  (2.27b)
Λ3 = { Σ3 = A3,  Q3 = E3 ∪ Q3,  δ3 = δ(Γ3),  q0 = {{X} | X ∈ E3},  F3 = H ∈ Q3 },  (2.27c)
Λ4 = { Σ4 = A4,  Q4 = E4 ∪ Q4,  δ4 = δ(Γ4),  q0 = {F},  F4 = H ∈ Q4 },  (2.27d)

where Qi are the sets of intermediate states required for the construction of automaton i that are not present in the set of nonterminals of the corresponding grammar Gi. Note that we present here simplified versions of the FSA, highlighting only the important structural aspects. Specifically, in Figures 2.18 (c) and (d), transitions labeled with the terminal strings 'da' and 'bd', when rigorously treated, require the insertion of an intermediate state. Also, before constructing an FSA, the grammar should be converted to the unit-production-free form, so that every edge in the graphs in Figure 2.18 corresponds to the generation of a single nonterminal. We will suppress these intermediate steps in the rest of this paper and, without loss of generality, adopt a slightly simplified form of FSA representation.

2.4.1.4 Composition of the Finite-State Automaton

The final step of the FSA synthesis procedure for a given NSE CFG involves composition of the FSA components obtained at the previous step.

Algorithm 1 Procedure createFSA creates a Finite-State Automaton.
1: procedure createFSA(S0, Λ1, . . . , Λn)    ▷ Creates an FSA from components Λ1, . . . , Λn
2:   Σ ← {S0}                                 ▷ Initializing the set of transitions
3:   Q ← {q′, H}                              ▷ Adding intermediate states
4:   δ ← {δ(q′, S0) = {H}}                    ▷ Adding the initial transition
5:   q0 ← q′                                  ▷ Setting the initial state
6:   F ← {H}                                  ▷ Setting the terminal states
7:   Λ ← (Σ, Q, δ, q0, F)                     ▷ Initializing the FSA
8:   Λ ← expandFSA(Λ, Λ1, . . . , Λn)         ▷ Calling the expansion procedure
9: end procedure
A recursive "depth-first" algorithm that performs this operation was developed by [33], and a modified version was presented by [20]. Here we present an alternative, "breadth-first" algorithm. The FSA composition procedure is formalized in terms of the two algorithms createFSA and expandFSA detailed here. The main procedure createFSA initializes the composition operation and calls the function expandFSA, which performs the actual FSA composition. We will illustrate this procedure by an example composing the FSA components shown in Fig. 2.18. The initialization step involves creation of a "dummy" FSA that contains two states – an intermediate state and a terminal state. It also contains one transition from the intermediate state to the terminal state on symbol S0 = S from the original NSE CFG (2.22). This FSA is shown in Fig. 2.19 (a). The function expandFSA accepts the "dummy" FSA as well as the four FSA components shown in Fig. 2.18. It then transforms the "dummy" FSA into a real automaton by consecutively inserting FSA components and re-wiring transitions. The step-by-step composition procedure is illustrated in Fig. 2.19 (b)–(e). First, the FSA shown in Fig. 2.18 (a) is inserted into the FSA in Fig. 2.19 (a) in place of the transition labeled S. The resulting intermediate FSA is shown in Fig. 2.19 (b). Next, all the E-transitions are replaced with the FSA in Fig. 2.18 (b). The resulting intermediate FSA is shown in Fig. 2.19 (c). Then, all the F-transitions are replaced with the FSA in Fig. 2.18 (d). The resulting intermediate FSA is shown in Fig. 2.19 (d). Finally, all the C- and D-transitions are replaced with the FSA in Fig. 2.18 (c). The final automaton equivalent to the grammar (2.22) is shown in Fig. 2.19 (e). Note that the labels of the states in Fig. 2.19 are not unique state identifiers. These labels are shown to illustrate the composition procedure and to provide linkage with the states of the original FSA shown in Fig. 2.18.

Algorithm 2 Function expandFSA inserts FSA components into the Finite-State Automaton Λ.
1: function expandFSA(Λ, Λ1, . . . , Λn)      ▷ Inserts FSA components into Λ
2:   for all Λi and α ∈ Σ do
3:     if α ∈ Qi then                         ▷ If a transition matches a state
4:       qfrom ← argq(δ(qα, α))               ▷ Saving the "from" state
5:       qto ← δ(qfrom, α)                    ▷ Saving the "to" state
6:       Σ ← Σ \ α ∪ Σi                       ▷ Expanding the set of transitions
7:       Q ← Q ∪ Qi                           ▷ Expanding the set of states
8:       δ ← δ ∪ δi                           ▷ Appending the transition structure
9:       for all δ(qj, β) == qfrom do
10:        δ(qj, β) ← α                       ▷ Rerouting qfrom inputs
11:      end for
12:      for all δ(qj, β) ∈ Fi do
13:        δ(qj, β) ← qto                     ▷ Rerouting qto inputs
14:      end for
15:      Q ← Q \ Fi                           ▷ Removing terminal states of Λi
16:      Q ← Q \ qfrom                        ▷ Removing the qfrom state
17:    end if
18:  end for
19:  return Λ                                 ▷ Returning the complete FSA Λ
20: end function

Figure 2.19: Synthesis procedure of the Finite-State Automaton for the example CFG (2.22).

2.4.2 State Machine Synthesis of the Mercury Radar Controller

As stated in [25], the analysis of stochastic and weighted grammars must be performed using their characteristic counterparts.
However, since the characteristic grammar is structurally so close to the original deterministic grammar, we can perform the NSE property test directly on the deterministic CFG. Following the verification procedure described in Sec. 2.4.1.1, the diagonal elements of the steady-state analysis matrix can be shown to be

diag M≤N(G) = [0 0 1 1 1 1 1 1 1 0 ... 0]^T,

which confirms that the Mercury radar controller grammar is an NSE CFG. Therefore, a Finite-State Automaton of the radar controller can be synthesized from the grammar of Fig. 2.13. Using the Dulmage-Mendelsohn decomposition, 29 strongly-connected components of the production graph of the CFG of the Mercury emitter were obtained. As shown in Sec. 2.1, each strongly connected component of the production graph corresponds to a Finite-State Grammar. Finite-State Automata for each of these FSGs are shown in Fig. 2.21 and Fig. 2.22. The complete state machine of the Mercury emitter can be obtained by applying the FSA composition operation.4 The final step of the synthesis procedure involves transformation of the deterministic state machine of the Mercury emitter into a stochastic model, taking into account the probability distributions determined by the structure of the stochastic grammar shown in Fig. 2.13. At this stage, the probabilistic elements of the problem that led to the development of the stochastic radar grammar (e.g., the channel impairment probability distribution (2.19)) are brought into the structure of the radar model. This conversion procedure is illustrated in Sec. 2.1 in (2.12)–(2.15). The procedure is essentially a simple mechanical operation of converting the Mealy state machine to the Moore automaton, assigning the probabilities of transitions as shown in (2.13) and the probabilities of observations as demonstrated in (2.15).

4 Due to the very large size of the final Mercury state machine, we do not include it in this chapter.
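The Mealy-to-Moore conversion mentioned above is a standard automata construction, and a minimal sketch may help fix ideas (hypothetical names, not the thesis code): each Moore state is a (state, output) pair, so the output emitted on a Mealy transition becomes the output attached to the destination Moore state.

```python
def mealy_to_moore(start, delta):
    """delta: {(state, input): (next_state, output)}.  Returns the Moore
    start state and transition map; Moore states are (state, output)
    pairs whose second component is the emitted output."""
    moore, todo, seen = {}, [(start, None)], {(start, None)}
    while todo:
        q, _ = state = todo.pop()
        for (p, a), (q2, out) in delta.items():
            if p != q:
                continue
            succ = (q2, out)           # destination Moore state emits `out`
            moore[(state, a)] = succ
            if succ not in seen:
                seen.add(succ)
                todo.append(succ)
    return (start, None), moore

def run(moore_start, moore, word):
    """Feed `word` to the Moore machine and collect the emitted outputs."""
    q, out = moore_start, []
    for c in word:
        q = moore[(q, c)]
        out.append(q[1])
    return "".join(out)

# A two-state Mealy machine: from A, input '0' stays in A emitting 'x',
# input '1' moves to B emitting 'y'; from B, input '0' returns to A emitting 'x'.
mealy = {("A", "0"): ("A", "x"), ("A", "1"): ("B", "y"), ("B", "0"): ("A", "x")}
m_start, m_delta = mealy_to_moore("A", mealy)
```

The assignment of transition and observation probabilities, as in (2.13) and (2.15), would then be layered on top of the converted automaton.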
Figure 2.20: Production graph of the Mercury grammar.

2.4.3 Finite-State Signal Processing

The finite-state model of the radar controller grammar is based on the Hidden Markov Model (HMM). Once this model is obtained, the forward-backward algorithm, also known as the HMM filter, can be applied to statistical signal processing based on this model. This algorithm is well known and studied in detail in [49; 23].

2.5 Conclusion

The main idea of this paper is to model and characterize multi-function radars (MFRs) as string generating devices, where the control rules are specified in terms of a stochastic context free grammar (SCFG). This is unlike the modeling of targets, where Hidden Markov and state space models are adequate [9; 8]. The threat of the MFR is recognized as its policy of operation, and it is modeled as a Markov chain modulating the probabilities of the MFR's production rules. The paper shows how a large-scale dynamical system such as a multi-function radar can be expressed by a compact representation, and demonstrates the flexibility of translating from one representation to another if the non-self-embedding property is satisfied. Based on the SCFG representation, a maximum likelihood sequence estimator is derived to evaluate the threat posed by the MFR, and a maximum likelihood parameter estimator is derived to infer the unknown system parameters with the Expectation Maximization algorithm. Since SCFGs are multi-type Galton-Watson branching processes, the algorithms proposed in this paper can be viewed as filtering and estimation of a partially observed multi-type Galton-Watson branching process.

Figure 2.21: Mercury state machine components.

For details of numerical examples of the constructed model, and a study of its predictive power (entropy), please see [51]. Several extensions of the ideas in this paper are worth considering: 1. The algorithms studied in this paper are inherently off-line.
It is of interest to study stochastic approximation algorithms for adaptive learning of the MFR grammar and real-time evaluation of the threat. For HMMs, such real-time state and parameter estimation algorithms are well known [28].

Figure 2.22: Mercury state machine components.

2. In this paper we have constructed SCFG models for the MFR radar as it responds to the dynamics of a target. Recall from Fig. 2.6 that in this paper we are interested in electronic warfare from the target's point of view. An interesting extension of this paper that we are currently considering is optimizing the trajectory of the target to maximize the amount of information obtained from the CFG. Such trajectory optimization can be formulated as a stochastic control problem involving an SCFG (or, equivalently, a Galton-Watson branching process). 3. The SCFG signal processing algorithms presented in this paper consider an iid channel impairment model. It is important to extend this to Rayleigh fading channels. Sequential Monte Carlo methods, such as particle filtering, can be applied to cope with fading channels. 4. In this paper we deal exclusively with optimizing the radar signal at the word level. Analogous to cross-layer optimization in communication systems [58], cross-layer optimization can be applied to radar signal processing at the pulse and word levels. 5. This paper deals exclusively with modeling and identifying MFRs in open loop. That is, we do not model the MFR as a feedback system which optimizes its task allocation in response to the target. See [56] for a Lagrangian MDP formulation of radar resource allocation. Modeling, identifying and estimating an MFR in closed loop is a challenging task and will require sophisticated real-time processing (see point 1 above).

Bibliography

[1] J. K. Aggarwal and Q. Cai. Human motion analysis: A review. Computer Vision and Image Understanding, 73:428–440, 1999. [2] A. V. Aho and J. D. Ullman.
The Theory of Parsing, Translation and Compiling, volume I: Parsing. Prentice-Hall, Englewood Cliffs, NJ, 1972. [3] A. V. Aho and J. D. Ullman. The Theory of Parsing, Translation and Compiling, volume II: Compiling. Prentice-Hall, Englewood Cliffs, NJ, 1973. [4] Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, MA, 1986. [5] Marcella Anselmo, Dora Giammarresi, and Stefano Varricchio. Finite automata and non-self-embedding grammars. In 7th International Conference on Implementation and Application of Automata (CIAA'02), volume 2608 of Lecture Notes in Computer Science, pages 47–56, Tours, France, July 2003. [6] J. K. Baker. Trainable grammars for speech recognition. Speech Communication Papers for the 97th Meeting of the Acoustical Society of America, pages 547–550, 1979. [7] Pierre Baldi and Soren Brunak. Bioinformatics: The Machine Learning Approach. MIT Press, Cambridge, MA, second edition, 2001. [8] Y. Bar-Shalom and X. Li. Estimation and Tracking: Principles, Techniques, and Software. Artech House, 1993. [9] S. S. Blackman and R. Popoli. Design and Analysis of Modern Tracking Systems. Artech House, 1999. [10] P. L. Bogler. Radar Principles with Applications to Tracking Systems. John Wiley & Sons, 1990. [11] C. G. Cassandras and S. Lafortune. Introduction to Discrete Event Systems. Springer, 1999. [12] Z. Chi. Statistical properties of probabilistic context-free grammars. Computational Linguistics, 25:131–160, 1999. [13] T. Chiu. The development of an intelligent radar tasking system. Intelligent Information Systems, pages 437–441, 1994. [14] N. Chomsky. Three models for the description of language. IRE Transactions on Information Theory, 2(3):113–124, 1956. [15] N. Chomsky. A note on phrase structure grammars. Information and Control, 2(4):393–395, December 1959. [16] N. Chomsky. On certain formal properties of grammars. Information and Control, 2(2):137–167, June 1959. [17] N.
Chomsky and G. A. Miller. Finite state languages. Information and Control, 1(2):91–112, May 1958. [18] John R. Deller, John H. L. Hansen, and John G. Proakis. Discrete-Time Processing of Speech Signals. Wiley-IEEE, 1999. [19] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39:1–38, 1977. [20] Fred Dilkes and Nikita Visnevski. Non-self-embedding context-free grammars for electronic warfare. Technical Report DREO TM 2004-157, Defence Research & Development Canada, Ottawa, Canada, October 2004. [21] R. Durbin, S. Eddy, A. Krogh, and G. Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998. [22] Robert J. Elliott, Lakhdar Aggoun, and John B. Moore. Hidden Markov Models: Estimation and Control. Springer Verlag, New York, NY, 1995. [23] Yariv Ephraim and Neri Merhav. Hidden Markov processes. IEEE Transactions on Information Theory, 48(6):1518–1569, June 2002. [24] K. S. Fu. Syntactic Methods in Pattern Recognition. Academic Press, New York, NY, 1974. [25] K. S. Fu. Syntactic Pattern Recognition and Applications. Prentice-Hall, Englewood Cliffs, NJ, 1982. [26] J. E. Hopcroft, R. Motwani, and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, MA, second edition, 2001. [27] Y. A. Ivanov and A. F. Bobick. Recognition of visual activities and interactions by stochastic parsing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:852–872, 2000. [28] V. Krishnamurthy and G. Yin. Recursive algorithms for estimation of hidden Markov models and autoregressive models with Markov regime. IEEE Transactions on Information Theory, 48(2):458–476, 2002. [29] Werner Kuich and Arto Salomaa. Semirings, Automata, Languages. EATCS Monographs on Theoretical Computer Science. Springer Verlag, New York, NY, 1986. [30] K. Lari and S. J. Young.
The estimation of stochastic context-free grammars using the Inside-Outside algorithm. Computer Speech and Language, 4:35–56, 1990. [31] C. D. Manning and H. Schütze. Foundations of Statistical Natural Language Processing. The MIT Press, 1999. [32] M. I. Miller and A. O'Sullivan. Entropies and combinatorics of random branching processes and context-free languages. IEEE Transactions on Information Theory, 38:1292–1310, 1992. [33] Mark Nederhof. Regular approximations of CFLs: A grammatical view, chapter 12, pages 221–241. Advances in Probabilistic and Other Parsing Technologies. Kluwer Academic Publishers, Boston, MA, January 2000. [34] Alex Pothen and Chin-Ju Fan. Computing the block triangular form of a sparse matrix. ACM Transactions on Mathematical Software, 16(4):303–324, December 1990. [35] L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, February 1989. [36] Lawrence Rabiner and Biing-Hwang Juang. Fundamentals of Speech Recognition. Prentice Hall, Englewood Cliffs, NJ, 1993. [37] E. Rivas and S. R. Eddy. The language of RNA: a formal grammar that includes pseudoknots. Bioinformatics, 16:334–340, 2000. [38] J. Roe and A. Pudner. The real-time implementation of emitter identification for ESM. In Signal Processing in Electronic Warfare, IEE Colloquium on, pages 7/1–7/6, 1994. [39] J. A. V. Rogers. ESM processor system for high pulse density radar environments. IEE Proceedings F, 132:621–625, 1985. [40] S. Sabatini and M. Tarantino. Multifunction Array Radar System Design and Analysis. Artech House, 1994. [41] Y. Sakakibara. Grammatical inference in bioinformatics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27:1051–1062, 2005. [42] Arto Salomaa. Theory of Automata. Pergamon Press, Elmsford, NY, 1969. [43] Arto Salomaa. Formal Languages. Academic Press, New York, NY, 1973. [44] Arto Salomaa. Computation and Automata.
Cambridge University Press, Cambridge, UK, 1985. [45] D. C. Schleher. Electronic Warfare in the Information Age. Artech House, 1999. [46] M. I. Skolnik. Introduction to Radar Systems. McGraw-Hill, 2002. [47] N. A. Visnevski. Syntactic Modeling of Multi-Function Radars. PhD thesis, McMaster University, 2005. [48] N. A. Visnevski, F. A. Dilkes, S. Haykin, and V. Krishnamurthy. Non-self-embedding context-free grammars for multi-function radar modeling - electronic warfare application. In International Radar Conference, pages 669–674, 2005. [49] Nikita Visnevski, Vikram Krishnamurthy, Simon Haykin, Brian Currie, Fred Dilkes, and Pierre Lavoie. Multi-function radar emitter modelling: A stochastic discrete event system approach. In Proc. IEEE Conference on Decision and Control (CDC'03), pages 6295–6300, Maui, Hawaii, USA, December 2003. [50] A. Wang and V. Krishnamurthy. Threat estimation of multifunction radars: modeling and statistical signal processing of stochastic context free grammars. In ICASSP, volume 3, pages III-793–III-796, 2007. [51] A. Wang and V. Krishnamurthy. Threat estimation of multifunction radars: modeling and statistical signal processing of stochastic context free grammars. IEEE Transactions on Signal Processing, 56:1106–1119, 2008. [52] A. Wang, V. Krishnamurthy, F. A. Dilkes, and N. A. Visnevski. Threat estimation by electronic surveillance of multifunction radars: a stochastic context free grammar approach. In IEEE Conference on Decision and Control, pages 2153–2158, December 2006. [53] N. J. Whittall. Signal sorting in ESM systems. IEE Proceedings F, 132:226–228, 1985. [54] R. G. Wiley. Electronic Intelligence: The Analysis of Radar Signals. Artech House, 1993. [55] D. R. Wilkinson and A. W. Watson. Use of metric techniques in ESM data processing. IEE Proceedings F, 132:229–232, 1985. [56] J. Wintenby and V. Krishnamurthy. Hierarchical resource management in adaptive airborne surveillance radars.
IEEE Transactions on Aerospace and Electronic Systems, pages 401–420, 2006. [57] C. F. J. Wu. On the convergence properties of the EM algorithm. Annals of Statistics, 11(1):95–103, 1983. [58] F. Yu and V. Krishnamurthy. Optimal joint session admission control in integrated WLAN and CDMA cellular network. IEEE/ACM Transactions on Mobile Computing, accepted. [59] W. Zhu and J. Garcia-Frias. Modeling of bursty channels using stochastic context-free grammars. In Proceedings of the 55th Vehicular Technology Conference (VTC 2002), volume 1, pages 355–359, Birmingham, AL, USA, May 2002. [60] W. Zhu and J. Garcia-Frias. Stochastic context-free grammars and Hidden Markov Models for modeling of bursty channels. IEEE Transactions on Vehicular Technology, 53(3):666–676, May 2004.

Chapter 3

Signal Interpretation of Multifunction Radars: Modeling and Statistical Signal Processing with Stochastic Context Free Grammar1

Electronic support measure, a division of electronic warfare, involves intercepting and interpreting radiated electromagnetic energy in order for an operational commander to locate and identify radar sources and evaluate their potential threats. The electronic support algorithm described in this paper considers the self-protection of the target from radar threats, a major component of which is the interpretation of the intercepted radar pulses in terms of the possible radar modes, such as "search" and "track maintenance". In the current problem setup, because we focus on the target's perspective, the radar model is simplified by removing its multiple target tracking capability, and we limit the scenario to having only one multifunction radar in the proximity of the target. In building electronic support systems to analyze radar signals, statistical pattern recognition has been used extensively. Conventional radars could be characterized by fixed parameters such as radio frequency, pulse-width and peak amplitude [43; 42].
For such radar characterizations, a decision-theoretic approach, as in statistical pattern recognition, is sufficient for solving signal processing problems such as emitter identification and threat evaluation. Template matching of the intercepted radar signal against an EW library, for both emitter type and emitter mode identification, is discussed in [44; 30]. Histogram techniques are described in [31] to study temporal periodicities in radar signals such as pulse repetition intervals. However, modern radars, especially multi-function radars (MFRs), make the statistical pattern recognition approach inadequate. MFRs are radio-frequency sensors with beam-steering antennas that are widely used in modern surveillance and tracking systems, and they have the capability to perform a multitude of different tasks simultaneously by multiplexing them in time using short time slices [36]. The list of these tasks includes search, acquisition, multiple target tracking and weapon guidance [32]. At the same time they maintain a low probability of being detected and jammed. The reasons for the inadequacy of statistical pattern recognition are twofold. The first is the exploding dimension of the feature space due to the versatility of the radar: the possible variation of radar parameters such as the carrier frequency and radar pulse width makes statistical pattern recognition infeasible. The second is the possibly time-varying feature space necessary for correct recognition: because of the time multiplexing capability of the radar, the underlying representation of the radar may need to vary in order to capture the dynamics of the radar.

1 A version of this chapter has been published. Wang, A. and Krishnamurthy, V. (2008) Signal Interpretation of Multifunction Radars: Modeling and Statistical Signal Processing with Stochastic Context Free Grammars. IEEE Transactions on Signal Processing. 56:1106-1119.
This paper considers a hybrid algorithm of both statistical and syntactic pattern recognition techniques. The methodology is to codify all the a priori knowledge available and to analyze the observables within the context of that a priori knowledge. Because of the success of formal languages in codifying human language, we propose to embody radar domain knowledge in a modified language representation, and implement signal interpretation as a parsing operation over the radar pulses. In this representation, radar pulses are analogous to English letters, and control rules of pulse generation to English grammar. The origins of syntactic modeling can be traced to the classic works of Noam Chomsky on formal languages and transformational grammars [11; 14; 13; 12]. Among the many grammars and languages that have been investigated for practical applications, finite state grammar (FSG) and context free grammar (CFG), as well as their stochastic counterparts, stochastic FSG and stochastic CFG, are currently the most widely used classes of grammars. The application of these grammars to syntactic pattern recognition is covered in depth in [17]. In [21], SCFG is applied to study gesture recognition and monitoring of an online parking lot. In [46; 47], the dynamics of a bursty wireless communications channel is modeled with SCFGs. [3; 33] describe syntactic modeling applied to bioinformatics, and [16; 29] apply these models to the study of biological sequence analysis and RNA. In addition, on a topic more closely related to this paper, SCFG is studied in [37; 27] as an alternative approach to plan recognition. In this paper, we model MFRs as Markov-modulated SCFGs to take into account the MFR's mode dependent behaviour, its hierarchical control, and its control law consisting of operational rules. More traditional approaches such as Hidden Markov and state space models are suitable for target modeling [6; 4], but not radar modeling.
Traditionally, MFRs' signal modes were represented by volumes of parameterized data records known as Electronic Intelligence (ELINT) [43]. The data records are annotated by lines of text explaining when, why and how a signal may change from one mode to another. This makes radar mode estimation and threat evaluation fairly difficult. In [38; 40], SCFG is introduced as a framework to model MFRs' signals, and it is shown that MFRs' dynamic behavior can be explicitly described using a finite set of rules corresponding to the production rules of the SCFG. SCFG has several potential advantages: (i) SCFG is a compact formal representation that forms a homogeneous basis for modeling and storing complex system domain knowledge [1; 17; 20], and in which it is simpler and more natural for the model designer to express the control rules of an MFR [38]. Specifying the production rules of the SCFG allows convenient modeling of the human-computer interface. (ii) SCFG is more efficient in modeling hidden branching processes than stochastic regular grammars or hidden Markov models with the same number of parameters. The predictive power of an SCFG, measured in terms of entropy, is greater than that of the stochastic regular grammar [23]. An SCFG is equivalent to a multi-type Galton-Watson branching process with a finite number of rewrite rules, and its entropy calculation is discussed in [26]. (iii) The recursive embedding structure of MFRs' control rules is more naturally modeled in SCFG. As we will show later, the Markovian-type model has dependencies of variable length, and the growing state space is difficult to handle since the maximum-range dependency must be considered. In summary, the main results of the paper are: 1. A careful detailed model of the dynamics of an MFR using formal language production rules.
By modeling the MFR dynamics using a linguistic formalism such as an SCFG, an MFR can be viewed as a discrete event system that "speaks" some known, or partially known, formal language [9]. Observations of radar emissions can be viewed as strings from this language, corrupted by the noise in the observation environment. 2. Novel use of Markov modulated SCFGs to model radar emissions generated by an MFR. The complex embedding structure of the radar signal is captured by the linguistic model, the SCFG, and the MFR's internal state is modeled by a Markov chain. This modeling approach enables the combination of the grammar's syntactic modeling power with the rich theory of Markov decision processes. 3. Statistical signal processing of SCFGs. The threat evaluation problem is reduced to a state estimation problem. A maximum likelihood estimator is derived based on a hybrid of the forward-backward and the inside-outside algorithms. (The inside-outside algorithm is an extension of the HMM's forward-backward algorithm [2].) 4. By parameterizing the MFR model with the target's maneuvering models, the interaction between the target and the MFR is studied. The target's probing of the MFR, in order to find a maneuvering model that maximizes its safety, is formulated as a discrete stochastic approximation problem, and a simulation study of the problem is performed. The rest of the paper is organized as follows. Sec. 3.1 describes the multifunction radar in detail and its role in electronic warfare. Sec. 3.2 models the MFR's command generation mechanism, detailing the construction of the Markov chain in terms of the MFR's goals and subgoals, and the MFR's hierarchical control as a set of syntactic rules. Sec. 3.3 presents the threat estimation algorithm and the discrete stochastic approximation algorithm, and Sec. 3.4 provides the numerical studies. Finally, Sec. 3.5 concludes the paper.
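As a concrete anchor for the inside-outside machinery referenced in result 3 above, the inside probabilities (the SCFG analogue of the HMM forward variables) can be computed with the classic CYK-style recursion. The toy grammar below is hypothetical and is assumed to be in Chomsky normal form.

```python
import math
from collections import defaultdict

def inside(word, unary, binary, start="S"):
    """beta[(i, j, A)] = P(A derives word[i..j]).  unary maps
    (A, terminal) -> prob for rules A -> terminal; binary maps
    (A, B, C) -> prob for rules A -> B C (Chomsky normal form)."""
    n = len(word)
    beta = defaultdict(float)
    for i, a in enumerate(word):                 # base case: A -> terminal
        for (A, t), p in unary.items():
            if t == a:
                beta[(i, i, A)] += p
    for span in range(2, n + 1):                 # recursion on span length
        for i in range(n - span + 1):
            j = i + span - 1
            for (A, B, C), p in binary.items():
                for k in range(i, j):            # split point
                    beta[(i, j, A)] += p * beta[(i, k, B)] * beta[(k + 1, j, C)]
    return beta[(0, n - 1, start)]               # P(start derives word)

# Toy SCFG: S -> S S (0.6) | 'a' (0.4)
unary = {("S", "a"): 0.4}
binary = {("S", "S", "S"): 0.6}
```

For this grammar, inside("a", unary, binary) = 0.4 and inside("aa", unary, binary) = 0.6 · 0.4 · 0.4 = 0.096; the outside pass and EM re-estimation of rule probabilities are built on the same table.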
3.1 Electronic Support and MFR

Electronic Warfare (EW) can be broadly defined as any military action with the objective of controlling the electromagnetic spectrum [34]. An important aspect of EW is the radar-target interaction. In general, this interaction can be examined from two entirely different viewpoints, that of the radar and that of the target. From the radar's viewpoint, the goal is to detect and identify targets, and to maintain a firm track. From the target's viewpoint, the goal is to protect itself from radar-equipped threats by interpreting intercepted radar emissions and evaluating their threat (electronic support, or ES). In this paper, the target's viewpoint is the focus, and MFRs are the specific threat considered. The approach taken in this paper to interpret the MFR signal is knowledge-based. The raw radar signal is interpreted with respect to a grammatical model that describes its characteristics; the characteristic of interest is the order of the events detected, while the event occurrence times are not of much importance. The signal interpretation consists of two main components, a signal-to-symbol transformer and a symbolic inference engine. Fig. 3.1 illustrates the two components in the context of the ES architecture, and a brief description is given here: The receiver processes the radar pulses intercepted by the antenna, and outputs a sequence of pulse descriptor words (PDWs), where a PDW is a data structure containing parameters such as carrier frequency, pulse amplitude and pulse width of an individual pulse. The PDWs are then processed by the deinterleaver, and segregated according to their originating radar emitters. The pulse train analyzer further processes the deinterleaved PDWs, and classifies them into abstract symbols called radar words. (See Sec. 3.1.1 for definitions.)
Finally, the symbolic inference engine analyzes the syntactic structure of the radar words, interprets the threat level, and outputs the results on a pilot instrumentation panel. Because the receiver, deinterleaver and pulse train analyzer have been well studied, the signal-to-symbol transformer is not covered in this paper; we focus only on the symbolic inference engine. Using an analogy between the structural description of the radar signal and the syntax of a human language, a symbolic inference engine is said to contain the prior domain-specific knowledge of the "language" MFRs "speak". The knowledge consists of the operational rules and constraints, captured by the radar analysts, that are believed to be applied in the generation of the radar signal for each specific mission goal. Such knowledge allows the radar analysts to distinguish a "grammatical" radar signal from an "ungrammatical" one, and to reason about the particular mission goal the MFR is executing. In today's modern radar systems, the operational rules are often implemented with fuzzy logic or expert systems [6], and conventional mathematical formalisms such as differential and difference equations are not effective in analyzing them. Instead, in order to compactly store the syntactic knowledge of the MFR's language, formal language theory is applied, and the MFR language is fully specified by the establishment of a grammar [20]. As far as ES is concerned, the optimal approach would be to collect a corpus of radar samples and induce the grammar directly without human intervention. However, because of the degree of complexity of, and the potential lack of data on, the MFR signal, a grammatical induction approach is impractical.

Figure 3.1: The figure illustrates the Electronic Support (ES) framework considered in this paper. The radar signal emitted by the MFR is captured by the ES system on board the target after being corrupted by the stochastic environment. The system consists of an antenna, a signal-to-symbol transformer and a symbolic inference engine. The signal-to-symbol transformer consists of a receiver/deinterleaver and a pulse train analyzer, and its main purpose is to map the raw radar signal to abstract symbols that are recognizable by the symbolic inference engine. The symbols are identified as a and b in the figure.

In this paper, stochastic context free grammar is chosen to model the MFR signal for each of its mission goals because of its generality over the Hidden Markov and state space models, and because of the existence of algorithms for parameter estimation. The context-free backbone is constructed from the domain-specific knowledge of the MFRs' signal generation mechanism. Sec. 3.1.1 describes the MFR's domain-specific knowledge that will be used to construct the model for knowledge-based signal processing.

3.1.1 MFR System Architecture and Its Signal Generation Mechanism

Before discussing the MFR architecture, we begin by describing the radar signal that is generated by the different layers of the MFR command generation hierarchy. The list below begins with the actual radar pulses generated by the MFR, moves to the software objects that are scheduled by the MFR processor, and ends with the radar policy that governs the scheduling process: • Radar word: A fixed arrangement of a finite number of pulses, for example pulses with a fixed pulse repetition frequency. • Radar command: A catenation of a finite number of radar words that is optimized for extracting certain target information. Examples are target acquisition and non-adaptive track.
• Radar task: The three main radar tasks are search, target identification and target tracking, and each is implemented by a template of radar commands designed to achieve the tactical goal. • Radar mode: The constraints or emphasis on the execution of certain radar tasks due to the mission requirements or resource allocations.

Figure 3.2: Radar signal corresponding to different layers of the radar command generation hierarchy. A radar task consists of a sequence of radar commands that would best achieve a tactical goal, and each radar command can be mapped to a certain catenation of radar words that the MFR is to execute.

An example of the above radar signal is illustrated in Fig. 3.2. The radar task and the radar commands in the example are self-explanatory, and the letters 'a' and 'b' denote radar words. The vertical bars represent radar pulses, and a particular arrangement of them makes up the radar words. Following the macro/micro architecture as described in Sec. 15.5.6 of Blackman and Popoli [6], the generation of the radar signal is modeled by an MFR composed of four basic components2: a situation assessment, a radar manager, a command scheduler, and a radar controller, which are illustrated in Fig. 3.3. The chain of commands starts with the situation assessment, which provides an evaluation of the tactical environment to the radar manager. The radar manager evaluates the threat accordingly, and enters the appropriate radar task into the planning queue for scheduling. The radar task consists of a sequence of macro radar commands, and the commands can be repeated or preempted in the planning queue by the command scheduler. The commands that are fixed for execution are passed to the radar controller, where they will be mapped to the appropriate radar words and retrieved by the radar for execution.
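The task-to-command and command-to-word mappings described above can be pictured as simple lookup tables. The templates below are invented for illustration; only the task and command names echo the examples in the text, and real MFR templates are mission dependent.

```python
# Hypothetical command templates (illustrative only); 'a' and 'b' are the
# abstract radar words used throughout this chapter.
RADAR_TASKS = {
    "target_identification": ["alert", "nonadaptive_track", "range_resolution"],
    "search": ["search"],
}
RADAR_COMMANDS = {
    "alert": ["a"],
    "nonadaptive_track": ["b", "b"],
    "range_resolution": ["a", "b"],
    "search": ["a", "a"],
}

def emit(task):
    """Expand a radar task into the sequence of radar words the radar
    controller would schedule for execution."""
    return [w for cmd in RADAR_TASKS[task] for w in RADAR_COMMANDS[cmd]]
```

The symbolic inference engine works in the opposite direction: given the observed word sequence, it infers which task (and hence which threat level) most likely produced it.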
In the rest of this section, we will discuss the operational details of each of the MFR components, and their relationship to the macro/micro architecture. More specifically, the macro sensor management, as described in [6], requires the MFR to have three basic components: an operating scheme, a performance standard, and an adaptation procedure; the micro sensor management requires the MFR to be able to select the combination of radar pulses that best accomplishes the performance requested by the macro tasks given the system status. We will describe how each of these requirements is satisfied by the MFR components.

2 The system architecture does not include multiple target tracking functionalities such as data association. The paper focuses on a single target's self protection and threat estimation, and thus models only the radar signal that a single target can observe.

Figure 3.3: The figure illustrates the MFR system architecture. The situation assessment provides the evaluation of the tactical environment to the radar manager. The radar manager, based on the evaluation, selects a radar task on which the command scheduler/radar controller will operate. The command scheduler plans and preempts the tasks in the planning queue depending on the radar load, and then moves the tasks fixed for execution to the command queue. The radar controller maps the tasks in the command queue to appropriate radar commands, which are retrieved by the radar for final execution.

The macro management is accomplished by the radar manager and the command scheduler. The radar manager sets the operating scheme and the performance standard for the MFR. It is a finite state machine that transitions among a set of tasks, with the transition probabilities determined by the radar mode.
It sets the guidance according to which radar commands are created by mapping each radar task to a template of radar commands. The mapping can be mission dependent, and such dependency models the performance standard. For example, the radar task "Target identification for an existing track", depending on the performance standard, may be mapped to a template of radar commands such as {Alert, Non-adaptive track, Range resolution 1} or {Alert, Non-adaptive track, Range resolution 2}, where Range resolution 1 and 2 differ in the carrier frequency and the radar waveforms used. The command scheduler models the adaptation procedure, and the adaptation is modeled by the scheduler's ability to plan and preempt radar commands in the planning queue. The command scheduler processes the radar commands stored in the planning queue sequentially; it plans, if the current command requests it, by appending radar commands to the planning queue, and preempts by inserting commands in front of the current command. The planning and preempting are carried out according to a set of rules to be specified.

The micro sensor management, on the other hand, is accomplished by the radar controller. Similar to the command scheduler, the radar controller processes the radar commands in the command queue sequentially and maps the radar commands to radar words according to a set of control rules. Each radar command may be mapped to a multitude of different radar words depending on the tactical environment, and the mapping will be specified explicitly later in terms of the grammar's productions in Sec. 3.2.

As a remark, the control is separated into the command scheduler and the radar controller because the MFR needs to be both adaptive and fast [7]. The command scheduler orders radar commands by time and priority, and stores them in the planning queue because the planning queue allows real time rescheduling.
On the other hand, due to the system's finite response time, radar commands in the planning queue are retrieved sequentially and placed in the command queue, where no further planning or adaptation is allowed. The radar controller maps the radar commands in the command queue to radar words, which are retrieved by the radar for execution.

3.2 A Syntactic Representation of MFR Domain Knowledge

In terms of natural language processing, we model the MFR as a system that "speaks" according to a stochastic grammar; more specifically, we place the domain knowledge discussed in the previous section in a compact mathematical formalism called the stochastic context free grammar. In Sec. 3.2.1, an overview of formal language theory is provided. In Sec. 3.2.2, the radar manager, the command scheduler and the radar controller are modeled, and the details of the Markov modulated SCFG are provided. In Sec. 3.2.3, a well posedness issue of the grammatical model is discussed.

3.2.1 Formal Languages and Transformational Grammars

A formal language can be broadly defined as any set of strings consisting of concatenations of symbols. The complete set of distinguishable symbols in the language is known as the alphabet and is denoted here by T. For example, an alphabet might be T = {a, b}, and one language over this alphabet might consist of all finite (or null) repetitions of the combination 'ab' followed by either 'b' or 'aa'; in this language, the strings 'b', 'aa', 'ababaa' and 'ababb' are valid strings but 'aba' is not. The general notion of a formal language is impractically broad. It is much more useful, and intuitive, to specify a language in terms of its structural patterns. This is often accomplished by defining a grammar [11; 12; 13], sometimes known in the literature as a transformational grammar. In grammatical terminology, a grammar is a four-tuple < N, T, P, S >, where N is a finite set of nonterminal symbols, T is a finite set of terminal symbols, and N ∩ T = ∅.
P is a finite set of production rules, and S ∈ N is the starting symbol. Grammars are divided into four different types according to the forms of their production rules [11; 16]. Specifically, a context free grammar has production rules P of the form A → η where A ∈ N and η ∈ (N ∪ T)⁺; the superscript ⁺ indicates the set of all finite length strings of symbols in a finite set of symbols, excluding the string of length 0. The rule A → η indicates the replacement of the nonterminal A by η. In addition, as shown in [13], any context free grammar may be reduced to Chomsky Normal Form, which has production rules of the form Ai → Aj Ak and Ai → w, where Ai, Aj, Ak ∈ N and w ∈ T. An example of a context free grammar in Chomsky Normal Form consists of the following elements:

T = {a, b},  N = {A0, A1},  S = {A0},  P = {A0 → A0 A1 | b,  A1 → a},

where the bar | separates the two production rules, meaning that the nonterminal A0 may be mapped to either A0 A1 or b. Starting from the nonterminal A0, strings can be derived by applying production rules to iteratively replace nonterminal symbols with substrings. The preceding example admits the following derivations:

A0 ⇒ b
A0 ⇒ A0 A1 ⇒ b A1 ⇒ ba

As a shorthand notation, the multiple derivation steps in the last derivation above may also be expressed as A0 ⇒* ba. Furthermore, note that the notation → is used to express production rules, and ⇒ is used to represent derivation or replacement of nonterminals in a string. In addition, as is often the case, a certain amount of uncertainty exists in the process under study. In order to make the model more robust, and also to capture the random effects in the model, probabilities are added to the set of production rules P. A stochastic context free grammar is a four-tuple < N, T, Ps, S > with all elements identical to the context free grammar except that Ps is a finite set of stochastic production rules.
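The derivations above can be reproduced mechanically. The following is a minimal sketch, assuming the example grammar T = {a, b}, N = {A0, A1}, P = {A0 → A0 A1 | b, A1 → a}; the dictionary layout and the `derive` helper are our own illustrative choices, not notation from the thesis.

```python
# Leftmost derivation for the Chomsky-Normal-Form example grammar.
RULES = {
    "A0": [["A0", "A1"], ["b"]],   # A0 -> A0 A1 | b
    "A1": [["a"]],                 # A1 -> a
}
TERMINALS = {"a", "b"}

def derive(sentential, choices):
    """Apply one production per entry of `choices` to the leftmost nonterminal."""
    for choice in choices:
        for i, sym in enumerate(sentential):
            if sym not in TERMINALS:                     # leftmost nonterminal
                sentential = sentential[:i] + RULES[sym][choice] + sentential[i + 1:]
                break
    return sentential

# A0 => A0 A1 => b A1 => b a
print(derive(["A0"], [0, 1, 0]))   # ['b', 'a']
```

Choosing rule indices (0, 1, 0) reproduces the derivation A0 ⇒ A0 A1 ⇒ b A1 ⇒ ba from the text.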
Let A be a nonterminal in N. The probability of its production rule A → η in Ps is denoted as P(A → η), and the probabilities must satisfy

∑_{η ∈ Θ} P(A → η) = 1,

where Θ is the set of all right hand sides for A in Ps. For example, the grammar given above may be converted into a stochastic one by assigning the following probabilities to the production rules:

A0 → A0 A1 [0.8],  A0 → b [0.2],  A1 → a A1 [0.1],  A1 → a [0.9].

A Simple Example of MFR and Inadequacy of HMM

As compared to conventional radars, MFRs are distinguished by their ability to switch between radar tasks and plan ahead their courses of action [7]. As an illustrative example showing the correspondence between the grammar and the MFR, consider production rules of the form i) B → b B and ii) B → A B | B C, where A, B and C are nonterminals representing radar commands in the planning queue and b is a radar command in the command queue. The rule B → b B is interpreted as directing the command scheduler to append b to the command queue and B to the planning queue. Similarly, B → A B is interpreted as delaying the execution of B in the planning queue and inserting A in front.

Figure 3.4: The figure illustrates a possible realization of the scheduling process represented by a grammatical derivation process. B and C are nonterminals and b is a terminal. The triangle represents the mapping of the radar command b to the radar words, y and w, by the radar controller.

Suppose the planning queue contains the radar command B; a possible generation of the radar words is illustrated in Fig. 3.4. (The figure also illustrates the mapping of the radar commands to some radar words by the radar controller.) It can be seen that as long as the command queue commands appear only to the left of planning queue commands in the rule, the command queue and the planning queue are well represented.
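The stochastic example grammar above can be checked and sampled directly. The sketch below uses the probabilities from the text (0.8/0.2 for A0, 0.1/0.9 for A1); the data layout and the `sample` helper are illustrative choices of ours.

```python
import random

# Stochastic version of the example grammar:
# A0 -> A0 A1 [0.8] | b [0.2],  A1 -> a A1 [0.1] | a [0.9].
SRULES = {
    "A0": [(["A0", "A1"], 0.8), (["b"], 0.2)],
    "A1": [(["a", "A1"], 0.1), (["a"], 0.9)],
}

# Rule probabilities for each nonterminal must sum to one.
for lhs, alts in SRULES.items():
    assert abs(sum(p for _, p in alts) - 1.0) < 1e-12

def sample(symbol, rng):
    """Sample a terminal string by recursively expanding nonterminals."""
    if symbol not in SRULES:
        return [symbol]
    alternatives, probs = zip(*SRULES[symbol])
    rhs = rng.choices(alternatives, weights=probs)[0]
    return [t for s in rhs for t in sample(s, rng)]

print("".join(sample("A0", random.Random(0))))
```

Every string this grammar generates starts with 'b' followed by a run of 'a's, which mirrors the leftmost derivations shown earlier.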
In addition to the interpretation of the production rules, another important property is their generative power, and why a more established method such as the hidden Markov model cannot be used. As shown in [25], rules of the form i) have the syntax of a regular grammar and can be used to represent hidden Markov models, i.e., stochastic regular grammar. Rules of the form ii), on the other hand, have the syntax of a context free grammar. In other words, the MFR grammar has rules that strictly contain regular grammar (rules of the form ii) cannot be reduced to i)), and thus the MFR grammar cannot be sufficiently modeled by an HMM. The production rules presented in this example form a self-embedding context free grammar, which cannot be represented by a Markov chain [13]. A context-free grammar is self-embedding if there exists a nonterminal A such that A ⇒* η A β with η, β ∈ (N ∪ T)⁺. For the rules presented, the self-embedding property can be shown by the simple derivation B ⇒ A B ⇒ A B C. In addition to the self-embedding property, an HMM is not suitable because the radar controller may generate variable length radar words. If an HMM is to model the radar words, the Markovian dependency may be of variable length. In this case, the maximum length dependency needs to be used to define the state space, and the exponentially growing state space might be an issue. Furthermore, for sources with hidden branching processes (MFRs), stochastic context free grammar is shown to be more efficient than HMM in the sense that the estimated SCFG has lower entropies [23].

3.2.2 A Syntactic Model for an MFR called Mercury

In this subsection, because the MFR domain knowledge is application dependent, the grammatical representation is discussed, for illustrative purposes, based on a particular type of MFR called Mercury. (The declassified version of Mercury's textual intelligence report can be found in [41].)
The output of the MFR is modeled by a set of terminals, and the hierarchical command generation mechanism is modeled by a set of production rules that map the top level radar tasks to radar commands, and radar commands to radar words. The MFR grammar is {Nr ∪ Np ∪ Nc, Tc, Pp ∪ Pc, S}. Nr is the set of radar tasks. Np and Nc are identical sets of radar commands available to the MFR, and they are differentiated only by their residing queues; Np are the commands in the planning queue and Nc are those in the command queue. Pp is the set of production rules mapping Np to (Nc ∪ Np)⁺. Pc is the set of production rules mapping Nc to Tc⁺, where Tc is the set of radar words. In an SCFG, S is the starting symbol; in our formulation, however, S is a Markov chain with state space defined by Nr. The output of the Markov chain is in Np⁺ and it provides the starting symbols for Pp. Specific to Mercury, the set of radar words Tc consists of nine distinct elements {w1, ..., w9}. The set of available radar commands is {Three-word search, Four-word search, Acquisition, Non-adaptive track, three stages of Range resolution, Track maintenance, Fine track maintenance}, written in shorthand as {3WSt, 4WSt, At, NATt, RR1t, RR2t, RR3t, TMt, FTMt}, where t = p or c denotes membership of Np or Nc respectively. Table 3.1 lists the radar commands and their corresponding radar words. Mercury's grammar will be introduced according to the framework depicted in Fig. 3.3: the radar manager is modeled as a Markov chain whose state space is Nr, the command scheduler is represented by the (self-embedding) production rules Pp, and the radar controller, introduced along with the effects of the stochastic channel, is modeled by the production rules Pc. We will describe each MFR component in detail.

3.2.2.1 Radar Manager

The radar manager, for each time period, determines the overall task or tactical goal the MFR is to accomplish.
The time evolution of the radar manager is modeled as a Markov chain, and its state space, Nr = {Search for new targets, Target identification for existing tracks, Track update for existing tracks}, is defined based on the major radar task categories [6]. Let k = 0, 1, ... denote discrete time. The state of the MFR, xk ∈ Nr, is a three state discrete time Markov chain. The output of each state is defined by templates of radar commands that specify the type and the order of the radar commands the MFR is to complete in order to accomplish the tactical goal.

Command                                    Radar words
Four-word search                           [w1 w2 w4 w5] [w2 w4 w5 w1] [w4 w5 w1 w2] [w5 w1 w2 w4]
Three-word search                          [w1 w3 w5 w1] [w3 w5 w1 w3] [w5 w1 w3 w5]
Acquisition (ACQ)                          [w1 w1 w1 w1] [w2 w2 w2 w2] [w3 w3 w3 w3] [w4 w4 w4 w4] [w5 w5 w5 w5]
Non-adaptive track (NAT) or
Track maintenance (TM)                     [w1 w6 w6 w6] [w2 w6 w6 w6] [w3 w6 w6 w6] [w4 w6 w6 w6] [w5 w6 w6 w6]
Range resolution                           [w7 w6 w6 w6] [w8 w6 w6 w6] [w9 w6 w6 w6]
ACQ, NAT or FTM                            [w6 w6 w6 w6]
Track maintenance (TM)                     [w1 w7 w7 w7] [w2 w7 w7 w7] [w3 w7 w7 w7] [w4 w7 w7 w7] [w5 w7 w7 w7] [w6 w7 w7 w7]
                                           [w1 w8 w8 w8] [w2 w8 w8 w8] [w3 w8 w8 w8] [w4 w8 w8 w8] [w5 w8 w8 w8] [w6 w8 w8 w8]
                                           [w1 w9 w9 w9] [w2 w9 w9 w9] [w3 w9 w9 w9] [w4 w9 w9 w9] [w5 w9 w9 w9] [w6 w9 w9 w9]
Fine track maintenance (FTM)               [w7 w7 w7 w7] [w8 w8 w8 w8] [w9 w9 w9 w9]

Table 3.1: List of MERCURY radar commands and their corresponding radar words

The templates for the states are expressed in the production rules listed below:

Search for new targets → 3WSp | 4WSp
Target identification for existing tracks → Ap NATp RR1p
Track update for existing tracks → TMp

Each state may output multiple templates, separated by bars. Different templates are characterized by their computational cost and accuracy, and their selection is modeled probabilistically.
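The radar manager just described, a Markov chain over Nr whose states emit command templates, can be sketched as follows. The transition matrix A_EX and the template-selection probabilities are illustrative numbers of ours, not parameters from the thesis.

```python
import random

# Three-state radar manager emitting command templates per state.
STATES = ["Search", "TargetID", "TrackUpdate"]
A_EX = [[0.7, 0.2, 0.1],        # illustrative transition matrix
        [0.3, 0.4, 0.3],
        [0.1, 0.3, 0.6]]
TEMPLATES = {
    "Search":      [(["3WSp"], 0.5), (["4WSp"], 0.5)],
    "TargetID":    [(["Ap", "NATp", "RR1p"], 1.0)],
    "TrackUpdate": [(["TMp"], 1.0)],
}

def run_manager(n, rng, state=0):
    """Simulate n time periods; return the emitted command templates."""
    out = []
    for _ in range(n):
        temps, probs = zip(*TEMPLATES[STATES[state]])
        out.append(rng.choices(temps, weights=probs)[0])
        state = rng.choices(range(3), weights=A_EX[state])[0]
    return out

print(run_manager(5, random.Random(0)))
```

Each emitted template is then handed to the command scheduler, which expands it via the production rules Pp.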
Define the transition probability matrix as A = [a_ji]_{3×3}, where a_ji = P(xk = ei | xk−1 = ej), and ei and ej are MFR states in Nr. The transition of the MFR is assumed to be driven by the interaction between the MFR and targets. For example, if the target is far away from the MFR and flies with constant velocity, the probability of the MFR jumping to "Track update for existing tracks" might be low. On the other hand, when the target is close and shows high maneuverability, the probability of being tracked might be higher because the MFR would allocate more resources to it. In order to characterize the interaction between the MFR and a target, the target behaviour pattern is described first. A target state process is ψk = (zk, sk), where zk refers to its kinematics and sk is a staircase-type trajectory indicating its motion model, such as the constant velocity model [24]. Here, zk ∈ R denotes the distance of the target with respect to the MFR, and sk = (s1k, s2k) is an indicator vector featuring the motion model in which the target is maneuvering. The dependency between the MFR and targets is established by parameterizing the transition matrix A with (zk, sk). Table 3.2 lists the values of sk and their corresponding motion models.

Type of Motion Model                    s1   s2
Constant velocity model                 0    0
Time correlated acceleration model      1    0
Horizontal turn model                   0    1

Table 3.2: List of the target's motion models

Figure 3.5: MFR states and transition probabilities.

The listed representative motion models are used in [5] to study the benchmark tracking problem. The first model, the constant velocity model, characterizes the periods of non-maneuverability, and it is described in [8]. The other two models account for target maneuvers. The time correlated acceleration model was first proposed in [35] and the horizontal turn model is described in [18].
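The transition probabilities are parameterized by the target state ψk through a logit model, whose equations are given next. As an illustrative numerical sketch, with made-up regressor vectors a, b and target state ψk (none of these values are from the thesis):

```python
import math

def logit_probs(a, b, psi):
    """Return (P_up, P_down, P_stay) from the logit parameterization
    P_up = e^{a'psi} / (1 + e^{a'psi} + e^{b'psi}), etc."""
    ea = math.exp(sum(ai * pi for ai, pi in zip(a, psi)))
    eb = math.exp(sum(bi * pi for bi, pi in zip(b, psi)))
    z = 1.0 + ea + eb
    return ea / z, eb / z, 1.0 / z

a = [0.5, 1.0, 0.8]      # illustrative regressor parameters
b = [-0.5, -1.0, -0.2]
psi = [0.3, 1.0, 0.0]    # (z_k, s_k^1, s_k^2): maneuvering target
p_up, p_down, p_stay = logit_probs(a, b, psi)
print(p_up, p_down, p_stay)
```

By construction the three probabilities are positive and sum to one, so each row of the resulting transition matrix is a valid distribution.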
Because of its generality and utility interpretation, the logit model is selected to parameterize the transition matrix. Let Pup (Pdown) be the probability of the MFR system moving up (down) a state and Pstay the probability of the MFR system remaining in the current state. The probabilities are illustrated in Fig. 3.5 and are given by

Pup = exp(a′ψk) / (1 + exp(a′ψk) + exp(b′ψk))
Pdown = exp(b′ψk) / (1 + exp(a′ψk) + exp(b′ψk))
Pstay = 1 / (1 + exp(a′ψk) + exp(b′ψk)),

where a and b are vectors of regressor parameters. The justification of the logit model is given in Appendix B.

3WSp → 3WSc 3WSp | 3WSc
4WSp → 4WSc 4WSp | 4WSc
Ap → Ac Ap | Ac
NATp → NATc NATp | NATc
RR1p → RR1c RR1p | RR1c RR2p | RR1c
RR2p → RR2c RR2p | RR2c RR1p | RR2c RR3p | RR2c
RR3p → RR3c RR3p | RR3c RR2p | RR2p RR3p | RR3c
TMp → TMc TMp | TMc FTMp | TMp FTMp | FTMp TMp | TMc
FTMp → FTMc FTMp | FTMc TMp | TMp FTMp | FTMc

Table 3.3: Production rules of Mercury's command scheduler.

3.2.2.2 Command Scheduler

The command scheduler models the MFR's ability to plan and to preempt radar commands based on the radar task and the dynamic tactical environment. With the template of radar commands in place, the main operation of the command scheduler is to implement the scheduling of radar commands in the command queue and/or the rescheduling of commands in the planning queue. The operational rules for the scheduling and rescheduling can be constructed from a small set of basic rules. Suppose Np = {A, B, C} and Nc = {a, b, c}; the basic control rules available to the command scheduler are listed below.

Markov       B → b B | b C
Adaptive     B → A B | B C
Terminating  B → b

The interpretation of the rules follows the example given at the end of the previous subsection. A rule is Markov if it sends a radar command to the command queue and reschedules either the same or a different radar command in the planning queue.
A rule is Adaptive if it either preempts a radar command in favour of another radar command, or schedules a radar command ahead of time in the radar's timeline after the current command. A rule is Terminating if it sends a radar command to the command queue without scheduling any new commands. The significance of the Markov rule is obvious: it represents the completion of one radar command and the scheduling of another. The two adaptive rules model the MFR's ability to i) preempt and ii) plan the radar commands. The preempt rule is B → A B, where the command B is preempted when a higher priority task A enters the queue. On the other hand, the plan rule is B → B C, where the command C is scheduled ahead of time. The terminating rule reflects the fact that the queues have finite length, and the grammatical derivation process must terminate and yield a terminal string of finite length. Applying the basic control rules to the templates, the production rules Pp can be constructed. With some constraints in place, the complete set of rules is listed in Table 3.3.

3.2.2.3 Radar Controller and the Stochastic Channel

The radar commands are mapped to radar words by the radar controller, and the words can be corrupted by the stochastic channel before they are intercepted. In this part of the subsection, the production rules of the radar controller are devised, and the effect of the stochastic channel is incorporated. The production rules of the radar controller are derived from visual inspection of the radar commands listed in Table 3.1. The syntactic structure of the radar commands is captured by defining the nonterminals and their corresponding production rules.
We begin by defining the triplets:

T6 → w6 w6 w6    T7 → w7 w7 w7    T8 → w8 w8 w8    T9 → w9 w9 w9

and blocks of four words:

Q1 → w1 w1 w1 w1    Q4 → w4 w4 w4 w4    Q7 → w7 w7 w7 w7
Q2 → w2 w2 w2 w2    Q5 → w5 w5 w5 w5    Q8 → w8 w8 w8 w8
Q3 → w3 w3 w3 w3    Q6 → w6 w6 w6 w6    Q9 → w9 w9 w9 w9

Furthermore, we introduce two new nonterminals:

S1 → w1 | w2 | w3 | w4 | w5    S2 → S1 | w6

The nonterminals introduced specify the complete set of production rules for the radar controller. Based on the radar controller's production rules, the effects of the stochastic channel can be easily incorporated. For each radar word wi, define a new nonterminal Wi and the production rule

Wi → w1 | w2 | w3 | w4 | w5 | w6 | w7 | w8 | w9  [Pi]  for i = 1, ..., 9,

where Pi = [Pi1, Pi2, Pi3, Pi4, Pi5, Pi6, Pi7, Pi8, Pi9]ᵀ is a vector of probabilities indicating how likely Wi is to be corrupted and intercepted as each of the radar words. When compiled together, the complete set of production rules is specified; they are listed in Table 3.4. As will be illustrated in later sections, the probabilities of the production rules can be estimated from training data. In addition, since each wi is a pulse train, a pulse train analysis can be conducted to assign prior probabilities to the channel probabilities Wi [39].

4WS → W1 W2 W4 W5 | W2 W4 W5 W1 | W4 W5 W1 W2 | W5 W1 W2 W4
3WS → W1 W3 W5 W1 | W3 W5 W1 W3 | W5 W1 W3 W5
A → Q1 | Q2 | Q3 | Q4 | Q5 | Q6
NAT → S1 T6 | Q6
RR1 → W7 T6
RR2 → W8 T6
RR3 → W9 T6
TM → S1 T6 | S2 T7 | S2 T8 | S2 T9
FTM → Q6 | Q7 | Q8 | Q9
S2 → S1 | W6
S1 → W1 | W2 | W3 | W4 | W5
T6 → W6 W6 W6
T7 → W7 W7 W7
T8 → W8 W8 W8
T9 → W9 W9 W9
Qi → Wi Wi Wi Wi
Wi → w1 | w2 | w3 | w4 | w5 | w6 | w7 | w8 | w9  [Pi], for i = 1, ..., 9

Table 3.4: Production rules of Mercury's radar controller.

3.2.3 Well Posedness of the Model

One practical issue of modeling with SCFG is that the signal generated by radar systems has finite length, and this finiteness constraint must be satisfied if the model is to be stable.
In addition, the finiteness criterion provides a constraint on the SCFG model parameters, which may be used as a bound on the parameter values. We discuss this point by first defining the stochastic mean matrix.

Definition. Let A, B ∈ N. The stochastic mean matrix MN is a |N| × |N| square matrix with its (A, B)th entry being the expected number of variables B resulting from rewriting A:

MN(A, B) = ∑_{η ∈ (N∪T)* s.t. (A→η) ∈ P} P(A → η) n(B; η),

where P(A → η) is the probability of applying the production rule A → η, and n(B; η) is the number of instances of B in η [10]. The finiteness constraint is satisfied if the grammar in each state satisfies the following theorem.

Theorem. If the spectral radius of MN is less than one, the generation process of the stochastic context free grammar terminates, and the derived sentence is finite.

Proof. The proof can be found in [10].

Figure 3.6: A string of radar words is intercepted, and the signal interpretation problem is, based on the domain specific knowledge of the MFR's control hierarchy, to infer the tasks the MFR is performing from the radar words. Task 1 is searching for new targets, task 2 is target identification for existing tracks, and task 3 is track maintenance for existing tracks.

3.3 Statistical Signal Interpretation of the MFR Signal and Control

Given the MFR knowledge representation discussed in the previous section, we are now in a position to describe the symbolic inference engine. (Recall the ES framework in Fig. 3.1.)
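The finiteness condition of Sec. 3.2.3 can be checked numerically. The sketch below builds the stochastic mean matrix for the stochastic example grammar A0 → A0 A1 [0.8] | b [0.2], A1 → a A1 [0.1] | a [0.9]; `mean_matrix` and the pure-Python power iteration are our own helpers.

```python
# M_N(A, B) = sum over rules A -> eta of P(A -> eta) * n(B; eta).
SRULES = {
    "A0": [(["A0", "A1"], 0.8), (["b"], 0.2)],
    "A1": [(["a", "A1"], 0.1), (["a"], 0.9)],
}
NONTERMS = ["A0", "A1"]

def mean_matrix(rules, nonterms):
    return [[sum(p * rhs.count(B) for rhs, p in rules[A]) for B in nonterms]
            for A in nonterms]

def spectral_radius(M, iters=200):
    """Power iteration (max-norm) for a nonnegative square matrix."""
    n = len(M)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(abs(x) for x in w)
        if lam == 0.0:
            return 0.0
        v = [x / lam for x in w]
    return lam

M = mean_matrix(SRULES, NONTERMS)        # [[0.8, 0.8], [0.0, 0.1]]
print(spectral_radius(M))                # 0.8 < 1: derivations terminate w.p. 1
```

Since the spectral radius is 0.8 < 1, the theorem guarantees that derivations from this grammar terminate with probability one.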
The input to the engine is a batch of noisy radar words stored in a track file, and the aim is to extract the embedded syntactic pattern that is described by the domain specific knowledge. Fig. 3.6 illustrates the inference problem we are to solve. In general, with such an assumption, any pattern recognition technique is automatically a signal interpretation technique. Specific to our case, because the knowledge is stored as a Markov modulated SCFG, a hybrid of the inside-outside and the forward-backward algorithms will be used. In this section, we describe the state estimation algorithm under the assumption of complete system knowledge (known parameter values) in Sec. 3.3.1, and the application of the EM algorithm to estimate the system parameters in Sec. 3.3.2. In Sec. 3.3.3, we extend the estimation algorithm to the control of the target's maneuvering models.

Notation: The following notation is used throughout the section. Let x0:n = (x0, x1, ..., xn) be the (unknown) state sequence, where xk ∈ Nr (see Sec. 3.2.2.1), and let γ1:n = (γ1, γ2, ..., γn) be the intercepted radar commands. Each γk = (w1, w2, ..., wmk) is a string of concatenated terminal symbols (radar words), and mk is the length of γk. It is convenient to introduce the following variables:

• Forward variable: fi(k) = P(γ1, γ2, ..., γk, xk = ei)
• Backward variable: bi(k) = P(γk+1, γk+2, ..., γn | xk = ei)
• Inside variable: β_j^i(k, p, q) = P(w_pq | A_pq^j, xk = ei)
• Outside variable: α_j^i(k, p, q) = P(w_1(p−1), A_pq^j, w_(q+1)m | xk = ei)

where w_pq is the subsequence of terminals from the pth position of γk to the qth position, and A_pq^j is the nonterminal Aj ∈ Np that derives w_pq, i.e., Aj ⇒* w_pq. Fig. 3.7 illustrates the inside and outside probabilities.

Figure 3.7: Inside and outside probabilities in SCFG.
(Details of the forward and backward algorithms can be found in [28], and of the inside and outside algorithms in [23].)

3.3.1 Maximum Likelihood Estimation of MFR's State via Viterbi and Inside Algorithms

The estimator of the MFR's state at time k is x̂k = arg max_i P(xk = ei | γ1:n), which can be computed using the Viterbi algorithm. Define δi(k) = max_{x0, x1, ..., xk−1} P(x0, x1, ..., xk = ei, γ1, γ2, ..., γk). The Viterbi algorithm computes the best state sequence inductively as follows:

1. Initialization: δi(1) = πi oi(γ1), for 1 ≤ i ≤ M.
2. Induction:
   δi(k + 1) = [max_{1≤j≤M} δj(k) aji] oi(γk+1), for 1 ≤ k ≤ n − 1, 1 ≤ i ≤ M
   τi(k + 1) = arg max_{1≤j≤M} δj(k) aji, for 1 ≤ k ≤ n − 1, 1 ≤ i ≤ M.
3. Termination: x̂n = arg max_{1≤j≤M} δj(n).
4. Path backtracking: x̂k = τ_{k+1}(x̂_{k+1}), for k = n − 1, n − 2, ..., 1,

where oi(γk) is the output probability of the string γk generated by the grammar Gi. An efficient way to calculate this probability is the inside algorithm, a dynamic programming algorithm that computes oi(γk) inductively as follows:

1. Initialization: β_j^i(k, p, p) = P(Aj → wp | Gi).
2. Induction:
   β_j^i(k, p, q) = ∑_{r,s} ∑_{d=p}^{q−1} P(Aj → Ar As) β_r^i(k, p, d) β_s^i(k, d + 1, q), for all j, 1 ≤ p < q ≤ mk.
3. Termination: oi(γk) = β_1^i(k, 1, mk).

Running both the Viterbi and the inside algorithms, the posterior distribution of the states given the observations can be computed.

3.3.2 Model Parameter Estimation using the EM Algorithm

In Sec. 3.3.1, the MFR's state estimation problem was discussed assuming complete knowledge of the system parameters, i.e., the Markov chain's transition matrix and the SCFG's production rule probabilities. In reality, such parameters are often unknown. In this subsection, the EM algorithm, discussed in detail in [15], is applied for parameter estimation.
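The inside algorithm of Sec. 3.3.1 can be sketched in a few lines for a grammar in Chomsky Normal Form. The sketch below uses 0-indexed, inclusive spans (the presentation above is 1-indexed) and the running example grammar A0 → A0 A1 [0.8] | b [0.2], A1 → a [1.0]; the rule tables and function name are illustrative choices of ours.

```python
from collections import defaultdict

BIN = {("A0", "A0", "A1"): 0.8}                   # binary rules A -> B C
UNARY = {("A0", "b"): 0.2, ("A1", "a"): 1.0}      # lexical rules A -> w

def inside(words):
    """beta[(A, p, q)] = P(A =>* words[p..q]), spans 0-indexed inclusive."""
    n = len(words)
    beta = defaultdict(float)
    for p, w in enumerate(words):                  # initialization
        for (A, t), prob in UNARY.items():
            if t == w:
                beta[(A, p, p)] += prob
    for span in range(2, n + 1):                   # induction over span length
        for p in range(n - span + 1):
            q = p + span - 1
            for (A, B, C), prob in BIN.items():
                for d in range(p, q):              # split point
                    beta[(A, p, q)] += prob * beta[(B, p, d)] * beta[(C, d + 1, q)]
    return beta

beta = inside(["b", "a"])
print(beta[("A0", 0, 1)])   # 0.8 * 0.2 * 1.0 = 0.16
```

The single derivation A0 ⇒ A0 A1 ⇒ b A1 ⇒ ba has probability 0.8 × 0.2 × 1.0 = 0.16, which the table entry β(A0, 0, 1) reproduces.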
Let γ1:n be the incomplete data, and let {x0:n, C1:n} be the missing (or hidden) data. For a Markov chain with M states, Ck = (C1(A → η; γk), C2(A → η; γk), ..., CM(A → η; γk)), where Ci(A → η; γk) is the number of times the production rule A → η is applied in deriving γk with grammar i. Let Φ = {aji, P1(A → η), ..., PM(A → η)} be the model parameters, where Pi(A → η) is the set of production rule probabilities for grammar i. The EM algorithm iteratively computes the maximum likelihood parameter estimates via

Φ^(i+1) = arg max_Φ E_{Φ^(i)} {log Ln(Φ) | γ1:n},

where the complete-data likelihood Ln(Φ) is ∏_{k=1}^{n} P(γk, Ck | xk, Φ) P(xk | xk−1, Φ) P(x0 | Φ). In order to facilitate the discussion of the EM algorithm, the following two variables are introduced:

χi(k) = P(xk = ei | γ1:n) = fi(k) bi(k) / ∑_{i=1}^{3} fi(k) bi(k)

and

ξji(k) = P(xk = ej, xk+1 = ei | γ1:n) = fj(k) aji oi(γk+1) bi(k+1) / ∑_{j=1}^{3} ∑_{i=1}^{3} fj(k) aji oi(γk+1) bi(k+1).

The Expectation step of the EM algorithm yields the following equation:

E_{Φ^(i)}(log Ln(Φ)) = ∑_{k=1}^{n} ∑_{xk} ∑_{(A→η)} E_{Φ^(i)}(C_{xk}(A → η; γk)) log P_{xk}(A → η) χ_{xk}(k)
  + ∑_{k=1}^{n} ∑_{xk} ∑_{xk−1} log(a_{xk−1 xk}) ξ_{xk−1 xk}(k − 1) + ∑_{k=1}^{n} ∑_{x0} log P(x0) χ_{x0}(k),

where the innermost sum of the first term runs over the production rules of grammar xk, and E_{Φ^(i)}(C_{xk}(A → η; γk)) can be computed using the inside and outside variables [25]. The Maximization step of the EM algorithm can be computed by applying Lagrange multipliers. Since the parameters we wish to optimize are separated into three independent terms in the sum (the three terms correspond to the estimates of the production rule probabilities, the transition matrix, and the prior distribution), we can optimize the parameters term by term. The estimates of the probabilities of the production rules are derived from the first term, and the updating equation is

P_{xk}(A → η) = ∑_{k=1}^{n} E_{Φ^(i)}(C_{xk}(A → η; γk)) χ_{xk}(k) / ∑_{η} ∑_{k=1}^{n} E_{Φ^(i)}(C_{xk}(A → η; γk)) χ_{xk}(k).

Similarly, the updating equation for the transition matrix aji is

aji = ∑_{k=1}^{n−1} ξji(k) / ∑_{k=1}^{n−1} χj(k).

Under the conditions in [45], iterative computation of the expectation and maximization steps above produces a sequence of parameter estimates with monotonically nondecreasing likelihood.

3.3.3 Optimization of Target-MFR Interaction Dynamics

Based on the interpretation of the radar signal and the interaction dynamics between the MFR and the target, autonomous control of the aircraft's maneuvering model is devised in this subsection. Recall the target-MFR interaction discussed in Sec. 3.2, where each maneuvering model triggers a particular radar mode, and the mode is characterized by the transition probabilities of the radar tasks. With this assumption, the maneuvering model selection is formulated as an optimization problem of finding an efficient adaptive search (sampling) plan with the objective of staying in the "safest" mode most often; the problem setup is illustrated in Fig. 3.8.

Figure 3.8: The selection of a maneuvering model induces a particular radar mode. The mode is observed indirectly from the intercepted radar pulses and its threat evaluated. Based on the evaluation, the control strategy selects maneuvering models such that ownship safety is maximized.

Let the discrete time l = 1, 2, ... index the sequence of maneuvering models selected by the aircraft. Let X[l, s] be the single performance measure, the MFR's average occupancy in track mode when the target is maneuvering in model s, which can be computed from the stationary distribution of the estimated Markov chain. The aim is to find s* such that

s* = arg min_{s ∈ S} E{X[l, s]},

where S is the set of all possible maneuvering models.
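The performance measure X[l, s] is the track-mode occupancy under the stationary distribution of the estimated task chain. A minimal sketch, with an illustrative estimated transition matrix A_HAT (not data from the thesis) and the stationary distribution computed by power iteration:

```python
A_HAT = [[0.7, 0.2, 0.1],        # illustrative estimated task transitions
         [0.3, 0.4, 0.3],
         [0.2, 0.3, 0.5]]
TRACK = 2                        # index of "Track update for existing tracks"

def stationary(A, iters=500):
    """Power iteration pi <- pi A for an ergodic finite chain."""
    n = len(A)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[j] * A[j][i] for j in range(n)) for i in range(n)]
    return pi

pi = stationary(A_HAT)
x = pi[TRACK]                    # track-mode occupancy for this maneuvering model
print(x)
```

A maneuvering model whose induced transition matrix yields a smaller track-mode occupancy is safer under this measure, which is what the search algorithms below try to find.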
The model selection is not straightforward because the performance of a maneuvering model cannot be evaluated analytically; it must be estimated or measured from the intercepted radar pulses. We treat this problem as a discrete stochastic approximation problem. The problem is also called the multi-armed bandit problem, where the aim is to find the best slot machine out of a finite number of such machines. Other approaches such as multiple comparisons also exist [19], but this approach is preferred because of its ability to adapt to slowly time-varying radar conditions. Two discrete stochastic approximation algorithms will be applied; their detailed description can be found in [22]. The target begins in an arbitrarily chosen motion model and probabilistically explores the model space. The idea is to implement an efficient adaptive sampling plan that allows one to find the optimizer with as few samples as possible by not making unnecessary observations at nonpromising models. The following notation is used in the algorithms. {s^(l)} ∈ S is a sequence of maneuvering models generated by the algorithm that can be thought of as the state of the algorithm at time l. It is convenient to map {s^(l)} to a sequence of unit vectors {Y[l]}, where Y[l] has a 1 in the jth component if s^(l) = s(j), and zeros elsewhere. In addition, let π[l] = (1/l)[W^(l)[s(1)], ..., W^(l)[s(|S|)]]ᵀ denote the empirical state occupation probability measure, where |·| gives the number of elements in the set and W^(l)[s] is a counter that measures the number of times the state sequence visits the state s. Finally, ŝ^(l) is the estimate of the optimal mode s* generated by the algorithm at time l. It is the main output of the algorithm and is used to control the aircraft's mode changes. The two algorithms are summarized here:

Aggressive Search
1. Initialization: At time l = 0, select initial state s^(0) ∈ S. Set π[0, s^(0)] = 1, π[0, s] = 0 for all s ∈ S, s ≠ s^(0). Set ŝ^(0) = s^(0).
2.
Sampling and Evaluation: Given the state s(l), compute X[l, s(l)]. Generate a candidate state s̃(l) from S − {s(l)} according to a uniformly distributed random variable. Compute X[l, s̃(l)].

3. Acceptance: If X[l, s̃(l)] < X[l, s(l)] (the candidate has the lower cost), then set s(l+1) = s̃(l); otherwise set s(l+1) = s(l).

4. Adaptive filter for updating state occupation probabilities: Update the state occupation probabilities as π[l+1] = π[l] + u[l+1](Y[l+1] − π[l]) with the decreasing step size u[l] = 1/l, where Y[l+1] is the unit vector with a 1 in the component corresponding to s(l+1).

5. Update estimate of optimal radar mode: If π[l+1, s(l+1)] > π[l+1, ŝ(l)], then set ŝ(l+1) = s(l+1); otherwise, set ŝ(l+1) = ŝ(l). Set l ← l + 1 and go to Step 2.

Conservative Search

1. Initialization: At time l = 0, initialize the |S|-dimensional vectors H[0] and L[0] to zero, and K̄[0] = 1 (the vector of ones). Select an initial state s(0) ∈ S.

2. Sampling and Evaluation: Given the state s(l), generate, as in Step 2 of the Aggressive Search, s̃(l), X[l, s(l)], and X[l, s̃(l)]. Update the accumulated costs, occupation times and average costs as

L[l+1, s̃(l)] = L[l, s̃(l)] + X[l, s̃(l)],    L[l+1, s(l)] = L[l, s(l)] + X[l, s(l)],
K̄[l+1, s̃(l)] = K̄[l, s̃(l)] + 1,    K̄[l+1, s(l)] = K̄[l, s(l)] + 1,
H[l+1, s̃(l)] = L[l+1, s̃(l)] / K̄[l+1, s̃(l)],    H[l+1, s(l)] = L[l+1, s(l)] / K̄[l+1, s(l)].

3. Acceptance: If H[l+1, s̃(l)] < H[l+1, s(l)], set s(l+1) = s̃(l); otherwise set s(l+1) = s(l).

4. Update estimate of optimal radar mode: Set ŝ(l) = s(l). Set l ← l + 1 and go to Step 2.

The aggressive search explores the model space S by jumping between the models as an irreducible Markov chain, and the state sequence itself does not converge. However, it is shown in [22] that ŝ(l) → s* almost surely, meaning the algorithm spends more time at the global optimizer s* than at any other state, and the estimate is consistent. The conservative search, on the other hand, converges almost surely to the globally optimal model.
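A minimal sketch of the aggressive search under the minimization objective stated above; the cost oracle `sampleCost`, which stands in for the noisy measurement X[l, s] obtained from the intercepted pulses, and the model count are illustrative placeholders:

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Illustrative sketch of the aggressive discrete stochastic approximation
// search. Lower sampled cost means a safer maneuvering model; the estimate
// returned is the most-visited model, which is the algorithm's main output.
int aggressiveSearch(int numModels, int iterations, double (*sampleCost)(int)) {
  std::vector<double> pi(numModels, 0.0);  // empirical occupation measure
  int s = 0;                               // current state s(l)
  int sHat = 0;                            // running estimate of s*
  pi[s] = 1.0;
  for (int l = 1; l <= iterations; ++l) {
    // Sampling: draw a uniform candidate different from the current state.
    int cand = std::rand() % (numModels - 1);
    if (cand >= s) ++cand;
    // Acceptance: move when the candidate's sampled cost is lower.
    if (sampleCost(cand) < sampleCost(s)) s = cand;
    // Adaptive filter: pi[l+1] = pi[l] + (1/l) * (Y[l+1] - pi[l]).
    double u = 1.0 / l;
    for (int j = 0; j < numModels; ++j)
      pi[j] += u * ((j == s ? 1.0 : 0.0) - pi[j]);
    // Estimate update: keep the most visited state.
    if (pi[s] > pi[sHat]) sHat = s;
  }
  return sHat;
}
```

A production implementation would prefer the `<random>` facilities over `std::rand`, and a constant step size u would make the search track slowly time-varying radar conditions, as noted in the text.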
The convergence analysis of the conservative search holds for maneuvering model sequences of any positive length, whereas the aggressive search requires a long sequence. In addition, one advantage of the aggressive search is that, if we keep the step size constant for both algorithms to make them adaptive to time-varying parameters, it is faster than the conservative search because it aggressively explores the state space. The numerical studies of the algorithms are discussed in the next section.

3.4 Numerical Studies of the Algorithms

A software testbed is implemented in C++ for MFR signal simulation and interpretation. In this section, the data structure used to implement the algorithms, and some numerical results, will be discussed.

3.4.1 Implementation of the Software

The grammatical derivation process requires recursive embedding of terminals, repeated readings of nonterminals and modification of the output string. In order to have efficient repeated memory access, the production rules and their probabilities are both stored in a map data structure indexed by nonterminals, with their right hand sides implemented as linked lists. In addition, the nonterminals and the terminals are stored as vectors, and the starting symbol as a string. With this set-up, the grammatical derivation can be implemented by repeatedly accessing and joining the linked lists of the production rules. In addition, because any context free grammar can be reduced to Chomsky Normal Form [13], the testbed is written to accept only grammars in Chomsky Normal Form.

3.4.2 Model Complexity and Its Modeling Power

Here we briefly describe several implementation issues of our testbed and the possible remedies. The major implementation issues of the testbed are with the inside-outside algorithm: the computational complexity of the algorithm and the number of local maxima in the likelihood function.
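A minimal sketch of this set-up (the names are illustrative, not the testbed's actual code): production rules live in a map indexed by nonterminal, right hand sides are linked lists, and a derivation step splices a chosen right hand side into the sentential form in place rather than copying the whole string:

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>
#include <vector>

// Illustrative grammar store: each nonterminal maps to a vector of
// (right hand side, probability) pairs; right hand sides are linked lists
// so that expansion is a cheap splice rather than a full copy.
using Rhs = std::list<std::string>;
using Rules = std::map<std::string, std::vector<std::pair<Rhs, double>>>;

// One leftmost derivation step: replace the first nonterminal found in the
// sentential form by the right hand side of its rule with index `choice`.
// Returns false when no nonterminal remains (the derivation has terminated).
bool deriveStep(std::list<std::string>& form, const Rules& rules, int choice) {
  for (auto it = form.begin(); it != form.end(); ++it) {
    auto r = rules.find(*it);
    if (r == rules.end()) continue;           // a terminal: skip it
    Rhs expansion = r->second[choice].first;  // copy of the chosen RHS
    it = form.erase(it);                      // remove the nonterminal...
    form.splice(it, expansion);               // ...and splice its expansion in
    return true;
  }
  return false;
}
```

In the testbed the rule index would be drawn according to the rule probabilities; with probability-1 rules, as below, the derivation is deterministic.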
Suppose the MMSCFG has M states, and the states are represented by a grammar with L nonterminals. Suppose further that the observation sequence has length n, and each observation has, on average, m̂_i radar words for 1 ≤ i ≤ M. The average case complexity of each iteration of the EM parameter estimation algorithm is O(nM m̂³L³), where m̂ = E{m̂_i}. (The complexity of the inside-outside algorithm for radar words of length m is O(m³L³) [25].) However, because the inside and outside algorithms can be run against the data independently, parallel computation is possible and the computation time can be reduced substantially. In order to deal with the local maxima problem, one approach is to pick the initial parameter values more cleverly with the pre-training method introduced in [23], where significant computational savings are reported and EM typically converges to the global maximum.

One important implementation detail regarding the modeling power of the SCFG is its predictive power for branching processes. In [23], a study compares the SCFG and the HMM in their capability of modeling branching processes via an entropy argument. In that study, an SCFG and an HMM are estimated from data simulated from a branching process, and it is observed that the estimated SCFG consistently has lower entropy than the estimated HMM. Since our MFR grammar is a multi-type Galton-Watson branching process, the SCFG has higher predictive power than the HMM.

3.4.3 Numerical Results of State and Parameter Estimation

In this subsection, the state and the parameter estimation algorithms derived in Sec. 3.3.1 and 3.3.2 are evaluated against simulation data. The model parameters such as the transition probabilities and the production rule probabilities are estimated and, based on the estimated values, the hidden state sequence is inferred. For simplicity, the MFR is characterized by a subset of the MFR grammar developed.
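To make the O(m³L³) term concrete, here is a minimal sketch of the inside pass for a CNF grammar (the data layout and rule structs are illustrative): the inside probability β(i, j, A) is filled over all spans with all split points, and the rule loops contribute the grammar-dependent factor, O(L³) for a dense grammar.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Illustrative CNF grammar: binary rules A -> B C and unary rules A -> a,
// with nonterminals indexed 0..L-1. This is a sketch, not the testbed's code.
struct BinaryRule { int lhs, left, right; double prob; };
struct UnaryRule  { int lhs; std::string terminal; double prob; };

// Inside algorithm: beta[i][j][A] = P(A derives words i..j). The span and
// split-point loops give the O(m^3) factor in the overall complexity.
std::vector<std::vector<std::vector<double>>> inside(
    const std::vector<std::string>& words, int L,
    const std::vector<BinaryRule>& binary,
    const std::vector<UnaryRule>& unary) {
  const int m = (int)words.size();
  std::vector<std::vector<std::vector<double>>> beta(
      m, std::vector<std::vector<double>>(m, std::vector<double>(L, 0.0)));
  for (int i = 0; i < m; ++i)                 // spans of length 1
    for (const auto& r : unary)
      if (r.terminal == words[i]) beta[i][i][r.lhs] += r.prob;
  for (int len = 2; len <= m; ++len)          // longer spans, bottom up
    for (int i = 0; i + len - 1 < m; ++i) {
      int j = i + len - 1;
      for (int k = i; k < j; ++k)             // split point
        for (const auto& r : binary)
          beta[i][j][r.lhs] +=
              r.prob * beta[i][k][r.left] * beta[k + 1][j][r.right];
    }
  return beta;
}
```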
The set of nonterminals is {RR1p, RR2p, RR3p}, and the set of terminals is {RR1c, RR2c, RR3c}. The grammars used in the numerical studies are shown in Table 3.5 in Chomsky Normal Form, and they characterize two different range resolution algorithms with different performance standards. Because the grammar is reduced, only two Markov states are considered, and the templates used to define the states are identical except for their production rule probabilities. The Markov transition matrix is assumed fixed in this study.

Fig. 3.9 shows the evolution of the likelihood values from the parameter estimation algorithm, and the state estimation error probability with the parameter values at each iteration of the algorithm. The final estimated parameter values are listed in Table 3.5, and it can be seen that the estimated parameter values are very close to their true values.

In addition, the effect of the initial values on the parameter and state estimation is also studied. We initialize the estimation algorithms with values at different squared distances from the true values, and run the parameter and state estimation algorithms. It is found that the algorithm is not sensitive to the initial values of the transition matrix, but it is sensitive to the initial values of the production rule probabilities. One observation is that if the grammars of different states are initialized too close to each other, the Markov chain degenerates into an IID sequence and the estimation algorithm updates only one state instead of two. For the transition matrix alone, the RMS (root mean squared) errors of the initial values from the true values, and of the estimated parameter values from the true model parameters, are listed below. The RMS errors of the estimated model parameters are very close to each other despite the differences in the initial values. Moreover, the state estimation error probabilities of the cases below all approach zero.
RMS error of initial values:        0.02      0.03      0.4       0.6       0.8
RMS error of estimated parameters:  0.14225   0.148172  0.137346  0.107406  0.144219

Figure 3.9: The left figure shows the likelihood values obtained from iterating the parameter estimation algorithm, and the right figure is the state estimation error probability with the parameter values for each iteration of the algorithm.

Production rule       Source (Grammar 1 / Grammar 2)   Estimated (Grammar 1 / Grammar 2)
RR1p → RR1p2 RR1p     0.8 / 0.2                        0.823 / 0.185
RR1p → RR1p2 RR2p     0.2 / 0.8                        0.177 / 0.815
RR1p2 → RR1c          1 / 1                            1 / 1
RR2p → RR2p2 RR1p     0.2 / 0.8                        0.172 / 0.806
RR2p → RR2p2 RR3p     0.8 / 0.2                        0.828 / 0.194
RR2p2 → RR2c          1 / 1                            1 / 1
RR3p → RR2p RR3p      0.3 / 0.3                        0.257 / 0.236
RR3p → RR3c           0.7 / 0.7                        0.743 / 0.764

Source transition matrix: [0.7 0.3; 0.4 0.6]    Estimated transition matrix: [0.711 0.289; 0.397 0.603]

Table 3.5: The source and estimated parameter values of the Markov modulated SCFG.

3.4.4 Numerical Results of Autonomous Selection of Maneuvering Models

In the second numerical study, we look at the interaction between the radar and the target maneuvers, and how the target selects its maneuvering models according to the discrete stochastic approximation algorithms introduced in Sec. 3.3.3.
The scenario is illustrated in Fig. 3.10. We assume that the target intends to follow a circular path, circumventing the MFR, to reach a location labeled by X in the figure. The path is planned before the mission, and the target switches between its maneuvering models to maximize its safety. In this study, the target is assumed to be able to maneuver in four different motion models, and the MFR responds with four corresponding radar modes characterized by their Markov modulated SCFG representations. Because the target's distance from the MFR stays fixed along the circular path, the MFR's transitions between modes depend only on the target's maneuvering models. The SCFGs, because they correspond to the micro control, are identical across the modes (the grammar used here is the same as the one in the previous subsection), but the transition matrix of the radar manager varies depending on the target's maneuvering model.

In this scenario, the simulation results from both algorithms look virtually identical, and only one set of results will be presented. Fig. 3.11 illustrates a sample path of the maneuvering models obtained from the algorithm, and Fig. 3.10 shows the flight trajectory of the target following the maneuvering models. It can be seen that high maneuvering models are deployed at the end to ensure the target's survivability. Fig. 3.12 shows the empirical distribution of the mode occupancies after running the algorithms 10 times, and it is observed that the maneuvering model with the highest empirical distribution is the one with the least threat, i.e., the least average tracking time.

One implementation detail of the algorithm is the initialization of the Markov chain and the SCFGs. The initial parameter values are fixed for each computation of the cost function because the stochastic approximation algorithm requires the estimator to be consistent.
The Markov chain is initialized uniformly, and the SCFG is initialized according to the pre-training method introduced in [23]. Briefly, the training data is first used to train a hidden Markov model with start and terminating states. The trained HMM is mapped to its approximate SCFG counterpart, and that is used as the initial configuration for the SCFG.

Figure 3.10: The scenario of the numerical study sets a target to follow a circular path, circumventing the MFR, to reach the location labeled by X. The target's trajectory following the sequence of maneuvering models shown in Fig. 3.11 is illustrated in this figure.

Figure 3.11: The sample path of maneuvering models obtained from the discrete stochastic approximation algorithm.

Figure 3.12: Empirical distribution of the occupancies in the four maneuvering models.

3.5 Conclusion

The main idea of this paper is to model and characterize multi-function radars (MFR) as a string generating device, where the control rules are specified in terms of stochastic context free grammars (SCFG) modulated by the radar's current tactical goal, which is in turn modeled by a Markov chain. This is unlike the modeling of targets, where hidden Markov and state space models are adequate [6; 4]. The modeling is knowledge based, where each production rule corresponds to an operational rule employed by the MFR to generate its radar words, and such domain specific knowledge is assumed to be supplied by expert radar analysts.
The signal interpretation of the MFR, under our formulation, is reduced to state estimation by parsing the radar words, and a maximum likelihood sequence estimator is derived to evaluate the threat posed by the MFR. A maximum likelihood parameter estimator is also derived to infer the unknown model parameters with the Expectation Maximization algorithm. In addition, based on the interpreted radar signal, the interaction dynamics of the MFR and the target are studied, and the control of the aircraft's maneuvering models is formulated as a discrete stochastic approximation problem.

Since SCFGs are multi-type Galton-Watson branching processes, the algorithms proposed in this paper can be viewed as filtering and estimation of a partially observed multi-type Galton-Watson branching process.

Bibliography

[1] A. V. Aho and J. D. Ullman. The Theory of Parsing, Translation and Compiling, volume I: Parsing. Prentice-Hall, Englewood Cliffs, NJ, 1972.
[2] J. K. Baker. Trainable grammars for speech recognition. Speech Communication Papers for the 97th Meeting of the Acoustical Society of America, pages 547–550, 1979.
[3] P. Baldi and S. Brunak. Bioinformatics: The Machine Learning Approach. MIT Press, Cambridge, MA, second edition, 2001.
[4] Y. Bar-Shalom and X. Li. Estimation and Tracking: Principles, Techniques, and Software. Artech House, 1993.
[5] S. S. Blackman, M. T. Busch, and R. F. Popoli. IMM/MHT tracking and data association for benchmark tracking problem. Proc. American Control Conf., pages 2606–2610, 1995.
[6] S. S. Blackman and R. Popoli. Design and Analysis of Modern Tracking Systems. Artech House, 1999.
[7] P. L. Bogler. Radar Principles with Applications to Tracking Systems. John Wiley & Sons, 1990.
[8] M. T. Busch and S. S. Blackman. Evaluation of IMM filtering for an air defense system application. Proc. SPIE, 2561:435–447, 1995.
[9] C. G. Cassandras and S. Lafortune. Introduction to Discrete Event Systems. Springer, 1999.
[10] Z. Chi.
Statistical properties of probabilistic context-free grammars. Computational Linguistics, 25:131–160, 1999.
[11] N. Chomsky. Three models for the description of language. IRE Transactions on Information Theory, 2(3):113–124, 1956.
[12] N. Chomsky. A note on phrase structure grammars. Information and Control, 2(4):393–395, December 1959.
[13] N. Chomsky. On certain formal properties of grammars. Information and Control, 2(2):137–167, June 1959.
[14] N. Chomsky and G. A. Miller. Finite state languages. Information and Control, 1(2):91–112, May 1958.
[15] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39:1–38, 1977.
[16] R. Durbin, S. Eddy, A. Krogh, and G. Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.
[17] K. S. Fu. Syntactic Pattern Recognition and Applications. Prentice-Hall, Englewood Cliffs, NJ, 1982.
[18] J. L. Gertz. Multisensor surveillance for improved aircraft tracking. The Lincoln Laboratory Journal, 2:381–396, 1989.
[19] D. Goldsman and B. L. Nelson. Statistical screening, selection, and multiple comparison procedures in computer simulation. In Proceedings of the 1998 Winter Simulation Conference, 1998.
[20] J. E. Hopcroft, R. Motwani, and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, MA, second edition, 2001.
[21] Y. A. Ivanov and A. F. Bobick. Recognition of visual activities and interactions by stochastic parsing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:852–872, 2000.
[22] V. Krishnamurthy, X. Wang, and G. Yin. Spreading code optimization and adaptation in CDMA via discrete stochastic approximation. IEEE Transactions on Information Theory, pages 1927–1949, 2004.
[23] K. Lari and S. J. Young. The estimation of stochastic context-free grammars using the Inside-Outside algorithm. Computer Speech and Language, 4:35–56, 1990.
[24] X. R.
Li and V. P. Jilkov. Survey of maneuvering target tracking. Part V: Multiple-model methods. IEEE Transactions on Aerospace and Electronic Systems, pages 1255–1321, 2005.
[25] C. D. Manning and H. Schütze. Foundations of Statistical Natural Language Processing. The MIT Press, 1999.
[26] M. I. Miller and A. O'Sullivan. Entropies and combinatorics of random branching processes and context-free languages. IEEE Transactions on Information Theory, 38:1292–1310, 1992.
[27] D. V. Pynadath and M. P. Wellman. Probabilistic state-dependent grammars for plan recognition. In Proceedings of the 16th Annual Conference on Uncertainty in Artificial Intelligence, pages 507–514, 2000.
[28] L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, February 1989.
[29] E. Rivas and S. R. Eddy. The language of RNA: a formal grammar that includes pseudoknots. Bioinformatics, 16:334–340, 2000.
[30] J. Roe and A. Pudner. The real-time implementation of emitter identification for ESM. In IEE Colloquium on Signal Processing in Electronic Warfare, pages 7/1–7/6, 1994.
[31] J. A. V. Rogers. ESM processor system for high pulse density radar environments. IEE Proceedings F, 132:621–625, 1985.
[32] S. Sabatini and M. Tarantino. Multifunction Array Radar System Design and Analysis. Artech House, 1994.
[33] Y. Sakakibara. Grammatical inference in bioinformatics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27:1051–1062, 2005.
[34] D. C. Schleher. Electronic Warfare in the Information Age. Artech House, 1999.
[35] R. A. Singer. Estimating optimal tracking filter performance for manned maneuvering targets. IEEE Transactions on Aerospace and Electronic Systems, AES-5:473–483, 1970.
[36] M. I. Skolnik. Introduction to Radar Systems. McGraw-Hill, 2002.
[37] M. Vilain. Getting serious about parsing plans: a grammatical analysis of plan recognition.
In Proceedings of the Eighth National Conference on Artificial Intelligence, pages 190–197, 1990.
[38] N. A. Visnevski, F. A. Dilkes, S. Haykin, and V. Krishnamurthy. Non-self-embedding context-free grammars for multi-function radar modeling - electronic warfare application. In International Radar Conference, pages 669–674, 2005.
[39] N. A. Visnevski, S. Haykin, V. Krishnamurthy, F. A. Dilkes, and P. Lavoie. Hidden Markov models for radar pulse train analysis in electronic warfare. In IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 597–600, 2005.
[40] N. A. Visnevski, V. Krishnamurthy, A. Wang, and S. Haykin. Syntactic modeling and signal processing of multifunction radars: A stochastic context free grammar approach. Proceedings of the IEEE, 95:1000–1025, 2007.
[41] A. Wang, V. Krishnamurthy, F. A. Dilkes, and N. A. Visnevski. Threat estimation by electronic surveillance of multifunction radars: a stochastic context free grammar approach. In IEEE Conference on Decision and Control, pages 2153–2158, December 2006.
[42] N. J. Whittall. Signal sorting in ESM systems. IEE Proceedings F, 132:226–228, 1985.
[43] R. G. Wiley. Electronic Intelligence: The Analysis of Radar Signals. Artech House, 1993.
[44] D. R. Wilkinson and A. W. Watson. Use of metric techniques in ESM data processing. IEE Proceedings F, 132:229–232, 1985.
[45] C. F. J. Wu. On the convergence properties of the EM algorithm. Annals of Statistics, 11(1):95–103, 1983.
[46] W. Zhu and J. Garcia-Frias. Modeling of bursty channels using stochastic context-free grammars. In Proceedings of the 55th Vehicular Technology Conference (VTC 2002), volume 1, pages 355–359, Birmingham, AL, USA, May 2002.
[47] W. Zhu and J. Garcia-Frias. Stochastic context-free grammars and Hidden Markov Models for modeling of bursty channels. IEEE Transactions on Vehicular Technology, 53(3):666–676, May 2004.
Chapter 4

Syntactic Tracking and Ground Surveillance with GMTI Radar¹

4.1 Introduction

Context and Main Results

For tracking ground-based maneuvering targets, conventional tracking systems deal with the following switched mode state space model [3; 30; 6]:

x_{k+1} = F x_k + G v_k(a_k),
z_k = h(x_k) + w_k.    (4.1)

Here x_k denotes the kinematic target state such as position and velocity, and z_k denotes the sensor detections (observations). The random processes v_k and w_k denote the state and observation noise, respectively. The mode sequence {a_{1:k}} summarizes a sequence of maneuvers or modes that causes the ground-based target to move in a two dimensional spatial trajectory. Conventional tracking of maneuvering targets assumes that the mode sequence {a_k} is a finite state Markov chain, and aims to compute the posterior distribution P(x_k, a_k | z_{1:k}) so as to compute conditional mean estimates of x_k and a_k. This is typically done by a state-of-the-art tracking algorithm involving particle filters, Interacting Multiple Models (IMM), and variable structure IMM [3; 37; 27]. (VS-IMM is a more sophisticated model in which the kinematic model of the moving objects depends on the road direction and the terrain type.) These Bayesian recursions exploit the Markovian assumption on the mode sequence {a_k} to estimate x_k and a_k.

Motivated by GMTI (ground moving target indicator) applications, this paper deals with a higher level of abstraction which we call Syntactic Tracking. Suppose we are interested in whether a target is circling a restricted area (perimeter surveillance), or alternatively whether a vessel is loitering near the coast (a possible smuggling attempt). In such cases, the human operator is primarily interested in determining specific patterns in target trajectories from conventional track estimates, which he can use to infer the possible intent of the target and for situation assessment [6].
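A minimal sketch of simulating the switched mode model (4.1) for a one-dimensional constant-velocity target; the matrices and the per-mode noise gains are illustrative assumptions, not taken from the paper:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Illustrative 1D constant-velocity instance of the switched mode model:
// state x = [position, velocity], x_{k+1} = F x_k + G v_k(a_k),
// z_k = h(x_k) + w_k. The mode a_k only scales the process noise here, and
// the noise samples are passed in so the sketch stays deterministic.
struct State { double pos, vel; };

State step(const State& x, double T, double processNoise, int mode,
           const std::vector<double>& modeGain) {
  // F = [1 T; 0 1], G = [T^2/2; T], v_k(a_k) = modeGain[a_k] * processNoise.
  double v = modeGain[mode] * processNoise;
  return {x.pos + T * x.vel + 0.5 * T * T * v, x.vel + T * v};
}

double observe(const State& x, double measNoise) {
  // z_k = h(x_k) + w_k; here h simply reads out the position.
  return x.pos + measNoise;
}
```

A Markov chain over the mode index (and noise drawn per step) would complete the generative model that the conventional tracker inverts.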
Examples of such specific patterns include loops, arcs, circles, squares, and combinations of these, and they exhibit complex spatial dependencies.

¹ A version of this chapter has been submitted for publication. Wang, A., Krishnamurthy, V. and Balaji, B. Syntactic Tracking and Ground Surveillance with GMTI Radar.

As is well known in syntactic pattern analysis [21], the Markovian assumption on a_{1:k} in (4.1) is incapable of modeling such complex spatial patterns. The key modeling contribution of this paper is to construct a syntactic model called a stochastic context free grammar (SCFG) for modeling the mode sequence a_{1:k}. Such models are more general than Markov models (finite automata) and can capture long range dependencies and recursively embedded structures in patterns. Thus the main goal of this paper is to devise SCFG models and associated polynomial time Bayesian syntactic parsing algorithms to estimate the mode sequence a_{1:k} given estimates from the conventional target tracker. In other words, this paper develops models and automated syntactic tracking algorithms to assist the human operator in determining specific target patterns.

Main Results

Motivated by civilian and military applications, we consider syntactic tracking for ground surveillance with GMTI radar. (We give a brief outline of GMTI in the appendix.) Because of the vast amount of data generated by GMTI trackers, there is strong motivation to develop automated algorithms that yield a high level interpretation from the tracks. The main results of the paper are:

1. Combined Tracking and Trajectory Inference: Sec. 4.2 sets the stage by describing our entire framework for syntactic tracking using conventional track estimates. We review SCFGs, formulate the elementary tracklets that lead to trajectories such as arcs and modified rectangles, and describe how syntactic tracking fits into the complete tracking system.

2.
SCFG Modulated State Space Model: Sec. 4.3 presents a SCFG modulated state space model that permits modeling of complex spatial trajectories. We derive probabilistic production rules that characterize the target motion patterns, and present a detailed structural analysis of the SCFG model. Using formal language techniques such as the pumping lemma, we show that a specific syntactic pattern such as an arc generates a context free language, and that it cannot be modeled efficiently by Markov models. Moreover, the well-posedness of the syntactic model is studied based on the branching rate of the model, and conditions under which the language distribution is proper are given. These conditions ensure that the distribution of the language generated by the model is proper, i.e., the total probability of the parses sums to one.

3. Bayesian Syntactic Tracking: Sec. 4.4 presents the Bayesian syntactic tracking algorithm. The interpretation of the syntactic patterns is represented by parse trees built on top of the target trajectories, which are tracked at the detection level by Bayesian filters such as the particle filter and the IMM/extended Kalman filter, and at the tracklet level by a generalized Earley Stolcke Bayesian parser [27; 31]. The Earley Stolcke algorithm is a generalization of the Forward-Backward algorithm for HMMs, and it allows real time forward parsing. The complexity of the algorithm is O(l³), where l is the length of the input string.

4. Experimental Validation of Syntactic Tracking: Sec. 4.5 gives a detailed experimental analysis of the syntactic tracking algorithm on a real life GMTI example. The GMTI data was collected using the DRDC Ottawa X-band Wideband Experimental Airborne Radar (XWEAR) [16; 15], and numerical studies of the syntactic tracking algorithms are performed using the data. The experimental results show that the syntactic tracker not only accurately estimates the target's trajectory pattern, but can also be used to improve the accuracy of conventional trackers.

Figure 4.1: The Chomsky hierarchy of formal languages (regular ⊂ context-free ⊂ context-sensitive ⊂ unrestricted). The context-sensitive and unrestricted languages are not discussed in this paper because polynomial algorithms do not exist to estimate and parse them.

Why Use Stochastic Context Free Grammars (SCFGs)?

As depicted in Fig. 4.1, grammars can be classified into four different types depending on the forms of their production rules [12; 17]. A stochastic regular grammar, or finite state automaton, is equivalent to a hidden Markov model. A stochastic context free grammar is a significant generalization of a regular grammar. See Sec. 4.2.1 for definitions. Of the four grammar types, only the stochastic regular and stochastic context free grammars have polynomial complexity estimation algorithms, and these are used in this paper. The implementation of the syntactic tracking system with SCFG has several potential advantages: (i) SCFG is a compact formal representation that can form a homogeneous basis for modeling complex system dynamics [1; 21; 22], which allows model designers to express different meta level trajectory characterizations in a single framework, and to automate the situation awareness process by encoding human knowledge in the grammar [42; 41]. The ability for the designer to encode knowledge is important because the lack of field data in defense settings often hinders the application of Bayesian filters, as they often require a great amount of training data. (ii) The recursive embedding structure of the possible target geometric patterns is more naturally modeled in SCFG. As we show later, the Markovian type model has dependencies of variable length, and the growing state space is difficult to handle since the maximum range dependency must be considered. (iii) SCFGs are more efficient in modeling
(iii) SCFGs are more efficient in modeling 121  hidden branching processes when compared to stochastic regular grammars or hidden Markov models with the same number of parameters. The predictive power of a SCFG measured in terms of entropy is greater than that of the stochastic regular grammar [29]. SCFG is equivalent to a multi-type Galton-Watson branching process with finite number of rewrite rules, and its entropy calculation is discussed in [35].  4.1.1 Literature Review SCFG SCFGs have widely been used in language processing. The complexity of the language in sentence structure and grammatical dependency made state space models such as linear predictive coding [14] and hidden Markov model [25] inadequate, and the application of stochastic grammar in language modeling has been researched extensively, where its syntax naturally models the language’s grammar structure [34]. The semantic meaning of a string of words is provided by its syntactic structure, which is constructed based on a context free grammar [9]. In addition to language processing, SCFG has been a major computational tool in biology for DNA and RNA sequencing [38]. Because of the three-dimensional folding of the proteins and nucleic acids, HMM becomes insufficient, and SCFG is essential because of the resulting long range dependency from the complicated spatial folding[17]. SCFG in Tracking In conventional tracking, effort has been spent to enhance the tracker by incorporating information other than the kinematic states. In [6], attribute tracking is discussed where target class information such as wing span and jet engine modulation are utilized for data association. Even though it is tangential to our work as we are interested in target information derived from target tracks, it demonstrates higher level information is very helpful in establishing track and intent inference. 
In contrast to attribute tracking, the application of grammars is particularly suitable for tracking model sequences with complex multi-scale structure and a recursive nature. For example, in plan recognition, the plans of an agent, typically its actions, have to be inferred from observations. [10] approached the problem with a Bayesian network, but due to the complex structure generating the actions, it is too computationally intensive. An extension of this work can be found in [36], where a grammar is applied to model the underlying plan. In addition, in video surveillance, the hierarchical hidden Markov model is applied to track sequences of human actions [32], and it can be shown that the hierarchical hidden Markov model is a special case of SCFG [19]. SCFG can be applied directly to establish high level inferences from primitives generated from observations. In [24], SCFG is applied to detect sequences such as dropping a person off or picking a person up in a parking lot. Moreover, in [33], movements of targets such as U-turns are inferred based on measurements collected from a sensor network. In these SCFG based trackers, the focus is on the high level inference, and the coupling between the high level inference and the Bayesian tracking is typically very loose, i.e., the primitives a_{1:k} are generated independently from sensor measurements, and the temporal constraints are imposed only at the higher inference level. Even though many tracking related applications have been developed based on SCFG, research combining tracking and SCFG is limited.

GMTI

Conventional sensors deployed to perform ground surveillance are mainly the synthetic aperture radar (SAR) and electro-optical (EO) sensors, and they are limited in the sense that they are only capable of performing detection and identification of stationary targets. Relatively recently, GMTI radar with space-time adaptive processing (STAP) has enabled the near-real time detection of ground moving objects over a large area.
STAP is a generalization of adaptive array signal processing techniques based on the Wiener filter [7; 43; 28], and it incorporates techniques such as eigenvector projection and the least-squares method. In conventional adaptive array signal processing, a Wiener filter is formed for a signal vector whose components are the signals received at multiple apertures from a single pulse. In STAP, on the other hand, the Wiener filter is formed for a received signal vector whose components are some function of signals received at multiple apertures, which are moving, for more than one pulse. In other words, STAP provides a two-dimensional adaptive filter where the apertures and pulses furnish the spatial and temporal samples. It is noted that although STAP-based GMTI is considered here, the techniques developed can be used in conjunction with other detection techniques, such as detection algorithms in the image domain, i.e., synthetic aperture radar (SAR) based GMTI algorithms.

4.2 Overview of GMTI Based Syntactic Tracking

We provide in this section an overview of our novel approach to syntactic tracking for ground surveillance. Our premise for syntactic tracking is that the geometric pattern of a target's trajectory can be modeled as "words" (tracklets) spoken by a SCFG language. The premise of syntactic pattern recognition is that complex patterns can be expressed in terms of simpler patterns. That is, we decompose the meta level descriptors into motion trajectories, and the motion trajectories into a fixed set of primitive geometric patterns such as a line or an arc. Two illustrative examples are provided here.

• Syntactic tracking in threat inference: A vehicle approaches a security gate of a building and turns around. It then circles around the perimeter of the building in the midst of other moving vehicles. Given GMTI track information of multiple moving vehicles, how can this behavior be recognized as a threat?
Equivalently, how can a threat be associated with the complex spatial trajectory of making a U-turn and then circling a building, and how can the spatial trajectory be identified from geometric patterns such as a half circle followed by a full circle?

• Syntactic tracking in military operations: Fig. 4.2 illustrates some meta level descriptions of motion patterns that are common in military ground surveillance, where each is characterized by a certain combination of geometric patterns [8]; the line abreast and wedge formations are offensive combat formations with each vehicle moving in a linear trajectory;

Figure 4.2: The battalion formations. Line abreast and wedge are offensive combat formations, column is a traveling technique, and pincer is an intercepting technique.

pincer, on the other hand, consists of two vehicles maneuvering in mirroring arc trajectories. With the meta description, inference can be made to determine whether the ground units are in an offensive, defensive or reconnaissance operation.

In this paper, GMTI radar is the enabling technology for syntactic tracking, and it provides near-real time detection of the ground moving units in the area of interest. The background on the GMTI radar and STAP processing is provided in Appendix C. Based on these GMTI detections, the aim is to construct an algorithm for continuous ground surveillance that infers the meta description of the moving units by classifying and labeling their trajectories according to their geometric patterns. The background on SCFG is provided in Sec. 4.2.1, and the system framework of the proposed syntactic tracking system is summarized in Sec. 4.2.2. The implementation of the system will be given in the next section.

4.2.1 SCFG for Syntactic Target Tracking

With the motivation outlined above, we will use SCFGs to model geometric spatial patterns of target trajectories.
Since SCFGs are not widely used in radar signal processing, we begin with a short formal description of SCFGs. In formal language theory, a grammar G is a four-tuple < N, T, P, S > [17; 12]. Here N is a finite set of nonterminals, T is a finite set of terminals with N ∩ T = ∅, P is a finite set of probabilistic production rules, and S ∈ N is the starting symbol. Throughout the paper, lower case letters are used to denote terminals, and upper case letters nonterminals.

Definition [Stochastic Regular Grammar] Stochastic regular grammars, denoted as G_RG, are equivalent to hidden Markov models (with termination state ∈ N) and have production rules of the form A → aA and A → a with probabilities P(A → aA) and P(A → a) specified, where A ∈ N. N corresponds to the state space of the hidden Markov model, and T corresponds to its observation space. The set of all terminal strings generated by a regular grammar is called a regular language and is denoted L_RG.

Definition [Stochastic Context Free Grammar] SCFGs, denoted as G_CFG, have production rules, P, of the form A → η with probabilities P(A → η) specified, where A ∈ N and η ∈ (N ∪ T)^+. (N ∪ T)^+ denotes the set of all finite length strings of symbols in (N ∪ T), excluding strings of length 0 (the case where strings of length 0 are included is indicated by (N ∪ T)^*). The set of all terminal strings generated by a SCFG is called a context free language and is denoted L_CFG.

Consider now a trajectory generated by (4.1) where a_{1:k} is the string generated by a SCFG. Based on the definition of SCFG, some common languages that are relevant to syntactic tracking are described. Let M denote the set of geometric patterns of interest, for example

M = {arc, line, m-rectangle}.  (4.2)

Here m-rectangle (modified rectangle) denotes trajectories comprising either three left turns or three right turns.
A rectangle is a special case of an m-rectangle when the trajectory is closed, i.e., when the start and end points of the m-rectangle coincide. The terminals characterizing the geometric patterns are called tracklets and are illustrated in Fig. 4.3 a). Tracklets represent the maneuvers of the target (the modeling details of the terminals can be found in the next section). The languages corresponding to the geometric patterns are denoted as L_arc, L_line, and L_m-rectangle respectively, and the precise description of the SCFGs that generate them is given in Sec. 4.3.2. It is sufficient to state here that many spatial geometric patterns exhibit long range spatial dependencies, and Markov models are not general enough to model them. The point is illustrated by the following relation between the geometric patterns: L_line ⊂ L_RG and L_arc, L_m-rectangle ⊂ L_CFG.

Example: SCFG derivation of geometric patterns

L_line includes lines of arbitrary length, and since it can be generated with Markov dependency, it is a subset of L_RG. Suppose we have a concatenated string xA, where x is any combination of nonterminals and terminals, and A is a nonterminal; a one step derivation using the rule A → aA yields xA → xaA. The derivation process is similar to that of a hidden Markov model. Arc and m-square, on the other hand, are more general than the regular language. The analysis that shows L_arc is a subset of L_CFG is given later in Sec. 4.3.3. Fig. 4.3 b) illustrates an instance of the modified square language, and Fig. 4.3 c) demonstrates a derivation process.

Figure 4.3: a) Building blocks of the trajectory. b) A sample trajectory and the estimated tracklets. c) Syntactic analysis of the sequence of estimated tracklets.

Referring to Sec.
4.3.2 for the full set of production rules for the m-square, the derivation process is as follows:

m-rectangle → Bottom Down-Up Top Up-Down
→ f Down-Up Top Up-Down
→ f h Down-Up Top Up-Down
→ f h h Top Up-Down
→ ···

Even though the number of geometric patterns is finite, the languages they generate can be infinite. To provide further insight on how geometric patterns are applicable to intent inference, we provide some examples here. To model the threat inference example provided at the beginning of Sec. 4.2, where a threat is related to suspicious U-turns and circling of a building, an arc language may be used to approximate U-turns and an m-rectangle language the circling around the restricted area. The pincer operation, on the other hand, consists of two arcs in close proximity and of opposite direction. As will be shown in Sec. 4.4.2, each parsing state has parameters describing its location and pattern type. As a result, given continuous labeling of the trajectories in the area of interest, a pincer operation can be identified if the following attributes are found: 1) two arcs are of matching type and comparable size, and 2) their locations are close together within a certain bound. Moreover, a maritime smuggling event can also be identified by tracking geometric patterns. More specifically, a smuggling event may be modeled as one circling trajectory being approached by a linear trajectory. The labeling of trajectories can identify vessels that are loitering in the open sea, and detect other vessels moving toward them.

4.2.2 Syntactic Tracking Estimation Overview

Syntactic tracking builds on top of conventional tracking, and aims to infer complex geometric patterns from the track estimates. In this case, a_{1:k} belongs to a context free language, i.e. a_{1:k} ∈ L_CFG. As described in Sec. 4.3.3, such patterns cannot be generated by a regular grammar (Markov chain).
The goal of syntactic tracking is to compute the geometric pattern with the highest posterior probability, i.e.

m̂ = arg max_{m∈M} P(m | ã_{1:k}, G_m), where ã_{1:k} = arg max_{a_{1:k}} P(a_{1:k} | z_{1:k}, G^{RG}_m).  (4.3)

Here ã_{1:k} is computed via a conventional tracklet estimator (see below) and G_m ∈ G_CFG is the SCFG of one of the geometric patterns of interest given in (4.2). Equation (4.3) demonstrates the two fundamental components in a syntactic tracking system:

• Syntactic Pattern Estimator (Parser). The term

P(m | ã_{1:k}, G_m),  (4.4)

is the posterior probability of the geometric pattern given the tracklet sequence and the corresponding SCFG grammatical rules. This probability allows us to classify the tracklet sequence according to which geometric pattern (arc, square, or line) it belongs to. The computation of the associated probabilities is discussed in Sec. 4.4, where the SCFG parsing algorithm that performs the syntactic analysis is described.

• Tracklet Estimator. The tracklet estimates ã_{1:k} are computed via a conventional state-of-the-art GMTI tracker. Note that P(a_{1:k} | z_{1:k}, G_m) denotes the posterior probability of the tracklet sequence given the sensor detection sequence z_{1:k} and grammatical model G_m. Throughout this paper, we will approximate P(a_{1:k} | z_{1:k}, G_m) by P(a_{1:k} | z_{1:k}, G^{RG}_m), i.e., with a regular grammar. In words, this means that in the context of GMTI tracking: use a regular grammar (state space model with Markovian jumps, G^{RG}_m) to estimate tracklets, and then use a higher level syntactic Bayesian estimator (with a SCFG model) to estimate the trajectory type. This approximation is justified since it directly facilitates the use of legacy GMTI tracking algorithms, which operate as an input to our higher level syntactic tracking. This is described in Sec. 4.3, where the GMTI system assumptions are formulated.
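The two-stage classification in (4.3)-(4.4) can be illustrated with a toy sketch. The grammars and rule probabilities below are hypothetical stand-ins (a small arc-like and line-like grammar over the terminals a, b, c), not the grammars of Sec. 4.3.2; the likelihood P(string | G_m) is computed as an inside probability by memoized recursion over substrings, and the pattern with the highest score is selected (uniform prior over patterns assumed):

```python
from functools import lru_cache

# Hypothetical toy grammars (rule probabilities are illustrative only).
ARC = {"S": [(0.3, ("a", "S", "b")), (0.2, ("c", "S")), (0.2, ("S", "c")),
             (0.2, ("a", "b")), (0.1, ("c",))]}
LINE = {"S": [(0.8, ("c", "S")), (0.2, ("c",))]}

def inside(grammar, string, start="S"):
    """P(start =>* string): total probability of all derivations."""
    @lru_cache(maxsize=None)
    def prob(sym, s):
        if sym not in grammar:                      # terminal symbol
            return 1.0 if s == sym else 0.0
        return sum(p * seq(rhs, s) for p, rhs in grammar[sym])

    @lru_cache(maxsize=None)
    def seq(rhs, s):
        if not rhs:
            return 1.0 if s == "" else 0.0
        head, rest = rhs[0], rhs[1:]
        # No epsilon rules: each remaining symbol consumes >= 1 terminal.
        return sum(prob(head, s[:k]) * seq(rest, s[k:])
                   for k in range(1, len(s) - len(rest) + 1))

    return prob(start, string)

def classify(string, grammars):
    """m_hat = argmax_m P(string | G_m), assuming a uniform prior."""
    return max(grammars, key=lambda m: inside(grammars[m], string))

patterns = {"arc": ARC, "line": LINE}
print(classify("aacbb", patterns))   # matched a/b pairs -> "arc"
print(classify("cccc", patterns))    # straight run of c's -> "line"
```

Restricting each symbol on the right-hand side to consume at least one terminal keeps the recursion well-founded even for the left-recursive rule S → Sc.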
4.2.3 GMTI Based Syntactic Tracking Framework

According to the discussion presented, the system framework of the GMTI based syntactic tracking system that implements the two layer approach is summarized in Fig. 4.4. It consists of five components: the GMTI STAP processor, a tracklet estimator, a geometric pattern knowledge-base, a syntactic pattern estimator (stochastic parser), and a radar resource allocator.

Figure 4.4: Intent inference framework with stochastic parsing.

The GMTI radar detects ground moving targets and returns their estimated range, angle, and Doppler. The tracklet estimator, which computes ã_{1:k} in (4.3), keeps a track on each detected target, and continuously outputs the associated tracklet that best describes the ground vehicle's kinematic state. The geometric pattern knowledge-base stores the prior knowledge of the relevant motions in terms of production rules. The syntactic pattern estimator, which implements (4.4), infers geometric patterns from a vehicle's trajectory based on the syntactic structure of the tracklets and the knowledge-base. The inferred meta level descriptor can be fed to both the operator and the resource allocator. The radar resource allocator then adjusts the operating parameters of the transmitter and the radar settings to enhance target recognition. (The radar resource allocator has been studied extensively and is not included in this paper.)

4.3 Syntactic Modeling of GMTI

This section discusses syntactic models for capturing complex spatial patterns of target trajectories. The state space model of the tracklet estimator that maps GMTI detections to tracklets is described in Sec. 4.3.1, and the syntactic modeling of the geometric models with SCFG is presented in Sec. 4.3.2.
Finally, the well-posedness of the SCFG model (in terms of its ability to model specific patterns) and the analysis of the context free languages for the geometric models are developed in Sec. 4.3.3.

4.3.1 SCFG Modulated State Space Model

This section describes the dynamic system formulation that models the tracklet estimator. The model is formulated with the concepts of directional process noise and the switching model. In the usual case, assuming Gaussian noise, the standard kinematic models assume equal variance for the process noise in all unit directions because they assume equal probabilities among the unit directions in which the target is moving. In order to model the tracklets, the process noise is assumed to have different noise variances along and perpendicular to the direction of the particular tracklet. In other words, each tracklet is modeled as a switch mode whose process noise has a unique variance. Details of the formulation are given below.

Let k denote discrete time; the assumed target dynamics are x_k = F x_{k−1} + G v_{k−1}(a_k). x_k = (x_k, y_k, ẋ_k, ẏ_k) denotes the ground moving target's position and velocity in Cartesian coordinates, and assuming a constant velocity model, the transition matrix and the noise gain are, respectively,

F = \begin{bmatrix} 1 & 0 & T & 0 \\ 0 & 1 & 0 & T \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad G = \begin{bmatrix} T^2/2 & 0 \\ 0 & T^2/2 \\ T & 0 \\ 0 & T \end{bmatrix}.

The variable a_k ∈ {0, π/4, π/2, 3π/4, π, 5π/4, 3π/2, 7π/4} denotes the tracklets in the terminal set, and it indexes the moving target's possible directions of travel. The estimation of a_k is the main objective of the numeric-symbol Bayesian filter, because the tracklets are the basic building blocks making up the motion patterns. As will be shown in Sec. 4.4, the intent inference parser will parse the estimated a_k and estimate the motion patterns of the target.
The process noise v_k is a white Gaussian process with covariance matrix

Q = ρ_{a_k} \begin{bmatrix} σ_o^2 & 0 \\ 0 & σ_a^2 \end{bmatrix} ρ_{a_k}^T, \quad \text{with } ρ_{a_k} = \begin{bmatrix} −\cos a_k & \sin a_k \\ \sin a_k & \cos a_k \end{bmatrix},

where σ_a^2 is the uncertainty along the direction indicated by a_k and σ_o^2 is the uncertainty orthogonal to it. The observation model describes the output of the GMTI STAP measurements, and the observation is z_k = h(x_k) + w_k, where

h(x_k) = \begin{bmatrix} r_k \\ ṙ_k \\ θ_k \end{bmatrix} = \begin{bmatrix} \sqrt{x'^2_k + y'^2_k + z'^2_k} \\ (x'_k ẋ_k + y'_k ẏ_k) / \sqrt{x'^2_k + y'^2_k + z'^2_k} \\ \arctan(y'_k / x'_k) \end{bmatrix}.  (4.5)

r_k is the range, ṙ_k is the Doppler, θ_k is the azimuth angle, and w_k ∼ N(0, R). The covariance matrix R is a diagonal matrix with diagonal elements equal to the variances of the range, range rate, and azimuth angle measurements, denoted σ²_{r_k}, σ²_{ṙ_k}, and σ²_{θ_k} respectively. Moreover, in order to compensate for the radar's platform motion, x′_k = x_k − x^P_k, where x^P_k is the x coordinate of the sensor platform at time k; similarly for y′_k and z′_k.

It should be noted that, strictly speaking, the measurement noise should also be a function of the target's motion pattern. The RCS (radar cross section) of the target is a complicated function of the aspect angle. As shown in [39], the RCS of an automobile may vary by as much as 10 dB. This has a very significant impact on the probability of detection and parameter estimation. For instance, the Cramer-Rao bound on the variance of the angle estimate is a function of the signal-to-noise ratio [44]. Thus, a general measurement model would make the measurement noise a function of the symbol. Knowledge of the motion pattern has the potential to greatly reduce the measurement error. However, because the dependency is not well understood, we simplify the model and remove the dependency.
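The tracklet-modulated dynamics above can be sketched numerically. The sampling period and noise variances below are assumed values, and a standard rotation matrix is used for ρ_{a_k} (sign conventions for the rotation vary; the thesis writes the first entry as −cos a_k):

```python
import numpy as np

T = 1.0  # sampling period (assumed)

# Constant-velocity transition matrix and noise gain (Sec. 4.3.1).
F = np.array([[1, 0, T, 0],
              [0, 1, 0, T],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
G = np.array([[T**2 / 2, 0],
              [0, T**2 / 2],
              [T, 0],
              [0, T]], dtype=float)

def process_noise_cov(a, sigma_a=1.0, sigma_o=0.1):
    """Directional process noise Q(a_k): variance sigma_a^2 along the
    tracklet direction a and sigma_o^2 orthogonal to it."""
    rho = np.array([[np.cos(a), -np.sin(a)],
                    [np.sin(a),  np.cos(a)]])
    return rho @ np.diag([sigma_a**2, sigma_o**2]) @ rho.T

# One prediction step under tracklet a_k = pi/4 (north-east heading).
rng = np.random.default_rng(0)
x = np.array([0.0, 0.0, 1.0, 1.0])        # (x, y, xdot, ydot)
v = rng.multivariate_normal(np.zeros(2), process_noise_cov(np.pi / 4))
x_next = F @ x + G @ v
```

For a = 0 the covariance reduces to diag(σ_a², σ_o²), i.e., all the extra process uncertainty lies along the x axis.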
The correspondence of the SCFG's model elements to the syntactic tracking system components is as follows: the estimated tracklets are modeled by the terminal symbols; the labels of the geometric patterns and the application specific meta descriptors by the nonterminal symbols; and the decomposition of meta descriptors and geometric patterns into tracklets by the production rules. More specifically, the terminal set T = {a, b, c, d, e, f, g, h} is the set of tracklets illustrated pictorially in Fig. 4.3 a). The sets of nonterminals and production rules are listed below:

N = {L_a, L_b, . . . , L_h, mSqr, mSql, Aur, Aul, TOP, BOT, UD, DU, S},
P = {S → L_a | L_b | . . . | L_h | mSqr | mSql | Aul | Aur,
Aur → a Aur g | h Aur | Aur h | ag | h,
Aul → g Aul a | h Aul | Aul h | ga | h,
L_u → u L_u | u for u ∈ T,
mSqr → TOP UD BOT DU,
mSql → BOT DU TOP UD,
TOP → L_c | L_g,
BOT → L_c | L_g,
UD → L_e,
DU → L_a}

The nonterminal L_u generates lines in the direction u for all u ∈ T, and Aur (Aul) generates an arc pointing upward and to the right (left). mSqr and mSql are the clockwise and counter-clockwise m-rectangles respectively, and the nonterminals TOP, UD, BOT and DU represent the top, up-down, bottom, and down-up components of an m-rectangle. Only one orientation of the arc and the m-rectangle is included, and it is straightforward to include other orientations. It should be noted that the grammar is a small subset for illustrative purposes, with no intention of being exhaustive. The grammar is application specific, and it can be regarded as a guiding example for other developments. The analysis of the grammar is provided in Sec. 4.3.3.

Given the grammar, a probability distribution is defined over the production rules. For each nonterminal N, the probabilities of its production rules must sum to 1, i.e.

∑_{η ∈ (N ∪ T)^* s.t. (N → η) ∈ P} P(N → η) = 1.

In practice, the production rule probabilities can be estimated from data.
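Both the normalization constraint above and the stability requirement discussed in Sec. 4.3.3.3 (spectral radius of the stochastic mean matrix below one) are mechanical checks. A minimal sketch, using a hypothetical probability assignment for the arc rules of Sec. 4.3.3.1:

```python
import numpy as np

# Hypothetical rule probabilities for the arc grammar S -> aSb|cS|Sc|ab|c.
rules = {"S": [(0.3, ("a", "S", "b")), (0.2, ("c", "S")),
               (0.2, ("S", "c")), (0.2, ("a", "b")), (0.1, ("c",))]}
nonterminals = list(rules)

# Normalization: each nonterminal's rule probabilities sum to one.
for lhs, prods in rules.items():
    assert abs(sum(p for p, _ in prods) - 1.0) < 1e-12

# Stochastic mean matrix M_N: the (A, B) entry is the expected number
# of B symbols produced when rewriting A once.
n = len(nonterminals)
M = np.zeros((n, n))
for i, A in enumerate(nonterminals):
    for p, rhs in rules[A]:
        for j, B in enumerate(nonterminals):
            M[i, j] += p * rhs.count(B)

spectral_radius = max(abs(np.linalg.eigvals(M)))
print(spectral_radius)   # ~0.7 < 1: subcritical, derivations terminate
```

Here M is the 1x1 matrix [0.7], so the spectral radius is 0.7 and the grammar satisfies the finiteness condition of Theorem 4.3.1.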
However, because the data is scarce in our case, and the patterns are well understood, manual assignment of probabilities produces better results. The probability assignment must follow a requirement that keeps the grammar stable, and it is discussed in the analysis presented in the next subsection.

4.3.3 Structural Analysis of the SCFG Model

Jump Markov linear systems such as (4.1) are used in radar signal processing because they adequately model the local kinematic behaviour of a target. In a similar vein, we will show that SCFGs can model GMTI trajectory behaviour. More specifically, we show using the pumping lemma that SCFG is powerful enough to generate trajectories that are only m-rectangles or only arcs. A regular grammar (HMM) cannot generate only randomly sized m-rectangles or only randomly sized arcs. (Of course a regular grammar can generate an arc or an m-rectangle with some probability amongst a variety of random trajectories, but that is of little use in trajectory classification.) The three trajectory patterns we consider are line, arc, and m-rectangle, and we will discuss them in turn. To save space we will only describe rectangles and arcs that are aligned with the horizontal and vertical axes. It is a trivial extension to consider rotated versions of the trajectories. Similarly, other trajectory patterns such as extended trapeziums, etc., can be considered; see [21], where complex patterns such as Chinese characters are considered.

A line pattern creates linear trajectories with local Markov dependency, and it is characterized by rules of the form A → aA, with a representing an observed local tracklet as the target moves. The production rule generates a language that is equivalent to that of the hidden Markov model formulation. Since Markov models have been studied extensively, the line pattern will not be discussed any further.
The two other geometric patterns are the arc and the m-rectangle, and they possess long range and self-embedding dependencies that require production rules that Markov models cannot represent. Informally, an arc pattern is characterized by two line patterns that can be arbitrarily long. In order for Markov models to model such variable length trajectories, the Markovian assumption requires the state space to be defined based on the maximum length dependency, and the exponentially growing state space is the constraining issue. Furthermore, for sources with hidden branching processes, for example arcs, stochastic context free grammars are shown to be more efficient than HMMs in the sense that the estimated SCFG has lower entropy [29].

4.3.3.1 Arc Tracklet

The language of an arc can be compactly expressed as L = {x : x ∈ a^n c^* b^n}, where there is the same number of matching upward and downward tracklets and an arbitrary number of forward tracklets. The grammar can be constructed based on many techniques, and the two most commonly seen are matching and recursive relations. For each a in the string, there must be a matching b, and the corresponding grammar rule is S → aSb | ε, where ε is the empty string. The arbitrary number of forward tracklets, on the other hand, can be modeled by the rule S → cS | Sc | ε. As a result, the basic production rules applied to construct arcs are S → aSb | cS | Sc | ε. However, as is known in the parsing literature, the inclusion of ε causes the parsing algorithm not to halt in all cases, so ε is removed. The final equivalent production rules for an arc are S → aSb | cS | Sc | ab | c. As shown above, the rules needed to generate patterns such as arcs have a syntax that is more complex than that of a regular grammar.
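The structure of the arc rules can be seen by sampling derivations. The sketch below performs leftmost derivations from S → aSb | cS | Sc | ab | c with hypothetical rule probabilities (chosen subcritical so derivations terminate); every generated string has matched counts of a and b, the property a Markov chain cannot enforce for unbounded lengths:

```python
import random

# Hypothetical rule probabilities for S -> aSb | cS | Sc | ab | c.
RULES = [(0.25, ["a", "S", "b"]), (0.20, ["c", "S"]),
         (0.20, ["S", "c"]), (0.20, ["a", "b"]), (0.15, ["c"])]

def sample_arc(rng):
    """Leftmost derivation from S; returns the terminal string."""
    sentential = ["S"]
    while "S" in sentential:
        i = sentential.index("S")
        r, acc = rng.random(), 0.0
        for p, rhs in RULES:
            acc += p
            if r < acc:
                sentential[i:i + 1] = rhs
                break
    return "".join(sentential)

rng = random.Random(42)
for _ in range(1000):
    s = sample_arc(rng)
    assert s.count("a") == s.count("b")   # matched climbs and descents
```

The expected number of S symbols produced per rewrite is 0.25 + 0.20 + 0.20 = 0.65 < 1, so the branching process is subcritical and each derivation terminates with probability one.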
The rules can be shown to be strictly more expressive than a regular grammar using the following property: a self-embedding context free grammar cannot be represented by a Markov chain [13], where a context-free grammar is self-embedding if there exists a nonterminal A such that A ⇒* ηAβ with η, β ∈ (N ∪ T)^+. For the rules presented, the self-embedding property can be seen from the first production rule S → aSb.

4.3.3.2 m-Rectangular Tracklet

The natural construction of the rectangle can be expressed by the language L = {a^m b^n c^m d^n | m, n ≥ 1}, where m and n signify the length and width of the rectangle. Recall that m-rectangles (modified rectangles), on the other hand, are trajectories comprising three left turns (or three right turns), each of ninety degrees, but are not necessarily closed trajectories (if they were closed, they would coincide with rectangles). There are two reasons why we consider m-rectangles instead of rectangles. First, using the pumping lemma below, we show that the language comprising rectangles is not context free, so algorithms for extracting such trajectories are not of polynomial complexity. Second, from a modeling point of view, in order to recognize suspicious behaviour of a target moving around a building, m-rectangles are more robust since, unlike for a rectangle, the start and end points do not have to coincide. We start with the following pumping lemma, proved in [23].

Pumping Lemma Let L be a context free language. Then there exists a constant K such that if z is any string in L with |z| at least K, we can write z = uvwxy subject to the following conditions:

1. |vwx| ≤ K. That is, the middle portion is not too long.

2. vx ≠ ε. Since v and x are the pieces to be "pumped", this condition says that at least one of the strings we pump must not be empty.

3. For all i ≥ 0, uv^i wx^i y is in L. That is, the two strings v and x may be "pumped" any number of times, including 0, and the resulting string will still be a member of L.
Using the pumping lemma, we now show that the rectangle language is not context free. It is sufficient to show that a subset of the language, i.e. L = {a^n b^n c^n d^n | n ≥ 1}, is not context free.

Lemma 1 L = {a^n b^n c^n d^n | n ≥ 1} is not context free.

Proof Suppose L is a context free language. Let z = a^K b^K c^K d^K. The first condition dictates that vwx is a substring of a^K b^K or c^K d^K. Let vwx be a substring of a^K b^K; then c^K d^K is a substring of y, and vx contains only a's and b's. By the pumping lemma, uwy must be a string in the language; it contains K c's and K d's, but fewer than K a's or fewer than K b's, a contradiction. The same steps can be applied when vwx is a substring of c^K d^K. Hence L is not context free.

In order to keep the grammar context free, n can be kept finite, i.e. L = {a^n b^n c^n d^n | n = 1, . . . , N}. In this case, the language can be represented by both regular and context free grammars. It can be shown that the context free grammar constructed can be much more compact than its regular counterpart. However, such rigid structures are too restrictive, and the choice of N is unclear. Instead, we chose to implement a rectangular grammar without the restrictive equilateral constraint. The new language is L = {a^+ b^+ c^+ d^+}, and it can model any trajectory comprising four sides at right angles, not necessarily forming a closed curve. Even though the language does not represent rectangles in the conventional sense, it still enforces the rectangular shape through the rectangular template constraints. The m-rectangle grammar given in the previous subsection decomposes a rectangular pattern into a template consisting of more primitive patterns, namely four line patterns.

4.3.3.3 Well Posedness of the Model

Given the language and the probabilistic grammar, analysis has to be performed to determine if the grammar generates the language properly, i.e.
the probabilities of all the generated strings in the language sum to one. The requirement is essential as it ensures that the generation of the geometric patterns is stable: the derivation process is subcritical and terminates in finite time with finite length. This finiteness criterion provides a constraint on the SCFG model parameters, which may be used as a bound on the parameter values. We discuss this point by first defining the stochastic mean matrix.

Definition Let A, B ∈ N. The stochastic mean matrix M_N is a |N| × |N| square matrix whose (A, B)th entry is the expected number of variables B resulting from rewriting A:

M_N(A, B) = ∑_{η ∈ (N ∪ T)^* s.t. (A → η) ∈ P} P(A → η) n(B; η),

where P(A → η) is the probability of applying the production rule A → η, and n(B; η) is the number of instances of B in η [11]. The finiteness constraint is satisfied if the grammar satisfies the following theorem.

Theorem 4.3.1. If the spectral radius of M_N is less than one, the generation process of the stochastic context free grammar will terminate, and the derived sentence is finite.

Proof The proof can be found in [11].

4.4 Syntactic Tracking Algorithms

Given both the dynamic model and the geometric pattern model developed in Sec. 4.3, the problem of syntactic tracking is reduced to a parsing problem on the moving target's trajectory of estimated tracklets. This section aims to develop the tracklet estimator and the syntactic pattern estimator modules shown in Fig. 4.4. Because the tracklets only arrive as the process unfolds, the Earley-Stolcke parsing algorithm is chosen, as it can parse data from left to right recursively [40; 24]. The Earley-Stolcke algorithm is a top-down parser, different from the more common bottom-up parsers such as the CYK algorithm [17]. Sec. 4.4.1 discusses the implementation of the tracklet estimator that produces estimates of tracklet sequences, and Sec.
4.4.2 summarizes the implementation of the syntactic pattern estimator based on an extended version of the Earley-Stolcke parser.

4.4.1 Syntactic Enhanced Tracker

The tracklet estimator formulated in (4.3) can be implemented as any multiple mode Bayesian filter that recursively estimates the target's underlying mode, for example an extended Kalman filter with IMM or a multiple mode particle filter. An extended Kalman filter is required because of the nonlinearity in the observation model. The particle filter is a sequential Monte Carlo estimator based on a point mass representation of probability densities; instead of approximating the functional form of the observation model like the extended Kalman filter, the particle filter approximates the probability distribution function itself using a set of simulated samples and weights. It should be noted that the particle filter is suitable for problems where the dynamics are nonlinear or the noise that distorts the signal is non-Gaussian. In such cases, Kalman filter solutions may be far from optimal. However, because particle filters are very computationally intensive, they are to be applied only if the Kalman based methods do not produce satisfactory results [37]. In this paper, the extended Kalman filter with IMM is chosen to implement the numeric-symbol Bayesian filter, as it provides stable tracks of the ground moving vehicles.

4.4.1.1 Multiple Model Sequential Markov Chain Monte Carlo (particle filter)

A multiple mode particle filter is formulated using a hybrid state space representation. Let y_k = (x_k^T, a_k)^T, where x_k is the continuous valued kinematic state, a_k is the discrete valued IMM mode, and T denotes transpose. The posterior probability distribution of the state space is approximated by P(y_k | z_{1:k}) ≈ ∑_{i=1}^N w_k^i δ(y_k − y_k^i). The random measure {y_k^i, w_k^i}_{i=1}^N comprises the particles and their associated weights that characterize the posterior distribution, and N is the number of particles.
The multiple mode particle filter algorithm consists of three steps [37]: 1. sampling of the IMM mode transitions, 2. sampling of the mode conditioned kinematic state, and 3. resampling to avoid degeneracy. The three steps are described in the following paragraphs.

The sampling of the IMM mode is implemented with the inverse transform method. Given the set of IMM modes at time k − 1, {a_{k−1}^i}_{i=1}^N, the modes {a_k^i}_{i=1}^N are generated based on the transition matrix of the IMM modes, which is modeled as a Markov chain. Let C_{a_{k−1}}(a) = ∑_{j=1}^{a} π_{ij} be the cumulative distribution of the IMM mode a given a_{k−1} = i. The algorithm to generate the modes is shown below:

for i = 1 to N do
  Sample u ∼ U[0, 1]
  m ← 1
  while C_{a_{k−1}^i}(m) < u do
    m ← m + 1
  end while
  a_k^i ← m
end for

The sampling of the mode conditioned kinematic state involves sampling from the transition probability and calculating the associated weight. The optimal importance density is P(x_k | x_{k−1}^i, a_k^i, z_k) given the IMM mode sampled in step 1, yet the most popular and simpler importance function is P(x_k | x_{k−1}^i, a_k^i). The un-normalized weight of each sampled particle is updated by the following equation:

w̃_k^i = w_{k−1}^i · P(z_k | x_k^i, a_k^i) P(x_k^i | x_{k−1}^i, a_k^i) / q(x_k^i | x_{k−1}^i, a_k^i, z_k),

where q(·) is the importance density. Using the simplified importance density, the update reduces to

w̃_k^i = w_{k−1}^i P(z_k | x_k^i, a_k^i).

The normalized weight is then w_k^i = w̃_k^i / ∑_{i=1}^N w̃_k^i. The resampling step maps the random measure {x_k^i, w_k^i} to {x_k^{′i}, 1/N} with uniform weights. The resampled particles {x_k^{′i}}_{i=1}^N are generated by resampling with replacement N times from the random measure {x_k^i, w_k^i}_{i=1}^N. Resampling is necessary if the effective sample size falls below a threshold, where the effective sample size is computed as

N̂_eff = 1 / ∑_{i=1}^N (w_k^i)^2.
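The effective-sample-size test and resampling step above can be sketched as follows; the threshold value and the multinomial resampling scheme are assumptions (systematic resampling is a common alternative):

```python
import numpy as np

def effective_sample_size(w):
    """N_eff = 1 / sum_i (w_i)^2 for normalized weights w."""
    return 1.0 / np.sum(np.square(w))

def resample_if_needed(particles, w, threshold, rng):
    """Multinomial resampling with replacement when N_eff drops below
    the threshold; returns particles with uniform weights 1/N."""
    n = len(w)
    if effective_sample_size(w) >= threshold:
        return particles, w
    idx = rng.choice(n, size=n, p=w)
    return particles[idx], np.full(n, 1.0 / n)

rng = np.random.default_rng(0)
particles = rng.standard_normal((4, 2))          # N = 4 toy particles
w = np.array([0.85, 0.05, 0.05, 0.05])           # nearly degenerate
particles, w = resample_if_needed(particles, w, threshold=2.0, rng=rng)
```

For the nearly degenerate weight vector above, N_eff ≈ 1.37 < 2, so the particles are resampled and the weights are reset to 1/N.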
If resampling is not performed, the degeneracy problem occurs: after a number of recursive steps, all but one particle have negligible normalized weights.

4.4.1.2 Extended Kalman Filter with IMM

Because Eq. (4.5) is highly nonlinear, an extended Kalman filter is needed to process the observations. In order to minimize the linearization required, the measurement model is converted to a preferred form:

z′_k = h′(x_k) + w′_k,    (4.6)

where

h′(x_k) = [ r_k sin θ_k ]   [ x_k ]
          [ r_k cos θ_k ] = [ y_k ]
          [ ṙ_k         ]   [ ṙ_k ]

and w′_k ∼ N(0, R′) is the measurement noise in the converted model. The converted covariance matrix is

R′ = [ σ_x²   σ_xy   0          ]
     [ σ_yx   σ_y²   0          ]
     [ 0      0      σ_{ṙ_k}²  ],

whose elements are

σ_x²  = r_k² σ_{θ_k}² cos² θ_k + σ_{r_k}² sin² θ_k
σ_xy = (σ_{r_k}² − r_k² σ_{θ_k}²) sin θ_k cos θ_k
σ_y²  = r_k² σ_{θ_k}² sin² θ_k + σ_{r_k}² cos² θ_k.

In order to run the extended Kalman filter, the Jacobian of the converted measurement function is needed:

∇_{x_k} h′(x_k) = [ 1             0             0             0           ]
                  [ 0             1             0             0           ]
                  [ ∂h′[3]/∂x_k   ∂h′[3]/∂y_k   ∂h′[3]/∂ẋ_k   ∂h′[3]/∂ẏ_k ],

which can be computed by differentiation. As will be shown in Sec. 4.4, the terminal probability w_k^j = P(a_k = j | z_{1:k}) models the input uncertainty for the parsing process, and the position estimate x̂_{k|k} is stored in the low and high marks of the Earley state for enforcing consistency of the tracks.
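The converted covariance R′ can be computed directly from the range and azimuth standard deviations; a minimal sketch of the expressions above (illustrative function and parameter names):

```python
import numpy as np

def converted_covariance(r, theta, sigma_r, sigma_theta, sigma_rdot):
    """Covariance R' of the polar-to-Cartesian converted measurement (range r,
    azimuth theta), following the sigma_x^2, sigma_xy, sigma_y^2 formulas above."""
    s, c = np.sin(theta), np.cos(theta)
    sx2 = (r * sigma_theta * c) ** 2 + (sigma_r * s) ** 2
    sy2 = (r * sigma_theta * s) ** 2 + (sigma_r * c) ** 2
    sxy = (sigma_r ** 2 - (r * sigma_theta) ** 2) * s * c
    return np.array([[sx2, sxy, 0.0],
                     [sxy, sy2, 0.0],
                     [0.0, 0.0, sigma_rdot ** 2]])
```

At θ = 0 the cross term vanishes and the x and y variances reduce to r²σ_θ² and σ_r², a quick sanity check on the conversion.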
According to the kinematic model, the two quantities are computed with the interacting multiple model (IMM) algorithm [27], which is summarized here (with 8 IMM modes):

• Calculation of the mixing probabilities:

u_{k−1}^{i|j} = P(a_{k−1} = i | a_k = j, z_{1:k−1}) = (1/c̄_j) P(a_k = j | a_{k−1} = i) P(a_{k−1} = i | z_{1:k−1}),

where c̄_j is a normalization constant.

• Mixing:

x̂_{k−1|k−1}^{0j} = ∑_{i=1}^8 u_{k−1}^{i|j} x̂_{k−1|k−1}^i

P_{k−1|k−1}^{0j} = ∑_{i=1}^8 u_{k−1}^{i|j} [ P_{k−1|k−1}^i + (x̂_{k−1|k−1}^i − x̂_{k−1|k−1}^{0j})(x̂_{k−1|k−1}^i − x̂_{k−1|k−1}^{0j})^T ]

• Model-matched filtering:

Λ_k^j = p(z_k | z_{1:k−1}, a_k = j)

• Mode probability update:

w_k^j = Λ_k^j ∑_{i=1}^8 π_{ij} w_{k−1}^i / ∑_{j=1}^8 Λ_k^j ∑_{i=1}^8 π_{ij} w_{k−1}^i

• Estimate and covariance combination:

x̂_{k|k} = ∑_{j=1}^8 w_k^j x̂_{k|k}^j

P_{k|k} = ∑_{j=1}^8 w_k^j [ P_{k|k}^j + (x̂_{k|k}^j − x̂_{k|k})(x̂_{k|k}^j − x̂_{k|k})^T ]

Recall that the purpose of the numeric-symbol Bayesian filter is to map the GMTI STAP measurements to a tracklet. Since the tracklet is defined by the IMM mode a_k, given the IMM filter output, the output of the filter can be either w_k^j for all j or arg max_j w_k^j, which calls for soft and hard parsing respectively. Although both are implemented, soft parsing is described in the paper as it better models the uncertainty involved in the trajectory.

4.4.2 Extended Earley Stolcke Parsing of Target Trajectory

We are now ready to describe our syntactic signal processing algorithms on top of the GMTI tracking algorithm described above. In a formal language context this is called stochastic parsing, since we are parsing the language generated by the trajectory of the target. The Earley Stolcke parser is a top-down parser [40; 18] that builds the parse tree that best describes a sequence of terminals, namely the modes of the directional noise of the dynamic system. This section describes the Earley Stolcke parser and the extensions needed to implement the syntactic pattern estimator (4.4) with GMTI measurements.
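The mode probability update and the estimate combination, the two IMM steps that feed the parser, can be sketched as follows (illustrative Python; `pi` is the mode transition matrix and `lam` the vector of mode-matched likelihoods Λ_k^j):

```python
import numpy as np

def imm_mode_update(w_prev, pi, lam):
    """IMM mode probability update: w_k^j proportional to Lambda_k^j * sum_i pi_ij w_{k-1}^i.
    w_prev: (M,) prior mode probabilities; pi: (M, M) with pi[i, j] = P(a_k=j | a_{k-1}=i);
    lam: (M,) mode-matched likelihoods."""
    predicted = w_prev @ pi          # sum_i pi_ij w_{k-1}^i, for each j
    unnorm = lam * predicted
    return unnorm / unnorm.sum()

def imm_combine(w, x_hats):
    """Combined estimate x_hat_{k|k} = sum_j w_k^j x_hat_{k|k}^j."""
    return np.einsum('j,jd->d', w, np.asarray(x_hats))
```

With equal likelihoods the update leaves a stationary mode distribution unchanged; a larger Λ_k^j pulls probability mass toward mode j.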
The extensions provide four functionalities: 1) modeling the uncertainties of the GMTI detection, 2) data association of concurrent detections, 3) track initiation of target trajectories, and 4) trading off the completeness of the tracks against computational resources. In other words, the parsing algorithm implements the basic functionalities of track maintenance in a multiple target tracking setting. In order to integrate the parser with GMTI tracking capability, two changes are made: a modification of the parser state and the addition of production rules for false alarm rejection. The extensions are largely based on those described in [24], but altered to fit the specific case of GMTI.

The control structure the parser uses to store incomplete parse trees is the state, written as

i : X_k → λ.Yu [l, h, α, γ].

Here X and Y are nonterminals, λ and u are substrings of nonterminals and terminals, "." is the marker that specifies the end position of the partially parsed input, that position is indexed by i, and k is the starting index of the substring that is generated by the nonterminal X. Fig. 4.5 illustrates an example state i : X_k → A.B, where A and B are nonterminals; the indices k and i specify the beginning and the end, respectively, of the substring which the nonterminal X can "explain" so far, and the marker "." specifies the part of X's production rule that has been applied to explain the substring.

Figure 4.5: The pictorial representation of the state i : X_k → A.B, assuming the nonterminal A has a production rule of the form A → a_k … t_i.

With the marker in front of the nonterminal B, B has not yet been applied, and the state is still incomplete. α is called the forward probability; it is the sum of probabilities of all incomplete parse trees containing a_1, …, a_i. γ is called the inner probability; it is the sum of probabilities of all incomplete parse trees containing a_k, …, t_i.
l is the kinematic state of the moving target at the beginning of the track, and h is the most recently estimated kinematic state. Based on this definition, let d be the Euclidean distance between two kinematic states; a similarity function f(d) is introduced to measure the consistency of the kinematic states, and it provides valuable information for the implementation of data association and the rejection of false detections. Many models may be applied to exploit the spatial correlation [5]; in this paper the power exponential function f(d) = exp(−(d/θ_1)^{θ_2}) is applied, where θ_1 > 0 and θ_2 ∈ (0, 2] are determined experimentally.

The production rules, on the other hand, are modified to model false detections. For every production rule that involves the generation of terminals, a nonterminal F is added. For example, the rule L → lL is modified to L → lL | FL, with F → a|b|c|d|e|f|g|h. The rule simply states that if the terminal at the location is a false detection, it is replaced with F and the parsing carries on as before.

To illustrate the parsing procedure, a simple example of parsing the short input string "bb" is provided. To initialize the parsing process, the dummy state 0 : 0 → .S [l_c, h_c, 1, 1] is inserted, where l_c and h_c are the kinematic states of the target extracted from the GMTI detection; the dummy state simply says that at index position 0, the start symbol is applicable to parse the input string. With the dummy state in place, the parser builds the parse tree by iterating through three operations: prediction, scanning, and completion. The operations are applied sequentially, and each operation works on the set of states produced by the previous operation. The parsing of the string "bb" is illustrated in Table 4.1.
Given a set of states (or just the initial dummy state at index 0), the prediction operation searches for states whose index marker has a nonterminal to its right; those nonterminals, together with their production rules, are used to generate a set of predicted states. From the predicted states, the scanning operation looks for states whose index marker has a terminal to its right. For those states whose terminal matches the input string at the indexed position, the index marker is advanced by one position, producing a scanned state. Lastly, from the set of scanned states, the completion operation looks for states whose index marker is at the end of the production rule; if any are found, the states that generated those scanned states have their markers advanced by one position.

Table 4.1: The Earley Stolcke parser parsing the terminal string "bb" with the simplified grammar specified in Sec. 4.3.2; only the production rules associated with the nonterminal Line are included.

  Index 0:              Index 1 (terminal b):   Index 2 (terminal b):
  0 : 0 → .S            Scanning                Scanning
  Prediction            1 : Lb_0 → b.Lb         2 : Lb_1 → b.Lb
  0 : S_0 → .La         1 : Lb_0 → b.           2 : Lb_1 → b.
  0 : S_0 → .Lh         Completion              Completion
  0 : La_0 → .aLa       1 : S_0 → Lb.           2 : Lb_0 → bLb.
  0 : La_0 → .a         Prediction              2 : S_0 → Lb.
  ...                   1 : Lb_1 → .bLb
  0 : Lh_0 → .hLh       1 : Lb_1 → .b
  0 : Lh_0 → .h

The details of the three operations are discussed in turn.

4.4.2.1 Prediction

The prediction operator adds states that are applicable to explain the unparsed input string. For all states of the form i : X_k → λ.Yu [l, h, α, γ], where λ and u may be empty and Y is the nonterminal that could possibly generate the next terminal in the input string, the operator adds Y's production rules, i : Y_i → .v [l, h, α′, γ′], as predicted states.
The α and γ are updated according to

α′ = ∑_{λ,u} α(i : X_k → λ.Zu) R_L(Z, Y) P(Y → v)

and

γ′ = P(Y → v),

where R_L is the reflexive transitive closure of the left-corner relation; it accounts for the probability of unbounded left recursion in the productions. (The details of the relation are omitted as they have little significance in this paper; interested readers can refer to [40].) The new predicted state inherits the kinematic states because it explains the same portion of the target track.

The pruning capability of the parser can be implemented by discarding a predicted state if its forward probability is lower than a threshold. The value of the threshold is a system parameter that balances system loading against track completeness. In addition, the prediction stage is a good place for a modification so that the parser can capture the unknown beginning of a vehicle's motion trajectory. At each time instant when the prediction operation is run, a dummy state of the form k : k → .S (for any k) can be inserted if there are GMTI detections that cannot be associated with any partial parse tree (the data association is implemented next, in the scanning operator). With this dummy state, the parser is not limited to capturing patterns that start at time instant 0, and it allows the prediction operator to play the role of track initiation in track maintenance.

4.4.2.2 Scanning

The scanning operator matches the terminals of the input string against the states generated by the prediction operator. For all states of the form i : X_k → λ.au [l, h, α, γ], where λ and u can be empty, the state i+1 : X_k → λa.u [l, x_a, α′, γ′] is added if the terminal at index i + 1 of the input string is a. Here x_a is the kinematic state of the terminal a estimated by the Bayesian filter; the scanned detection is treated as the false-detection nonterminal F if f(|h − x_a|) is below a threshold, and the state is kept as is otherwise.
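The scanning rule just described — advance the marker over a matching terminal, record the new high mark, and gate on the power exponential similarity — can be sketched as follows (a hypothetical Python rendering; the state fields, gate threshold, and θ values are illustrative, not those of the thesis implementation):

```python
import math
from collections import namedtuple

# Minimal stand-in for the extended Earley state  i : X_k -> lambda . a u  [l, h, alpha, gamma]
State = namedtuple('State', 'i k lhs rhs dot l h alpha gamma')

def similarity(d, theta1=50.0, theta2=2.0):
    """Power exponential similarity f(d) = exp(-(d/theta1)^theta2)."""
    return math.exp(-((d / theta1) ** theta2))

def scan(state, terminal, p_terminal, x_a, gate=0.5):
    """Advance the dot over `terminal`, store the new high mark x_a, and scale
    alpha and gamma by the input probability P(a). Returns (new_state, is_false),
    where is_false flags a detection whose kinematic state is inconsistent with
    the track and should be handled via the false-detection nonterminal F."""
    if state.dot >= len(state.rhs) or state.rhs[state.dot] != terminal:
        return None                       # dot must face the scanned terminal
    is_false = similarity(math.dist(state.h, x_a)) < gate
    new_state = state._replace(i=state.i + 1, dot=state.dot + 1, h=x_a,
                               alpha=state.alpha * p_terminal,
                               gamma=state.gamma * p_terminal)
    return new_state, is_false
```

The multiplication by `p_terminal` is what makes the parsing "soft": the IMM mode probability, rather than a hard mode decision, weights every scanned state.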
More specifically, the similarity function is combined with the false detection nonterminal to implement a nearest-neighbour data association filter. More complicated algorithms could be implemented, but they are not pursued here. Moreover, the input uncertainty P(x_a) models the GMTI detection probability, as described in Appendix C. The α and γ are updated according to

α′ = α(i : X_k → λ.au) P(a)

and

γ′ = γ(i : X_k → λ.au) P(a),

where P(a) is the probability of the input. Note that by including P(a) in the updates of α and γ, the parsing process takes the input uncertainty into account.

4.4.2.3 Completion

The completion operator advances the marker position of pending predicted states once their derived states completely match the input string. The scanned states whose marker is at the end of their rule have the form

i : Y_j → v. [l_2, h_2, α, γ],

and each of them has a corresponding pending predicted state of the form

j : X_k → λ.Yu [l_1, h_1, α, γ].

The purpose of the completion operator is to find those pending prediction states and advance their markers. It is important to notice the relationship between the indices of the scanned state and the pending prediction state: the indices of the pending prediction state say that the nonterminal Y is applied at position j, and the indices of the scanned state say that the nonterminal Y matches the substring from index j to i. As a result, the two states generate the complete state

i : X_k → λY.u [l_1, h_2, α′, γ′],

which means the pending prediction state can now explain the substring from index k to i. The associated α and γ probabilities are updated according to

α′ = f(h_1, h_2) ∑_v α(j : X_k → λ.Zu) R_U(Z, Y) γ(i : Y_j → v.)

and

γ′ = f(h_1, h_2) ∑_v γ(j : X_k → λ.Zu) R_U(Z, Y) γ(i : Y_j → v.)
respectively, where R_U is the reflexive transitive closure of the unit-production relation; it accounts for the infinite summation arising from cyclic completions (interested readers can refer to [40] for more detail). The similarity function here models the consistency between the pending prediction state and the completed state. If the likelihood probability of a completed state is lower than a threshold, the state is pruned, trading track completeness for reduced computation; this implements the track termination functionality of track maintenance.

The parsing algorithm can be further extended to incorporate more intelligence. For example, selection logic may be added to the prediction operator so that, instead of adding all probable states, it adds only those whose production rules yield terminal symbols compatible with the input string. In other words, instead of purely top-down parsing, bottom-up information could be incorporated to speed up the parsing algorithm.

4.5 Experimental Setup and Results

The numerical studies in this section demonstrate how stochastic parsing with target tracking can discern geometric patterns in real GMTI data collected by DRDC. Sec. 4.5.1 describes the experimental setup and the data model, Sec. 4.5.2 discusses the pre-processing required to transform measurements between various coordinate systems, Sec. 4.5.3 summarizes the numerical results, and Sec. 4.5.4 shows the effect of feeding the higher level geometric pattern information back to tracking.

Figure 4.6: A SAR image of the location of the experiment captured by the DRDC XWEAR system.

4.5.1 Experimental Setup

The GMTI data is collected using DRDC Ottawa's X-band Wideband Experimental Airborne Radar (XWEAR) [16; 15]. It is a reflector-antenna based multi-function radar designed to collect coherent radar echo data in various modes for wide-area search and imaging.
The introduction of a multimode feed (i.e., one capable of carrying two electromagnetic modes) enables a two-channel GMTI capability [2]. The XWEAR radar is used to collect data for investigations into wideband synthetic aperture radar (SAR), inverse SAR (ISAR), maritime surveillance, and GMTI. The XWEAR data collection modes include search modes in which the antenna rotates, stripmap SAR, spotlight SAR imaging modes, and a wide-area surveillance GMTI mode.

The navigation subsystem consists of an inertial measurement unit (IMU) mounted near the antenna phase centre (APC) and an embedded global positioning/inertial navigation system (EGI) mounted near the centre of gravity of the aircraft. In order to collect coherent radar echoes, the radar data must be compensated for undesirable APC motion (changes in aircraft ground speed, deviations from the ideal flight path) that introduces pulse-to-pulse errors. The IMU provides high-rate (200 Hz) measurements of velocity and angular increments. The strapdown navigator algorithms process these and yield estimates of APC position and velocity, and of antenna orientation. The EGI blends its own inertial data with GPS data using an internal Kalman filter, and the resulting accuracies in position and velocity are about 2 m and 0.03 m/s, respectively.

Table 4.2: Radar parameters of the DRDC XWEAR system used in data collection.

  Pulse length        5 µs
  PRF                 1-2 kHz
  Carrier frequency   9.75 GHz
  Polarization        Transmit and receive horizontal
  Antenna             1 m width; 2.5° azimuth and 4° elevation beamwidth

This is used in an external Kalman filter to give long-term stability to the strapdown navigation solution from the IMU. The phase corrections are then applied relative to a reference trajectory, so that the resulting data is coherent.

In the flight trials, the radar was installed and flown on a Convair 580 aircraft. The data was collected over western Ottawa. A SAR image of the scene is shown in Figure 4.6.
The aircraft was moving at about 200 knots (roughly 100 m/s), with aircraft positions recorded as discussed above. The ground moving target is a truck moving in trajectories that form various geometric patterns; the GPS data of the movers was also recorded. The antenna was pointed at a fixed point on the ground, and the target always had non-zero radial velocity so that it could be observed continuously by STAP-based GMTI techniques. The elevation angle is neglected as it does not provide any additional information: in the GMTI case the target is moving on a known plane, so if the pointing angle and range resolution are known, a particular range bin is equivalent to an elevation angle of the target.

4.5.2 Pre-Processing of Experiment Data

The tracking algorithms developed are defined in the tangential plane Cartesian coordinate system. However, the ownship coordinates, provided by the global positioning system on board the aircraft, are given in the geodetic coordinate system, and the GMTI STAP measurements, which include range, Doppler, and azimuth angle, are collected in local spherical coordinates. As a result, in order to apply the tracking algorithms developed, pre-processing of the various measurements by transformation between these coordinate systems is necessary. By convention, the transformation between these different coordinate systems is achieved through the Earth-centered Earth-fixed (ECEF) coordinates [6], and the relationship between the ECEF and the tangential plane Cartesian coordinates is illustrated in Fig. 4.7. In this paper, the numerical study requires the pre-processing of both the ownship and the GMTI STAP measurements. We begin by describing the steps taken to process the ownship measurements. The first step is to convert the ownship coordinates in latitude and longitude to ECEF coordinates, and a standard algorithm exists for this conversion [4].
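For reference, the standard latitude/longitude-to-ECEF conversion can be sketched with the usual WGS-84 constants (a minimal sketch, not the thesis code; the constants are standard values, not taken from the text):

```python
import math

# WGS-84 reference ellipsoid constants
A = 6378137.0                  # semi-major axis [m]
E2 = 6.69437999014e-3          # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, height=0.0):
    """Convert geodetic (latitude, longitude, height) to ECEF coordinates [m]."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)   # prime vertical radius
    x = (n + height) * math.cos(lat) * math.cos(lon)
    y = (n + height) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - E2) + height) * math.sin(lat)
    return (x, y, z)
```

On the equator at zero longitude the result is simply (a, 0, 0), the semi-major axis along the x-axis, which is a convenient sanity check.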
The second step is to convert from ECEF to the tangential plane Cartesian coordinates, which involves first finding the axes of the Cartesian frame. The first measurement point is used as the origin, and the x-axis and y-axis are computed from the second measurement point and the origin.

Figure 4.7: This figure illustrates the relationship between the Earth-Centered Earth-Fixed (ECEF) coordinate system and the target coordinate system.

The z-axis is the vector from the first measurement to sea level. Given the x, y, and z axes, the final step is the transformation of all measurements into the Cartesian frame, computed by taking dot products of each measurement with the axes. A similar procedure is applied to the pre-processing of the GMTI measurements, with the extra step of converting the local spherical coordinates to the tangential plane Cartesian coordinates.

4.5.3 Numerical Studies of Syntactic Tracking

The numerical results for two geometric patterns are illustrated: an arc pattern in a pincer scenario, and a loitering pattern. It should be pointed out that the measurement data was augmented with GPS information from the airborne platform and the ground moving vehicle. This was because the application of STAP to the raw data did not yield measurements with sufficient accuracy, especially in angle, for the entire duration of the measurement. The inaccuracy is due to the many challenges of obtaining adequate clutter cancellation and consistently good parameter estimation of ground movers in a highly heterogeneous urban environment. The measurement data is nevertheless useful because it illustrates the meta-level tracking concepts. Furthermore, the data still required some changes in the dynamical model and data processing, such as incorporating platform motion. Numerical studies were carried out with both the particle filter and the IMM/extended Kalman filter.
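The final projection step — dot products of each offset measurement with the tangential-plane axes — can be sketched as follows (illustrative Python, assuming the three axis vectors have already been constructed as orthonormal ECEF unit vectors):

```python
import numpy as np

def to_tangent_plane(points_ecef, origin, axes):
    """Express ECEF points in the local tangential-plane frame by taking the dot
    product of each offset (point - origin) with the x, y, z axis unit vectors."""
    p = np.atleast_2d(points_ecef) - np.asarray(origin)
    return p @ np.column_stack(axes)     # columns are the unit axis vectors
```

With orthonormal axes this is just a translation followed by a rotation, so distances between measurements are preserved by the transformation.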
Since the results are very similar, only the results of the IMM/extended Kalman filter are shown. The tracking result based on a run of the DRDC flight trials is illustrated in Fig. 4.8. The solid line in the left panel is the real GMTI track, and the dotted line is the output of the IMM/extended Kalman filter. Notice that the tracker performs quite well even during the turns of the truck trajectory. An intuitive explanation for this performance is the constraint imposed by the IMM mode a_k: since the mode constrains the noise term, and thus the uncertainty of the state estimates, an accurate estimate of the mode yields a better estimate of the track, even at the turns.

The purpose of the IMM/extended Kalman filter is to generate the terminals for geometric pattern parsing, and as described in Sec. 4.3.1, the set of terminals corresponds to the set of IMM modes. The right panel of Fig. 4.8 shows the estimated IMM modes; only four modes are shown in this case for ease of display. The parsing of the IMM modes for geometric pattern identification can be either soft or hard (as in soft or hard decision making): hard parsing parses the estimated IMM modes directly, while soft parsing parses the probabilities of the IMM modes. We focus mainly on soft parsing; the numerical results for the arc and the square patterns are shown next.

Fig. 4.9 illustrates the likelihood probabilities of the different geometric patterns as an arc is parsed, together with the estimated tracklet sequence and the most likely parse tree. The parsing algorithm initially classifies the trajectory as a line, but as more data arrives it correctly identifies it as an arc. Fig. 4.10 shows both arcs of the pincer trajectory. The detection data arrives not as two independent tracks, but as an out-of-order interleaved sequence. The parsing algorithm performs the data association as described in Sec.
4.4.2, and parses the two arcs separately. Once the arcs are recognized, the identification of the pincer can easily be achieved with a simple logic operator. Recall that an arc is a palindrome, and it is important to identify an arc irrespective of its dimension and orientation.

Fig. 4.11 illustrates the likelihood probabilities of the different geometric patterns as an m-rectangle is parsed. A much longer track is used in this study to demonstrate the practicality of the algorithm; the parse tree is omitted due to its large size. As can be seen from the top panel of the figure, the correct geometric pattern maintains its high probability while the probabilities of the other patterns drop because the input sequence does not support them. Other patterns, such as the vertical line and the clockwise m-rectangle, also have high probabilities initially because the initial segment of the input terminal string matches their syntactic structure; as more terminals are parsed, their probabilities drop.

4.5.4 Performance of Syntactic Enhanced Tracker

The above parsing results demonstrate how SCFG signal processing can estimate geometric patterns of the target trajectories. A natural question is: can the syntactic tracker estimates be fed back to the standard tracking algorithm to improve performance? For example, if the syntactic tracker estimates that the target is moving in an arc, this information should be useful to the lower level tracking algorithm.

Figure 4.8: The output of the IMM/Extended Kalman filter. The result of the particle filter is not shown because it is very similar.
The top panel illustrates the real trajectory of the truck and the track developed by the filter, and the bottom panel shows the estimated IMM modes. The set of IMM modes corresponds to the set of terminals that is parsed by the algorithm for the identification of the geometric pattern.

Figure 4.9: The likelihood probabilities of the different geometric patterns as the input sequence of IMM modes corresponding to an arc is parsed.

Figure 4.10: The trajectories of a pincer operation (top arc and bottom arc).

Figure 4.11: The numerical result of parsing an m-rectangle pattern. The log likelihood probabilities of the different geometric patterns are shown in the top panel; the trajectory and its corresponding track are shown in the bottom left panel, and the estimated IMM modes are shown in the bottom right panel.

We used the syntactic parser of Sec. 4.4.2 and fed the estimates back to the IMM tracker described in Sec. 4.4.1, where the IMM mode probability is computed as the weighted sum of the IMM mode estimate and the SCFG parser estimate. More specifically, the IMM mode estimate is the probability P(a_k | z_{1:k}), and the SCFG parser estimate is the probability P(a_k | a_{1:k−1}). The probability of the SCFG parser estimate is computed from the outputs of the Earley Stolcke parser, i.e.
the prediction states generated at each time instant; the details can be found in [20; 26]. Since the IMM and the SCFG offer complementary information about the mode, we mix the two estimates equally for each mode:

P(a_k | z_{1:k}, a_{1:k−1}) = 0.5 P(a_k | z_{1:k}) + 0.5 P(a_k | a_{1:k−1}),

where P(a_k | z_{1:k}, a_{1:k−1}) denotes the mixed mode estimate.

Fig. 4.12 demonstrates the reduction in estimator covariance with the knowledge of the extracted geometric pattern; the solid line shows the covariance of the tracker as the target moves in a square, and the dotted line is the covariance of the assisted tracker.

Figure 4.12: The covariance reduction from feeding back the meta level description to the Bayesian tracking module.

The jumps in covariance correspond to the times when the target makes sharp turns; knowledge of the geometric pattern of the target trajectory allows the tracker to make better predictions of the turns, and thus reduces the covariance. Another observation from the numerical results is that parse trees can be pruned as their probabilities drop below a certain threshold: if the input terminal sequence does not support the syntactic rules of a motion pattern, the parse tree corresponding to that pattern can be pruned completely, greatly reducing the computational complexity and the storage requirement.

4.6 Conclusion

We model the ground vehicles under GMTI surveillance as string generating devices whose motion trajectories are annotated by an SCFG. The parse tree of a trajectory characterizes its geometric pattern, which provides valuable knowledge regarding the ground moving targets' intents.
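The equal-weight mixing above amounts to a single line (illustrative Python; the renormalization is a safeguard for approximately normalized inputs, not part of the formula in the text):

```python
import numpy as np

def mixed_mode_estimate(p_imm, p_scfg, w=0.5):
    """Mix the IMM estimate P(a_k|z_{1:k}) and the SCFG parser estimate
    P(a_k|a_{1:k-1}); w = 0.5 reproduces the equal mixing used above."""
    p = w * np.asarray(p_imm) + (1.0 - w) * np.asarray(p_scfg)
    return p / p.sum()
```

When the two sources disagree, the mixture pulls the mode probabilities toward the parser's prediction, which is what damps the covariance jumps at the turns.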
The parsing of the motion trajectories is implemented with the Earley Stolcke parsing algorithm, whose control structure is extended with either the particle filter or the IMM/extended Kalman filter to deal with the GMTI data. The meta level understanding of the trajectories is developed as GMTI data is collected and parsed, which provides near real time syntactic information about the targets' trajectories for ground surveillance. Lastly, the parsing algorithm and the Bayesian filters are implemented in C++ for the numerical studies, and real GMTI data is collected using DRDC Ottawa's XWEAR radar. The particular numerical scenarios studied are the identification of a pincer and of a square.

The current paper focuses mainly on the identification of geometric patterns, and future work will enhance its capabilities in two areas: refinement of the syntactic tracking algorithm, and enhancement of the knowledge management techniques. In terms of syntactic tracking, in addition to relying on the knowledge captured in the grammar, other intent inference methods such as Bayesian networks may be utilized for a complete battlefield awareness solution. Moreover, in addition to bottom-up data fusion, a feedback loop may be added to supply meta level knowledge, such as the identified geometric pattern, to the lower level tracking and radar management components; for example, the sampling rate and the signal-to-noise ratio threshold of the GMTI's SAR/GMTI model may be controlled for better detection performance. The knowledge management techniques, on the other hand, focus more on the collection and maintenance of the meta level knowledge; one area of interest is the application of ontologies to facilitate the codification and sharing of expert knowledge.

Bibliography

[1] A. V. Aho and J. D. Ullman. The Theory of Parsing, Translation and Compiling, volume I: Parsing. Prentice-Hall, Englewood Cliffs, NJ, 1972.

[2] B. Balaji and A.
Damini. Multimode adaptive signal processing: a new approach to GMTI radar. IEEE Transactions on Aerospace and Electronic Systems, 42(3):1121-1126, 2006.

[3] Y. Bar-Shalom and X. Li. Estimation and Tracking: Principles, Techniques, and Software. Artech House, 1993.

[4] Y. Bar-Shalom, X. R. Li, and T. Kirubarajan. Estimation with Applications to Tracking and Navigation. Wiley-Interscience, 2001.

[5] J. O. Berger, V. De Oliveira, and B. Sanso. Objective Bayesian analysis of spatially correlated data. Journal of the American Statistical Association, 96(456):1361-1374, 2001.

[6] S. S. Blackman and R. Popoli. Design and Analysis of Modern Tracking Systems. Artech House, 1999.

[7] L. E. Brennan and I. S. Reed. Theory of adaptive radar. IEEE Transactions on Aerospace and Electronic Systems, 9(2):237-252, 1973.

[8] M. J. Carlotto. MTI data clustering and formation recognition. IEEE Transactions on Aerospace and Electronic Systems, 37(2):524-537, 2001.

[9] E. Charniak. Statistical Language Learning. MIT Press, 1993.

[10] E. Charniak and R. Goldman. A Bayesian model of plan recognition. Artificial Intelligence, 64:53-79, 1993.

[11] Z. Chi. Statistical properties of probabilistic context-free grammars. Computational Linguistics, 25:131-160, 1999.

[12] N. Chomsky. Three models for the description of language. IRE Transactions on Information Theory, 2(3):113-124, 1956.

[13] N. Chomsky. On certain formal properties of grammars. Information and Control, 2(2):137-167, 1959.

[14] J. Coleman. Introducing Speech and Language Processing. Cambridge University Press, 2005.

[15] A. Damini, B. Balaji, G. Haslam, and M. Goulding. X-band wideband experimental airborne radar phase II: synthetic aperture radar and ground moving target indication. In IEE Proceedings of Radar, Sonar and Navigation, volume 153, pages 144-151, 2006.

[16] A. Damini, M. McDonald, and G. Haslam. X-band wideband experimental airborne radar for SAR, GMTI and maritime surveillance. In IEE Proceedings of Radar, Sonar and Navigation, volume 150, pages 305-312, 2003.

[17] R. Durbin, S. Eddy, A.
Krogh, and G. Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.

[18] J. Earley. An efficient context-free parsing algorithm. Communications of the ACM, 13(2):94-102, 1970.

[19] S. Fine, Y. Singer, and N. Tishby. The hierarchical hidden Markov model: Analysis and applications. Machine Learning, 32:41-62, 1998.

[20] F. Jelinek and J. D. Lafferty. Computation of the probability of initial substring generation by stochastic context-free grammars. Computational Linguistics, 17:315-323, 1991.

[21] K. S. Fu. Syntactic Pattern Recognition and Applications. Prentice-Hall, Englewood Cliffs, NJ, 1982.

[22] J. E. Hopcroft, R. Motwani, and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, MA, second edition, 2001.

[23] J. E. Hopcroft, R. Motwani, and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Pearson Education, third edition, 2007.

[24] Y. A. Ivanov and A. F. Bobick. Recognition of visual activities and interactions by stochastic parsing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:852-872, 2000.

[25] F. Jelinek. Statistical Methods for Speech Recognition. MIT Press, 1997.

[26] D. Jurafsky, C. Wooters, J. Segal, A. Stolcke, E. Fosler, G. Tajchman, and N. Morgan. Using a stochastic context-free grammar as a language model for speech recognition. In ICASSP, 1995.

[27] T. Kirubarajan, Y. Bar-Shalom, K. R. Pattipati, and I. Kadar. Ground target tracking with variable structure IMM estimator. IEEE Transactions on Aerospace and Electronic Systems, 36:26-46, 2000.

[28] R. Klemm. Space-Time Adaptive Processing. IEE Press, Stevenage, UK, 1998.

[29] K. Lari and S. J. Young. The estimation of stochastic context-free grammars using the Inside-Outside algorithm. Computer Speech and Language, 4:35-56, 1990.

[30] X. R. Li and V. P. Jilkov. Survey of maneuvering target tracking. Part V: Multiple-model methods.
IEEE Transactions on Aerospace and Electronic Systems, pages 1255–1321, 2005. [31] L. Lin, Y. Bar-Shalom, and T. Kirubarajan. New assignment-based data association for tracking move-stop-move targets. IEEE Transactions Aerospace And Electronic Systems, 40(2):714–725, 2004. [32] S. Luhr, H. H. Bui, S. Venkatesh, and G. A. W. West. Recognition of human activity through hierarchical stochastic learning. In Proceedings of the First IEEE International Conference on Pervasive Computing and Communications, 2003. [33] D. Lymberopoulos, A. S. Ogale, A. Savvides, and Y. Aloimonos. A sensory grammar for inferring behaviors in sensor networks. In International conference on Information processing in sensor networks, pages 251–259, 2006. [34] C. D. Manning and H. Sch¨utze. Foundations of Statistical Natural Language Processing. The MIT Press, 1999. [35] M. I. Miller and A. O’Sullivan. Entropies and combinatorics of random branching processes and context-free languages. IEEE Transactions on Information Theory, 38:1292– 1310, 1992. [36] D. V. Pynadath and M. P. Wellman. Probabilistic state-dependent grammars for plan recognition. In Proceedings of the 16th Annual Conference on uncertainty in artificial intelligence, pages 507–514, 2000. [37] B. Ristic, S. Arulampalam, and N. Gordon. Beyond the Kalman Filter Particle Filters for Tracking Applications. Artech House, 2004. [38] Y. Sakakibara. Grammatical inference in bioinformatics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27:1051–1062, 2005. [39] M. I. Skolnik. Introduction to Radar Systems. McGraw-Hill, 2002. [40] A. Stolcke. An efficient probabilistic context-free parsing algorithm that computes prefix probabilities. Computational Linguistics, 21(2):165–201, 1995. [41] A. Wang and V. Krishnamurthy. Threat estimation of multifunction radars : modeling and statistical signal processing of stochastic context free grammars. In ICASSP, volume 3, pages III–793–III–796, 2007. [42] A. Wang, V. Krishnamurthy, F. A. 
Dilkes, and N. A. Visnevski. Threat estimation by electronic surveillance of multifunction radars: a stochastic context free grammar approach. In IEEE Conference on Decision and Control, pages 2153–2158, December 2006. 154  [43] J. Ward. Space-time adaptive processing for airborne radar. Technical report 1015, Lincoln Laboratory, MIT, December 1994. [44] J. Ward. Cramer-Rao bounds for target angle and doppler estimation with space-time adaptive processing radar. In Proceedings of the 29th ASILOMAR conference on Signals, Systems and Computers, pages 1198–1203, October 30-November 2 1995.  155  Chapter 5  Conclusion and Future Work 5.1 Overview of Thesis Meta level tracking concerns with higher level information extraction from recursively estimated state sequence. It is demonstrated that even though the traditional state space model is sufficient for the kinematic tracking of sequential data, it is not expressive enough to characterize higher level information such as multifunction radar’s operation modes and geometrical patterns of ground moving target’s trajectories. Based on the Chomsky’s characterization of grammatical models, stochastic context free grammar is selected as the knowledge representation language, and it is used to construct a system framework to support the processing of meta level tracking. The developed SCFG based tracking system has been applied successfully to various applications. We model and characterize a multi-function radar (MFR) as a string generating device, where the MFR’s inner control mechanism is specified in terms of a stochastic context free grammar (SCFG) modulated by the radar’s operational mode, which is modeled as a Markov chain. Each production rule corresponds to an operational rule employed by the MFR to generate its radar words, and whose probabilities are modulated by the Markov chain. The domain specific knowledge is assumed to be supplied by expert radar analysts. 
This is unlike the modeling of targets, where hidden Markov and state space models are adequate [2; 1]. The research shows how a large scale dynamical system such as a multifunction radar can be expressed by a compact representation. Based on the SCFG representation, the signal interpretation of the MFR, under our formulation, is reduced to state estimation by parsing through radar words. A maximum likelihood sequence estimator is derived to evaluate the threat posed by the MFR, and a maximum likelihood parameter estimator is derived to infer the unknown system parameters with the Expectation Maximization algorithm. In addition, based on the interpreted radar signal, the interaction dynamics of the MFR and the target are studied, and the control of the aircraft's maneuvering modes is formulated as a discrete stochastic approximation problem.

Moreover, we applied meta level tracking to model the trajectories of ground moving vehicles under GMTI surveillance, where the targets' motion patterns are characterized by SCFG enhanced kinematic models. The control structure of Earley's parsing algorithm is extended with both an IMM/Extended Kalman filter and a particle filter to perform data association and syntactic tracking based on the track estimates. The meta level understanding of the trajectories is developed as GMTI data is collected and parsed, which provides real time syntactic information about the targets' trajectories for ground surveillance. The parsing algorithm and the Bayesian filters are implemented in C++ for numerical studies, and real GMTI data is collected using DRDC Ottawa's XWEAR radar. The particular numerical scenarios studied are the identification of a pincer and a square.

5.2 Discussion of Results

The research work presented in the thesis demonstrates the feasibility and practicality of applying SCFG to perform meta level data fusion on dynamically evolving sequential data in diverse application domains.
The two major application domains considered are 1) electronic support measures against multifunction radars, and 2) syntactic tracking and ground surveillance with GMTI radar. In the first problem, we model the complicated mode switching and task scheduling of an agile multifunction radar, and study the feedback of the meta level estimate to control the aircraft maneuvering mode so as to maximize its survivability. In the second problem, a syntactic tracking system framework is developed to extract geometric patterns from ground moving targets' trajectories in real time. The work is based on GMTI STAP technology, which is capable of detecting moving targets over a large region. In this section, the results obtained from the two problems are discussed.

Chapter 2 discusses the modeling of multifunction radars in detail. Abstract model components were described to model the radar system functionalities; the identified components are the radar controller, the phrase scheduler, and the radar manager. The radar manager is responsible for the selection of radar tasks such as search and target identification; the phrase scheduler plans or preempts radar commands based on the system load; and the radar controller maps the commands to radar pulses. Based on SCFG, a syntactic model for each of the model components was developed; Table 3.2.2.2 lists the complete set of production rules developed to specify the phrase scheduler, and Table 3.2.2.3 the complete set of rules for the radar controller. Because the selection of tasks and the generation of radar pulses depend on the MFR's operation mode, a Markov modulated SCFG was developed, where the Markov chain models the change in operation mode and the SCFG models the resulting command generation given the mode. The model specification is introduced in Chapter 2. The maximum likelihood state estimator was developed in Sec. 2.3.2, and the parameter estimation algorithm was derived in Sec. 2.3.3.
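As a concrete, deliberately tiny illustration of the likelihood computations involved, the sketch below evaluates the probability that an SCFG in Chomsky normal form generates a given symbol string, using the CYK-style inside recursion that underlies SCFG state and parameter estimation. The grammar, its symbols, and its probabilities here are invented for illustration and are not the thesis's radar grammar.

```python
from collections import defaultdict

# Toy SCFG in Chomsky normal form (invented for illustration; not the
# thesis's radar grammar).  Rule probabilities for each nonterminal sum
# to one: A -> A A (0.4) | 'a' (0.6), B -> B B (0.3) | 'b' (0.7).
BINARY_RULES = {            # X -> Y Z
    "S": [(("A", "B"), 1.0)],
    "A": [(("A", "A"), 0.4)],
    "B": [(("B", "B"), 0.3)],
}
LEXICAL_RULES = {           # X -> terminal
    "A": [("a", 0.6)],
    "B": [("b", 0.7)],
}

def inside_probability(words, start="S"):
    """Return P(words | grammar) via the CYK-style inside recursion."""
    n = len(words)
    # chart[i][j][X] = probability that nonterminal X derives words[i:j]
    chart = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for lhs, rules in LEXICAL_RULES.items():
            for terminal, p in rules:
                if terminal == w:
                    chart[i][i + 1][lhs] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # split point
                for lhs, rules in BINARY_RULES.items():
                    for (left, right), p in rules:
                        chart[i][j][lhs] += p * chart[i][k][left] * chart[k][j][right]
    return chart[0][n][start]
```

For example, inside_probability(["a", "b"]) returns 1.0 × 0.6 × 0.7 = 0.42, the probability of the single derivation S → A B → a B → a b under this toy grammar.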
In addition, even though SCFG is more expressive and more general than its Markov counterpart, it also incurs higher computational complexity. Sec. 2.4.1 describes the conditions under which a CFG can be mapped to its Markov counterpart, and the algorithms that perform the conversion.

Chapter 3 extends the syntactic model developed in Chapter 2 and equips it with the ability to provide feedback to the aircraft control system, allowing optimal selection of the aircraft maneuvering mode to maximize survivability. The inclusion of feedback control requires modification of the syntactic models of the phrase scheduler and the radar manager. The Markov chain modeling the radar manager is parameterized by a Logit distribution with parameters indicating the aircraft maneuvering modes, and the result is described in Sec. 3.2.2.1. Because each state of the radar manager now corresponds to a particular radar task, the phrase scheduler is modified accordingly to model the mapping of radar tasks to radar commands. Given the modified Markov modulated SCFG model, the selection of the optimal maneuvering modes against a multifunction radar was formulated as a stochastic approximation algorithm; the formulation is discussed in Sec. 3.3.3. The algorithms were programmed in C++ and evaluated, and the numerical studies are presented in Sec. 3.4.

Chapter 4 studies the application of SCFG to model two-dimensional spatial patterns and the development of pattern estimation algorithms that work with legacy Bayesian tracking systems. The factorization of the pattern estimation algorithm into its constituent Bayesian tracker and syntactic pattern estimator is described in Sec. 4.2.2. The state space model that supports the SCFG pattern estimator is described in Sec. 4.3.1, and the SCFG formulation of the spatial patterns is provided in Sec. 4.3.2.
The modification allows extraction of target headings from tracking algorithms such as the interacting multiple model (IMM) filter, and the extracted outputs can be processed directly by the SCFG parser. A standard IMM/Extended Kalman filter and a particle filter were implemented to perform the tracking. The parser is extended from the standard Earley-Stolcke parser, and the results are shown in Sec. 4.4.2. In addition, structural analysis of the spatial patterns was performed to express the ad hoc patterns in formal language expressions, and further analysis was done to study their complexity and language type, i.e., to determine whether they belong to the regular or context free class. The analysis is presented in Sec. 4.3.3. Numerical studies of the algorithms were performed based on real radar data provided by DRDC, and the results are shown in Sec. 4.5.

In order to evaluate the algorithms developed, a software testbed was developed. As described in the previous paragraphs, a meta level tracking system consists of many components, such as the kinematic tracker, the parameter estimator, and the SCFG parser, and each component may require many different implementations. In other words, meta level tracking is application specific and requires a high level of configurability. The two main design requirements considered are modularity and scriptability: the testbed has to be modular so that modifications and extensions to diverse applications can be easily incorporated, and it has to be scriptable so that application specific scenarios can be easily constructed. As a result, object-oriented C++ classes are used to implement the algorithms for modularity, and Matlab wrappers are written around the classes for scriptability. In the particular case of the GMTI work, extra capabilities such as database access to GMTI data and visualization of the tracking and parsing results on Google Earth were also implemented.
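To make the tracker-to-parser interface concrete, the sketch below quantizes successive track position estimates into a four-letter direction alphabet and applies a crude structural check for a square-like trajectory. The alphabet, the quantization thresholds, and the run-length check are illustrative assumptions, not the thesis's actual terminal set, grammar, or parser.

```python
import math

def heading_symbol(dx, dy):
    """Quantize a displacement into one of four direction terminals."""
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    if angle < 45 or angle >= 315:
        return "e"
    if angle < 135:
        return "n"
    if angle < 225:
        return "w"
    return "s"

def terminals_from_track(points):
    """Map consecutive (x, y) track estimates to a direction string."""
    return [heading_symbol(x2 - x1, y2 - y1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]

def looks_like_square(symbols):
    """Crude structural check standing in for a full parser: a square-like
    trajectory run-length-encodes to four runs of distinct directions."""
    runs = [s for i, s in enumerate(symbols) if i == 0 or s != symbols[i - 1]]
    return len(runs) == 4 and len(set(runs)) == 4

# A counter-clockwise square traced by noise-free track estimates.
track = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1), (0, 0)]
```

Here terminals_from_track(track) yields the direction string e e n n w w s s, which looks_like_square accepts; in the thesis's framework such a terminal string would instead be consumed by the Earley-Stolcke parser under an SCFG for the pattern.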
5.3 Future Work

Future work can proceed in two major directions: 1) facilitating human designers in codifying their expert knowledge, or 2) automatic knowledge generation. In the defense domain, the first direction may be more practical due to the lack of training data. In computer vision, however, training data are readily available, and the second direction may be applicable there. In the first direction, where the focus is on the human designers, the problem is to develop tools and methodologies that allow the designers to create and share knowledge bases; more expressive knowledge representations and more intuitive knowledge manipulation methods should be developed. Sec. 5.3.1 describes candidate representations that not only provide more standardized data formats for human designers to codify their knowledge, but are also readily accessible to software, so development cost can be greatly reduced. In the second direction, the problem is to devise a data driven system framework that allows the tracking system to extract and build knowledge bases. Not only are we interested in the automatic generation of knowledge bases, but also in automatic sensor management by feeding back the extracted meta level information. Sec. 5.3.2 discusses the completion of the loop for meta level tracking, i.e., the feeding back of meta level information to the lower signal processing components, and Sec. 5.3.3 describes the application of genetic programming to unify the learning of the grammar production rules and the Bayesian filter parameters.

5.3.1 Grammatical Knowledge Representation with Ontology

SCFG's grammatical rules are used as the knowledge representation throughout the thesis, and it is demonstrated that they are capable of characterizing sequential data with complicated dependencies, i.e., self-embedding and branching.
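The self-embedding dependency referred to above can be illustrated with the textbook language {a^n b^n}: matching counts at arbitrary separation cannot be captured by any finite-state (Markov) model, yet the single self-embedding production S → a S b | ab expresses it directly. A minimal recognizer for this illustrative grammar:

```python
def derives(s):
    """Recursive recognizer for the self-embedding CFG S -> 'a' S 'b' | 'ab',
    i.e. the language {a^n b^n : n >= 1}."""
    if s == "ab":
        return True
    # Peel one matched a...b pair off the ends and recurse on the middle.
    return len(s) >= 4 and s[0] == "a" and s[-1] == "b" and derives(s[1:-1])
```

The recognizer accepts "ab" and "aaabbb" but rejects unbalanced strings such as "aabbb", which is exactly the long-range matching that no fixed-order Markov model over {a, b} can enforce.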
However, from the experience obtained, although the production rules are easy to construct, they are not easy to maintain or extend. The dependencies between nonterminals are intertwined and hidden in the set of interdependent production rules, and there is no systematic way to track and maintain them. One possible direction is to apply knowledge management techniques developed in the artificial intelligence community to the maintenance of production rules. The relevant technique is the knowledge codification language called an ontology, and one of its possible applications is the construction of the semantic web. Typically, an ontology consists of a finite list of terms and the relationships between them. The terms denote concepts such as classes of objects, and the relationships describe hierarchies of classes. In addition, an ontology may also include information such as properties, property restrictions, and disjointness statements. Several languages have been developed to implement ontologies. For example, RDF and RDF Schema were developed as data models that capture knowledge in terms of object-attribute-value triples. However, because of their limitation to binary ground predicates, newer languages have been developed. The language DAML+OIL was proposed to the W3C (http://www.w3.org/2001/sw/WebOnt/), and it is joint work between the US proposal DAML-ONT (DARPA Agent Markup Language) and the European proposal OIL. DAML+OIL has several advantages, such as a well defined syntax, a formal semantics, efficient reasoning support, and sufficient expressive power. The building blocks of the language are Class Elements, Property Elements, and Property Restrictions.
The nonterminals of the production rules can be expressed in terms of the Class Element, for example

<owl:Class rdf:ID="Nonterminal1">
  <rdfs:subClassOf rdf:resource="GrammarName"/>
</owl:Class>

A production rule, on the other hand, can be expressed using the Property Element and the Property Restriction, for example

<owl:Class rdf:ID="Nonterminal1">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="isProducedBy"/>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>

In addition to the advantages of having a standardized and expressive format, another important attribute of the language is that it is written in XML. Using XML to express the SCFG grammar, not only is it easy for designers to share and express the knowledge, but the knowledge is also machine accessible, i.e., software programs can extract the knowledge base easily because standard parsers exist for XML documents. As a result, the use of an ontology can facilitate the building up of the knowledge base, and it allows standard interfaces to be implemented for access to the knowledge.

5.3.2 Feedback Control with Meta Level Information

The implemented information flow of the meta level tracking system is bottom up: the processes are observed in noise, the measurements are grouped into tracks, and the tracks are labeled and parsed to extract the meta level information. Because each level of the tracking system performs signal processing based on a different amount of prior knowledge, and the meta level information provides hints about the underlying mechanism of the process, it is believed that meta level information can greatly reduce the uncertainties associated with the prior knowledge. Feeding the meta level information back to the lower level signal processing components should increase their predictive power and greatly enhance their quality.
For example, the higher level information can be exploited to enhance the efficiency of the sensors by lowering the variance of the observation model, i.e., by conditioning the observation model on the meta level information. Moreover, the meta level information may be exploited to increase the detection rate of the underlying detector, because it provides tighter constraints on the search space and allows the detector to concentrate on the more probable regions. For the research work on the modeling of multifunction radars, the effort focuses on modeling the mechanism of radar pulse generation, and the only feedback considered is the selection of maneuvering modes to decrease the threat based on the intercepted pulses. However, it would also be useful if the aircraft's ESM receiver could switch to different detection modes based on the predicted operational mode of the MFR: the MFR generates pulses with different pulse widths and pulse repetition frequencies depending on its operational mode, and any prior knowledge of the pulse parameters can greatly enhance their detection. For the research work in GMTI tracking, on the other hand, current work focuses mainly on the identification of the geometric patterns exhibited by the targets' trajectories. It is envisioned that the identified geometric patterns could be fed back for the radar's resource management. For example, the sampling rate and the signal to noise ratio threshold of the GMTI radar are two control parameters that could be tuned in the SAR/GMTI model, and conditioning on the identified geometric pattern can greatly limit the search space.

5.3.3 Grammatical Inference with Genetic Programming

Grammatical inference is a difficult problem, and a survey can be found in [5; 3]. The problem consists of two parts: estimation of the grammar structure, and estimation of the model parameters.
The parameter estimation part of the SCFG identification problem, given the SCFG's structure, is essentially solved by the inside-outside algorithm [4]. The challenge remains, however, in the estimation of the structure. As was described in the introduction, meta level data fusion concerns the codification of expert knowledge, allowing the system to automatically extract the attributes identified by the knowledge. However, even though the meta level is application specific, it is not practical to expect the experts to know all the anomalies and to code them in the production rules. One possible research direction is to learn the rules from data. Fu and Booth describe the induction of regular grammars based on the idea of constructing a "canonical" grammar that generates only the strings in the training data, and then repeatedly replacing pairs of nonterminals with a single nonterminal to attempt generalization. Stolcke proposes the application of Bayesian model merging to iteratively construct nonterminals that can generate the strings in the training data. However, the number of possible grammatical structures for a given set of positive training data grows exponentially in the length of the data; the hypothesis space of context free grammars is very large, and the identification often results in overfitting. One possible approach may be to use genetic programming, which is often applied to problems outside the standard function optimization formulation. Genetic programming is an extension of genetic algorithms: whereas genetic algorithms usually operate on a population of fixed length binary strings, genetic programming operates on a population of parse trees. The mutation and crossover operators of genetic programming can be iteratively applied to a population of parse trees in search of a tree that minimizes a certain cost function.
Because of genetic programming's ability to deal with tree structures and discrete knowledge representations, it may be a plausible candidate for solving the grammatical structure identification problem.

Bibliography

[1] Y. Bar-Shalom and X. Li. Estimation and Tracking: Principles, Techniques, and Software. Artech House, 1993.
[2] S. S. Blackman and R. Popoli. Design and Analysis of Modern Tracking Systems. Artech House, 1999.
[3] G. Chanda and F. Dellaert. Grammatical methods in computer vision: an overview. Technical report, Georgia Institute of Technology, 2004.
[4] K. Lari and S. J. Young. The estimation of stochastic context free grammars using the Inside-Outside algorithm. Computer Speech and Language, 4:35–56, 1990.
[5] Y. Sakakibara. Grammatical inference in bioinformatics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27:1051–1062, 2005.

Appendix A

Functional Specification of the Mercury Emitter

This appendix contains a sanitized version of a textual intelligence report describing the functionality of the emitter called "Mercury".¹

A.1 General Remarks

The timing of this emitter is based on a crystal-controlled clock. Each cycle of the clock is known as a crystal count (Xc), and the associated time interval is the clock period. All leading edge emission times and dead-times can be measured in crystal counts (integer multiples of the clock period). Most of the information below relates to search, acquisition and tracking functions only. Missile engagement modes (launching, guidance and fusing) can also be fit into the structure below, but with some modifications.

A.2 Radar Words

The timing of this emitter is dictated by a sub-structure called a word. Words occur sequentially in the pulse train, so that one word begins as the previous word is ending. There are nine distinct words, denoted by w1, ..., w9.
Each has the same length (on the order of several milliseconds) and is associated with a fixed integer number of crystal counts. For the purpose of this paper we will consider all words of the radar distinguishable from each other.

A.3 Time-Division Multiplexing – Phrases and Clauses

This system is a multi-function radar capable of engaging five targets in a time-multiplexed fashion using structures called clauses and phrases. A phrase is a sequence of four consecutive words. A clause is a sequence of five consecutive phrases (see Fig. A.1).

¹ The specification of this emitter was provided by Dr. Fred A. Dilkes of Defence R&D Canada. It is based on specifications of some real-life anti-aircraft defense radars, but has been altered and declassified before the release.

[Figure A.1 depicts a clause of five consecutive phrases, each consisting of four words, with the twenty word positions numbered 1–20.]

Figure A.1: The output sequence of this radar is formed so that the clauses follow each other sequentially. As soon as the last word of the last phrase of a clause is emitted, the first word of the first phrase of the new clause follows. Although the process is linear in time, it is very convenient to analyze the radar output sequence as a two-dimensional table in which clauses are stacked not horizontally but vertically. In that case, the boundaries of the phrases associated with multiplexed tasks align, and one can examine each multiplexed activity independently by reading the radar output within one phrase from top to bottom.

Each phrase within a clause is allocated to one task, and these tasks are independent of each other. For instance, the radar may search for targets using phrases 1, 3, and 4, while tracking two different targets using phrases 2 and 5.

A.4 Search-while-Track Scan

One of the generic functional states of the radar is a search scan denoted by <FourWSearch>.
In the <FourWSearch> scan, the words are cycled through the quadruplet of words w1–w2–w4–w5. The radar will complete one cycle (four words) for each beam position as it scans in space. This is done sequentially using all unoccupied word positions and is not dictated by the clause or phrase structure. (Note that the radar does not have to start the cycle with w1 at each beam position; it could, for instance, radiate w4–w5–w1–w2 or any other cyclic permutation at each beam position.) It is possible for the entire system to operate in a search-only state in which no target tracks are maintained during the search. However, <FourWSearch> can also be multiplexed with target tracking functions. In the latter case, some of the words within each clause are occupied by target tracking and will not engage in search functions. Only the available phrases (those that are not occupied) are cycled through the quadruplet of words. Since the number of beam positions in the scan is fixed, the rate at which the radar is able to search a given volume of space is proportional to the number of available words; as a result, simultaneous tracking increases the overall scan period.

The radar has another scan state, called <ThreeWSearch>. This is similar to <FourWSearch> except that it uses only a triplet of words, w1–w3–w5, and dwells on each beam position with only three words. It can also be multiplexed with automatic tracking.

A.5 Acquisition Scan

When the radar search scan detects a target of interest, it may attempt to initiate a track. This requires the radar scan to switch from one of the search behaviors to one of the acquisition patterns. All of the acquisition scans follow these steps sequentially:

1. Switch from Search to Acquisition: The switch from search to acquisition begins with all available words being converted to the same variety of word: one of w1, ..., w6, chosen so as to optimize for the target Doppler shift.
Words that are occupied with other tracks continue to perform their tracking function and are not affected by the change from Search to Acquisition. The available words perform one of several scan patterns in which each beam position dwells only for the period of one word.

2. Non-adaptive track: Then, one of the available phrases becomes designated to track the target of interest. This designation will persist until the track is dropped. Correspondingly, either the last three or all four of the words within the designated phrase become associated with the track and switch to w6 (a non-adaptive track without range resolution). The remaining available words continue to radiate in the variety appropriate to the target Doppler.

3. Range resolution: At this point the radar has angular track resolution but still suffers from range ambiguities. After some variable amount of time, the first word in the designated phrase will hop among words w7, w8, and w9, in no predictable order. It will dwell on each of those varieties of words only once in order to resolve the range ambiguity, but the dwell-time for each variety is unpredictable.

4. Return from Acquisition to Search: Finally, once the radar has established the track, it is ready to terminate the acquisition scan. Thereafter, until the track is dropped, either the last three or all four words of the designated phrase will be occupied with the track and will not be available for search functions or further acquisitions. The radar then returns to one of the search-while-track functions. All occupied words maintain their tracks, and all available words (possibly including the first word of the designated track phrase) execute the appropriate scan pattern. Only one acquisition can be performed at any given time.

A.6 Track Maintenance

Each track is maintained by either the last three or all four words of one of the phrases.
Those words are considered occupied and cannot participate in search or acquisition functions until the target is dropped. The radar performs range tracking by adaptively changing amongst any of the high pulse repetition frequency words (w6, ..., w9) in order to avoid eclipsing and maintain the range gates. Occasionally, the system may perform a range verification function on the track by repeating the range resolution steps described above.

Appendix B

Justification of Logit Model

The Logit model can be justified by a utility maximization argument. Considering only the binary Logit model for simplicity, the utilities of the decisions (advancing up or down the state space as illustrated in Fig. 3.5) are

U_up = a_u + b_u z_k + c_u s_k^1 + d_u s_k^2 + ε_u
U_down = a_d + b_d z_k + c_d s_k^1 + d_d s_k^2 + ε_d,

where ε_u and ε_d are random threshold values. The threshold value indicates the amount of threat the MFR could take before a switch of states is desired, and it is random because different targets may have different threshold values. Assuming that the MFR always selects the decision with the highest utility, the probability of going up in state can be expressed as

P_up = P(U_up > U_down)
     = P((a_u − a_d) + (b_u − b_d) z_k + (c_u − c_d) s_k^1 + (d_u − d_d) s_k^2 + (ε_u − ε_d) > 0)
     = P(ε > −(a + b z_k + c s_k^1 + d s_k^2)),

where a = a_u − a_d, b = b_u − b_d, c = c_u − c_d, d = d_u − d_d, and ε = ε_d − ε_u. Supposing that the random variable ε has a logistic distribution, the probability of advancing up the states, under the utility maximization argument, is

P_up = exp(a + b z_k + c s_k^1 + d s_k^2) / (1 + exp(a + b z_k + c s_k^1 + d s_k^2)).

A more general discussion for more than two states can be found in [2].

Appendix C

GMTI STAP Background

The GMTI radar output is an array of detections of ground moving vehicles based on space-time adaptive processing (STAP) with range, angle, and Doppler information. The introduction of both spatial and temporal dimensions is essential for the cancellation of clutter.
This effectively introduces redundancy into the measurement of the clutter, which can be exploited for clutter cancellation, at the cost of increasing the size of the adaptive problem. In fully adaptive STAP, the received signals for each element and pulse in the CPI are adaptively weighted. The weight vector can be viewed as a combined receive array beamformer and target Doppler filter. In the ideal case (e.g., when the interference covariance matrix is known), it provides coherent gain on targets while forming nulls in angle and Doppler to suppress clutter. Partially adaptive STAP algorithms have been studied in numerous publications, and several have been shown to achieve near-optimal performance at considerably less computational expense [9; 6]. The partially adaptive approach is to transform the fully adaptive signal space to a relatively small signal space and then solve the reduced-dimension adaptive filtering problem with the transformed data. There are four main steps in STAP-based GMTI systems for detecting moving targets:

1. Defining the adaptive problem: the adaptation may be performed either prior to, or after, Doppler processing (pre- and post-Doppler STAP, respectively) [9];
2. Covariance matrix estimation: typically, the Maximum-Likelihood estimate, formed by averaging the Hermitian outer product of the received signal from adjacent range bins;
3. Weight computation: inversion or projection classes of algorithms;
4. Constant False Alarm Rate (CFAR) normalization and parameter estimation: several statistics are possible, such as the adaptive matched filter (AMF-CFAR) [5].

In the fully adaptive case, the entire spatio-temporal dimension is preserved. However, the size of the adaptive problem makes it computationally prohibitive. Furthermore, reliable estimation of the covariance matrix for such a large adaptive problem may not be possible.
For these reasons, attention is focused on partially adaptive STAP algorithms, which may be pre-Doppler STAP (adaptation performed prior to Doppler filtering) or post-Doppler STAP (adaptation performed after Doppler filtering).

The maximum likelihood (ML) estimate of the covariance matrix is formed by averaging the Hermitian outer product of the received signal from range bins adjacent to the test cell:

R̂ = (1 / 2L) Σ_{i=1}^{2L} x_i x_i†,

where 2L is the number of range bins (typically, L on either side of the cell under test (CUT)) used in the estimate and x_i is the signal vector.

The two main techniques used for adaptive weight computation are based on estimated covariance matrix inversion and projection. Loaded sample matrix inversion (LSMI) and eigenvector projection (EVP) are examples from the inversion and projection classes of algorithms, respectively. The number of range gates required in the training set for these algorithms to achieve a specified level of performance (e.g., −3 dB normalized signal-to-interference-plus-noise ratio (SINR)) in a stationary environment is 2KN [1; 4]. The two algorithms are defined as

w(ϑ, ϕ) = (R̂(ϑ) + α I)^{−1} d(ϑ, ϕ)    (LSMI)    (C.1)

and

w(ϑ, ϕ) = P̂(ϑ)⊥ d(ϑ, ϕ)    (EVP)    (C.2)

where w(ϑ, ϕ) is the weight vector, α is typically 2–3 times the average noise power, I is the identity matrix, P̂(ϑ)⊥ is the projection matrix formed from the eigenvectors corresponding to the dominant eigenvalues of R̂(ϑ), and d(ϑ, ϕ) is the steering vector, which is a function of both Doppler (ϑ) and angle (ϕ). The appropriate choice of the number of projected eigenvectors is sample-size dependent [3]. Apart from improving numerical stability, the load αI reduces the number of snapshots required for 3 dB performance from twice the size of the interference covariance matrix (without loading, i.e., α = 0) to only twice its rank [8].
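The estimate-then-weight chain above can be sketched in a few lines of NumPy. The snapshot dimension, training-set size, loading factor, interference rank, and steering vector below are synthetic placeholders (white-noise training data rather than real clutter), so this is a structural sketch of the processing chain, not the thesis implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
MN = 32          # size of the (reduced-dimension) space-time snapshot
L2 = 4 * MN      # number of training range bins (2L in the text)

# Secondary data: interference-plus-noise snapshots from range bins
# adjacent to the cell under test (synthetic complex white noise here).
X = (rng.standard_normal((MN, L2)) + 1j * rng.standard_normal((MN, L2))) / np.sqrt(2)

# ML covariance estimate: average of Hermitian outer products.
R_hat = (X @ X.conj().T) / L2

# Steering vector at the tested angle/Doppler (placeholder phase ramp).
d = np.exp(2j * np.pi * 0.1 * np.arange(MN))

# LSMI (C.1): diagonal loading, alpha roughly 2-3x the noise power.
alpha = 2.0
w_lsmi = np.linalg.solve(R_hat + alpha * np.eye(MN), d)

# EVP (C.2): project d onto the complement of the dominant eigen-subspace.
eigvals, eigvecs = np.linalg.eigh(R_hat)   # eigenvalues in ascending order
r = 4                                      # assumed interference rank
U = eigvecs[:, -r:]                        # dominant eigenvectors
w_evp = d - U @ (U.conj().T @ d)           # P_perp applied to d

def amf_statistic(x, w, d):
    """AMF-CFAR statistic (C.3): |x' w|^2 / (d' w)."""
    return np.abs(x.conj().T @ w) ** 2 / np.abs(d.conj().T @ w)

t = amf_statistic(X[:, 0], w_lsmi, d)
```

With diagonal loading, the denominator d†w in the statistic is real and positive because R̂ + αI is Hermitian positive definite; taking its magnitude above is merely a numerical safeguard.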
This results in an improvement in computational efficiency, particularly for large multi-aperture systems. Recall that the steering vector at a given spatial frequency (angle) and normalized Doppler frequency is the response of a target at that particular angle and Doppler; it therefore depends on the transmit antenna pattern. Further, the steering vectors need to be computed only once for whatever granularity in angle and Doppler is chosen. Several constant false alarm rate (CFAR) decision statistics have been studied in the literature. Here, we use the AMF (adaptive matched filter) CFAR test statistic, defined as [5; 7]

|x† w(ϑ, ϕ)|² / (d(ϑ, ϕ)† w(ϑ, ϕ)).    (C.3)

This statistic follows from Kelly's analysis of the adaptive detection algorithm, in which the interference is modeled as Gaussian noise. It is often used as the benchmark for the performance of other CFAR detection algorithms when non-Gaussian interference is considered. It may also be interpreted as the ML estimate of angle and Doppler for the case of Gaussian interference [10].

Appendix D

List of Publications

The following publications have been completed based on the work in this thesis.

Journal Papers

• N. A. Visnevski, V. Krishnamurthy, A. Wang, S. Haykin, “Syntactic Modeling and Signal Processing of Multifunction Radars: A Stochastic Context Free Grammar Approach”, Proceedings of the IEEE, vol. 95, pp. 1000-1025, 2007.

• A. Wang, V. Krishnamurthy, “Signal Interpretation of Multifunction Radars: Modeling and Statistical Signal Processing with Stochastic Context Free Grammars”, IEEE Transactions on Signal Processing, vol. 56, pp. 1106-1119, 2008.

• A. Wang, V. Krishnamurthy, B. Balaji, “Syntactic Tracking and Ground Surveillance with GMTI Radar”, IEEE Transactions on Signal Processing (submitted).

Conference Papers

• A. Wang, V. Krishnamurthy, F. A. Dilkes, N. A.
Visnevski, “Threat Estimation by Electronic Surveillance of Multifunction Radars: A Stochastic Context Free Grammar Approach”, in IEEE International Conference on Decision and Control, pp. 2153-2158, 2006.

• A. Wang, V. Krishnamurthy, “Threat Estimation of Multifunction Radars: Modeling and Statistical Signal Processing of Stochastic Context Free Grammars”, in IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. III-793-III-796, 2007.

• A. Wang, V. Krishnamurthy, “Modeling and Interpretation of Multifunction Radars with Stochastic Grammar”, in IEEE Aerospace Conference, Montana, USA, 2008.

• A. Wang, V. Krishnamurthy, “Target Identification and Distributed Cooperative Control of Sensor Networks”, in IEEE International Conference on Communications, Dresden, Germany, 2009.

• A. Wang, V. Krishnamurthy, B. Balaji, “Meta Level Tracking with Multimode Space-Time Adaptive Processing of GMTI Data”, in Proceedings of the IEEE International Conference on Information Fusion, Washington, USA, July 2009.

• A. Wang, J. Araujo, V. Krishnamurthy, “Syntactic Inference for Highway Traffic Analysis”, in Proceedings of the IEEE International Conference on Information Fusion, Washington, USA, July 2009.

Bibliography

[1] O. P. Cheremisin. Efficiency of adaptive algorithms with regularised sample covariance matrix (in Russian). Radiotechnik und Elektronik, 2(10):1933–1941, 1982.

[2] J. S. Cramer. Logit Models from Economics and Other Fields. Cambridge University Press, 2003.

[3] C. H. Gierull and B. Balaji. Minimal sample support space-time adaptive processing with fast subspace techniques. IEE Proc. Radar, Sonar and Navig., 149(5):209–220, Oct 2002.

[4] A. M. Haimovich. Asymptotic distribution of the conditional signal-to-noise ratio in an eigenanalysis-based adaptive array. IEEE Transactions on Aerospace and Electronic Systems, 33(3):988–997, July 1997.

[5] E. J. Kelly. An adaptive detection algorithm.
IEEE Transactions on Aerospace and Electronic Systems, 22(6):115–127, March 1986.

[6] R. Klemm. Space-Time Adaptive Processing. IEE Press, Stevenage, UK, 1998.

[7] I. S. Reed, Y. L. Gau, and T. K. Truong. CFAR detection and estimation for STAP radar. IEEE Transactions on Aerospace and Electronic Systems, 34(3):722–735, July 1998.

[8] I. S. Reed, J. D. Mallett, and L. E. Brennan. Rapid convergence rate in adaptive arrays. IEEE Transactions on Aerospace and Electronic Systems, AES-10(6):853–863, November 1974.

[9] J. Ward. Space-time adaptive processing for airborne radar. Technical Report 1015, Lincoln Laboratory, MIT, December 1994.

[10] J. Ward. Cramer-Rao bounds for target angle and Doppler estimation with space-time adaptive processing radar. In Proceedings of the 29th Asilomar Conference on Signals, Systems and Computers, pages 1198–1203, October 30–November 2, 1995.