UBC Theses and Dissertations
Deep learning for sequence modelling: applications in natural languages and distributed compressive sensing
Palangi, Hamid
The underlying data in many machine learning tasks are sequential in nature. For example, the words generated by a language model depend on the previously generated words, the behavior of a user in a social network evolves over successive snapshots of the social graph, and the speech frames in a speech recognition system depend on the preceding frames. The main question is: how can we leverage the sequential nature of data to extract better features for the target machine learning task? In an effort to address this question, this thesis presents three important applications of deep sequence modelling methods. The first application is sentence modelling for the web search task, where the question addressed is: how can we create a vector representation of a natural language sentence that is aimed at a specific task such as web search? We propose the Long Short-Term Memory Deep Structured Semantic Model (LSTM-DSSM), a model for information retrieval on click-through data that yields significant performance gains over existing state-of-the-art baselines. The proposed LSTM-DSSM model takes each word in a sentence sequentially, extracts its relevant information, and embeds it into a semantic vector. The second application involves distributed compressive sensing, where the main questions addressed are: (a) How can the joint sparsity constraint be relaxed? (b) How can the structural dependency among a group of sparse vectors (structure beyond sparsity) be exploited to reconstruct them better from down-sampled measurements? (c) How can available offline data be exploited during sparse reconstruction at the decoder? We present a deep learning approach to distributed compressive sensing and show that it addresses these three questions and is almost as fast as greedy methods during reconstruction. The third application is related to speech recognition. The question addressed here is: how can we build a recurrent acoustic model for the task of phoneme recognition? We present a Recurrent Deep Stacking Network (R-DSN) architecture for this task. Each module in the R-DSN is initialized with an Echo State Network (ESN), and then all connection weights within the module are fine-tuned.
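The sentence-embedding idea described in the abstract (read a sentence one word at a time and keep a running semantic vector) can be illustrated with a minimal NumPy sketch. This is not the thesis's LSTM-DSSM: it assumes a plain LSTM cell with random, untrained weights, made-up word vectors, and cosine similarity as the query-document relevance score. The names `init_lstm`, `embed_sentence`, and `cosine` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_lstm(input_dim, hidden_dim, seed=0):
    """Random, untrained LSTM weights (illustrative assumption, not learned)."""
    rng = np.random.default_rng(seed)
    return {
        "W": rng.normal(0.0, 0.1, (4 * hidden_dim, input_dim)),   # input weights
        "U": rng.normal(0.0, 0.1, (4 * hidden_dim, hidden_dim)),  # recurrent weights
        "b": np.zeros(4 * hidden_dim),                            # biases
    }


def embed_sentence(word_vectors, params):
    """Feed word vectors one by one; return the last hidden state as the sentence vector."""
    hidden_dim = params["U"].shape[1]
    h = np.zeros(hidden_dim)  # hidden state
    c = np.zeros(hidden_dim)  # cell state
    for x in word_vectors:
        z = params["W"] @ x + params["U"] @ h + params["b"]
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g            # update the cell memory
        h = o * np.tanh(c)           # expose part of the memory as the hidden state
    return h


def cosine(a, b):
    """Cosine similarity, with a small constant to avoid division by zero."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


# Toy usage: a 3-word "query" and a 5-word "document" with 8-dimensional word vectors.
rng = np.random.default_rng(1)
params = init_lstm(input_dim=8, hidden_dim=4)
query_vec = embed_sentence(rng.normal(size=(3, 8)), params)
doc_vec = embed_sentence(rng.normal(size=(5, 8)), params)
score = cosine(query_vec, doc_vec)
```

In a trained retrieval model, the weights would be learned so that queries score higher against clicked documents than against unclicked ones; here the score is meaningless beyond showing the data flow.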
Attribution-NonCommercial-NoDerivatives 4.0 International