Open Collections

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

A parallel workflow for online correlation and clique-finding : with applications to finance Rostoker, Camilo

Abstract

This thesis investigates how a state-of-the-art Stochastic Local Search (SLS) algorithm for the maximum clique problem can be adapted for and employed within a fully distributed parallel workfiow environment. First we present parallel variants of Dynamic Local Search (DLS-MC) and Phased Local Search (PLS), demonstrating how a simple yet effective multiple independent runs strategy can offer superior speedup performance with minimal communication overhead. We then extend PLS into an online algorithm so that it can operate in a dynamic environment where the input graph is constantly changing, and show that in most cases trajectory continuation is more efficient than restarting the search from scratch. Finally, we embed our new algorithm within a data processing pipeline that performs high throughput correlation and clique-based clustering of thousands of variables from a high-frequency data stream. For communication within and between system components, we use MPI, the de-facto standard API for message passing in high-performance computing. We present algorithmic and system performance results using synthetically generated data streams, as well as a preliminary investigation into the applicability of our system for processing high-frequency, real-life intra-day stock market data in order to determine clusters of stocks exhibiting highly correlated short-term trading patterns.

Item Metadata

Title	A parallel workflow for online correlation and clique-finding : with applications to finance
Creator	Rostoker, Camilo
Publisher	University of British Columbia
Date Issued	2007
Description	This thesis investigates how a state-of-the-art Stochastic Local Search (SLS) algorithm for the maximum clique problem can be adapted for and employed within a fully distributed parallel workfiow environment. First we present parallel variants of Dynamic Local Search (DLS-MC) and Phased Local Search (PLS), demonstrating how a simple yet effective multiple independent runs strategy can offer superior speedup performance with minimal communication overhead. We then extend PLS into an online algorithm so that it can operate in a dynamic environment where the input graph is constantly changing, and show that in most cases trajectory continuation is more efficient than restarting the search from scratch. Finally, we embed our new algorithm within a data processing pipeline that performs high throughput correlation and clique-based clustering of thousands of variables from a high-frequency data stream. For communication within and between system components, we use MPI, the de-facto standard API for message passing in high-performance computing. We present algorithmic and system performance results using synthetically generated data streams, as well as a preliminary investigation into the applicability of our system for processing high-frequency, real-life intra-day stock market data in order to determine clusters of stocks exhibiting highly correlated short-term trading patterns.
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2007-08-22
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0051985
URI	http://hdl.handle.net/2429/29661
Degree	Master of Science - MSc
Program	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2007-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace

Open Collections

UBC Theses and Dissertations

UBC Theses and Dissertations

A parallel workflow for online correlation and clique-finding : with applications to finance Rostoker, Camilo

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights