- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- A parallel workflow for online correlation and clique-finding...
Open Collections
UBC Theses and Dissertations
UBC Theses and Dissertations
A parallel workflow for online correlation and clique-finding : with applications to finance Rostoker, Camilo
Abstract
This thesis investigates how a state-of-the-art Stochastic Local Search (SLS) algorithm for the maximum clique problem can be adapted for and employed within a fully distributed parallel workfiow environment. First we present parallel variants of Dynamic Local Search (DLS-MC) and Phased Local Search (PLS), demonstrating how a simple yet effective multiple independent runs strategy can offer superior speedup performance with minimal communication overhead. We then extend PLS into an online algorithm so that it can operate in a dynamic environment where the input graph is constantly changing, and show that in most cases trajectory continuation is more efficient than restarting the search from scratch. Finally, we embed our new algorithm within a data processing pipeline that performs high throughput correlation and clique-based clustering of thousands of variables from a high-frequency data stream. For communication within and between system components, we use MPI, the de-facto standard API for message passing in high-performance computing. We present algorithmic and system performance results using synthetically generated data streams, as well as a preliminary investigation into the applicability of our system for processing high-frequency, real-life intra-day stock market data in order to determine clusters of stocks exhibiting highly correlated short-term trading patterns.
Item Metadata
Title |
A parallel workflow for online correlation and clique-finding : with applications to finance
|
Creator | |
Publisher |
University of British Columbia
|
Date Issued |
2007
|
Description |
This thesis investigates how a state-of-the-art Stochastic Local Search (SLS)
algorithm for the maximum clique problem can be adapted for and employed
within a fully distributed parallel workfiow environment. First we present
parallel variants of Dynamic Local Search (DLS-MC) and Phased Local
Search (PLS), demonstrating how a simple yet effective multiple independent runs strategy can offer superior speedup performance with minimal
communication overhead. We then extend PLS into an online algorithm
so that it can operate in a dynamic environment where the input graph is
constantly changing, and show that in most cases trajectory continuation
is more efficient than restarting the search from scratch. Finally, we embed
our new algorithm within a data processing pipeline that performs high throughput
correlation and clique-based clustering of thousands of variables
from a high-frequency data stream. For communication within and between
system components, we use MPI, the de-facto standard API for message
passing in high-performance computing. We present algorithmic and system
performance results using synthetically generated data streams, as well as a
preliminary investigation into the applicability of our system for processing
high-frequency, real-life intra-day stock market data in order to determine
clusters of stocks exhibiting highly correlated short-term trading patterns.
|
Genre | |
Type | |
Language |
eng
|
Date Available |
2007-08-22
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
Attribution-NonCommercial-NoDerivatives 4.0 International
|
DOI |
10.14288/1.0051985
|
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor |
University of British Columbia
|
Graduation Date |
2007-05
|
Campus | |
Scholarly Level |
Graduate
|
Rights URI | |
Aggregated Source Repository |
DSpace
|
Item Media
Item Citations and Data
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International