Two problems in dynamic graph analytics : maximum flow and online partitioning

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

Two problems in dynamic graph analytics : maximum flow and online partitioning Luo, Juntong

Abstract

Many real-world systems, such as social networks and financial transactions, evolve quickly over time and can be modelled as dynamic graphs. Dynamic graph processing systems emerged to provide timely analysis on these graphs, ingesting graph updates through one or more streams, and providing (on-demand or regularly) answers to a standing query. This thesis explores both the algorithmic and the infrastructural aspects of dynamic graph processing. On the algorithm side, this thesis proposes the vertex-local invariant restoration approach for designing dynamic graph algorithms. With this approach, we propose a novel algorithm for the maximum flow problem, targeting graphs that evolve quickly with millions of edge changes per second. The algorithm works well on a shared-nothing, asynchronous computational model in which the graph topology is updated concurrently as the algorithm executes. We prove that the algorithm is correct and provide a thorough experimental evaluation with difficult cases on large real-world dynamic graphs, showing that the algorithm obtains high throughputs, supports both additions and deletions efficiently, and provides results with low latency and high stability. On the infrastructure side, this thesis explores a new problem that appears when processing dynamic graphs: online partitioning. A common model in dynamic graph processing is that the system ingests a stream of edge updates (edge add/remove/modify events), and creates and assigns a vertex—and its edge list—to a partition when first observing an edge that references the vertex. The common solution is to use random partitioning—vertices assigned to partitions based on hashes of their IDs, but it leads to suboptimal edge-cut minimization and/or load balance. This thesis proposes the Minimum Load with Affinity framework for systematic exploration of the trade-offs in a partitioning strategy: load balance, edge-cut minimization, and ingestion delay. The preliminary results show that MLA can (i) achieve a 7%-14% reduction in edge-cuts without sacrificing load balance (or a much higher reduction by sacrificing load balance), and (ii) achieve both better edge-cut minimization and load balance by ingesting edges in batches and reasoning about batch properties (at the cost of higher ingestion latency).

Item Metadata

Title	Two problems in dynamic graph analytics : maximum flow and online partitioning
Creator	Luo, Juntong
Supervisor	Ripeanu, Matei
Publisher	University of British Columbia
Date Issued	2025
Description	Many real-world systems, such as social networks and financial transactions, evolve quickly over time and can be modelled as dynamic graphs. Dynamic graph processing systems emerged to provide timely analysis on these graphs, ingesting graph updates through one or more streams, and providing (on-demand or regularly) answers to a standing query. This thesis explores both the algorithmic and the infrastructural aspects of dynamic graph processing. On the algorithm side, this thesis proposes the vertex-local invariant restoration approach for designing dynamic graph algorithms. With this approach, we propose a novel algorithm for the maximum flow problem, targeting graphs that evolve quickly with millions of edge changes per second. The algorithm works well on a shared-nothing, asynchronous computational model in which the graph topology is updated concurrently as the algorithm executes. We prove that the algorithm is correct and provide a thorough experimental evaluation with difficult cases on large real-world dynamic graphs, showing that the algorithm obtains high throughputs, supports both additions and deletions efficiently, and provides results with low latency and high stability. On the infrastructure side, this thesis explores a new problem that appears when processing dynamic graphs: online partitioning. A common model in dynamic graph processing is that the system ingests a stream of edge updates (edge add/remove/modify events), and creates and assigns a vertex—and its edge list—to a partition when first observing an edge that references the vertex. The common solution is to use random partitioning—vertices assigned to partitions based on hashes of their IDs, but it leads to suboptimal edge-cut minimization and/or load balance. This thesis proposes the Minimum Load with Affinity framework for systematic exploration of the trade-offs in a partitioning strategy: load balance, edge-cut minimization, and ingestion delay. The preliminary results show that MLA can (i) achieve a 7%-14% reduction in edge-cuts without sacrificing load balance (or a much higher reduction by sacrificing load balance), and (ii) achieve both better edge-cut minimization and load balance by ingesting edges in batches and reasoning about batch properties (at the cost of higher ingestion latency).
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2025-05-01
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0448695
URI	http://hdl.handle.net/2429/90945
Degree	Master of Applied Science - MASc
Program	Electrical and Computer Engineering
Affiliation	Applied Science, Faculty of; Electrical and Computer Engineering, Department of
Degree Grantor	University of British Columbia
Graduation Date	2025-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace

Open Collections

UBC Theses and Dissertations

UBC Theses and Dissertations

Two problems in dynamic graph analytics : maximum flow and online partitioning Luo, Juntong

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights