Open Collections
UBC Theses and Dissertations
Mining unstructured social streams : cohesion, context and evolution
Li, Pei
Abstract
As social websites like Twitter greatly influence people's digital lives, unstructured social streams have become prevalent: fast-surging streams of textual posts with no formal structure or schema, either between posts or inside the post content. Modeling and mining unstructured social streams in Twitter has become a challenging and fundamental problem in social web analysis, with numerous applications, e.g., recommending social feeds that answer "what's happening right now?" or "what are related stories?". Current social stream analysis merely returns an overwhelming list of posts in response to a query, with little aggregation or semantics. The design of next-generation social stream mining algorithms faces various challenges, especially the effective organization of meaningful information from noisy, unstructured, and streaming social content. The goal of this dissertation is to address the most critical challenges in social stream mining using graph-based techniques. We model a social stream as a post network, and use "event" and "story" to capture groups of aggregated social posts presenting similar content at different granularities, where an event may contain a series of stories. We highlight our contributions to social stream mining from a structural perspective as follows. We first model a story as a quasi-clique, which is cohesion-persistent regardless of the story size, and propose two solutions, DIM and SUM, that search for the largest story containing given query posts by deterministic and stochastic means, respectively. To detect all stories in a time window of a social stream and support context-aware storytelling, we propose CAST, which defines a story as a (k,d)-Core in the post network and tracks the relatedness between stories.
We propose Incremental Cluster Evolution Tracking (ICET), an incremental computation framework for event evolution on evolving post networks, which can track the evolution patterns of social events as time rolls on. The approaches in this dissertation are based on two hypotheses: users prefer correlated posts to individual posts in post stream modeling, and a structural approach is better than frequency/LDA-based approaches in event and story modeling. We verify these hypotheses through crowdsourcing-based user studies.
Item Metadata
Title | Mining unstructured social streams : cohesion, context and evolution |
Creator | |
Publisher | University of British Columbia |
Date Issued | 2017 |
Genre | |
Type | |
Language | eng |
Date Available | 2017-03-23 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0343307 |
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor | University of British Columbia |
Graduation Date | 2017-05 |
Campus | |
Scholarly Level | Graduate |
Rights URI | |
Aggregated Source Repository | DSpace |