UBC Theses and Dissertations

Query-driven event search in news information network

Chen, Shanshan

Abstract

Traditional search focuses on keyword matching and document ranking, so users who enter keywords whose potential connections they want to explore receive only an overload of related news articles or videos, with little semantic aggregation. One good way to capture these connections is to model news articles as a complex network of events, where an event is itself a complex network of interrelated actions (or assertions). Event search gives users a big picture of the essential actions involved without requiring them to go through an overwhelming number of documents. Furthermore, query-driven event search, in which users can access key actions that occurred in various events with respect to an input query and construct a new event capturing the corresponding actions and the connections between them, can fundamentally shift search from traditional keyword matching to a higher level of abstraction. We therefore propose a natural and useful paradigm, Query-driven event search in News Information Network, which allows users to search through news articles and obtain events as answers. We construct the news information network with nodes representing actions and edges indicating relatedness between nodes. We define a good answer as a structurally densely linked and semantically related subgraph that describes a concise and informative event. Our objective function therefore combines the linkage structure of the graph with semantic content relatedness. We then formally define our problem: to build a query-driven event search engine that processes a user's query and generates a subgraph satisfying the following conditions: 1) it covers all keywords without including noisy nodes, and 2) it maximizes the objective function. We prove that this problem is NP-complete and propose heuristic algorithms. Our experimental evaluation on real-life news datasets demonstrates the efficiency of the algorithms and the meaningfulness of the solutions we obtain.
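To make the problem statement concrete, the following is a minimal sketch of one way such a search could be set up. It is not the algorithm or the objective function from the thesis, which the abstract does not specify. The toy network, the node names, the edge weights, the `alpha` trade-off, the particular score (edge density blended with mean edge relatedness), and the greedy peeling strategy are all assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical toy network: each node is an action with the keywords it mentions.
NODE_KEYWORDS = {
    "a1": {"election", "candidate"},
    "a2": {"candidate", "debate"},
    "a3": {"debate", "poll"},
    "a4": {"weather"},  # a noisy node unrelated to the query
}
# Hypothetical relatedness weights in [0, 1] on undirected edges.
EDGE_WEIGHT = {
    frozenset(("a1", "a2")): 0.9,
    frozenset(("a2", "a3")): 0.8,
    frozenset(("a1", "a3")): 0.6,
    frozenset(("a3", "a4")): 0.1,
}


def covers(nodes, query):
    """True if every query keyword appears in at least one selected node."""
    found = set().union(*(NODE_KEYWORDS[n] for n in nodes)) if nodes else set()
    return query <= found


def score(nodes, alpha=0.5):
    """Assumed objective: alpha * edge density + (1 - alpha) * mean edge weight."""
    pairs = [frozenset(p) for p in combinations(sorted(nodes), 2)]
    if not pairs:
        return 0.0
    weights = [EDGE_WEIGHT[p] for p in pairs if p in EDGE_WEIGHT]
    density = len(weights) / len(pairs)
    relatedness = sum(weights) / len(weights) if weights else 0.0
    return alpha * density + (1 - alpha) * relatedness


def greedy_event(query, alpha=0.5):
    """Greedy peeling heuristic (an assumption, not the thesis's method):
    repeatedly drop the node whose removal most improves the score,
    provided the remaining nodes still cover every query keyword."""
    current = set(NODE_KEYWORDS)
    while True:
        best, best_score = None, score(current, alpha)
        for n in sorted(current):
            rest = current - {n}
            if covers(rest, query) and score(rest, alpha) > best_score:
                best, best_score = rest, score(rest, alpha)
        if best is None:
            return current
        current = best


event = greedy_event({"election", "debate"})
```

On this toy input the heuristic first discards the weakly connected `a4` and then `a3`, leaving a small, tightly related pair of actions that still covers both query keywords. A greedy procedure of this kind gives no optimality guarantee, which is in keeping with the abstract's point that the exact problem is NP-complete.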

Rights

Attribution-NonCommercial-NoDerivs 2.5 Canada