Visual mining of powersets with large alphabets

UBC Theses and Dissertations

Visual mining of powersets with large alphabets Kong, Qiang

Abstract

We present the PowerSetViewer visualization system for lattice-based mining of powersets. Searching for items within the powerset of a universe occurs in many large dataset knowledge discovery contexts. Using a spatial layout based on a powerset provides a unified visual framework at three different levels: data mining on the filtered dataset, browsing the entire dataset, and comparing multiple datasets sharing the same alphabet. The features of our system allow users to find appropriate parameter settings for data mining algorithms through lightweight visual experimentation showing partial results. We use dynamic constrained frequent-set mining as a concrete case study to showcase the utility of the system. The key challenge for spatial layouts based on powerset structure is in handling large alphabets, since the size of the powerset grows exponentially with the size of the alphabet. We present scalable algorithms for enumerating and displaying datasets containing between 1.5 and 7 million itemsets, and alphabet sizes of over 40,000.
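The scaling problem named in the abstract can be made concrete with a small sketch. It is not taken from the thesis: the ordering (by cardinality, then lexicographically), the 0-based integer encoding of alphabet items, and the function names `powerset_size` and `itemset_rank` are all assumptions chosen for illustration. The sketch shows how quickly the powerset outgrows any explicit enumeration, and how one itemset's position in such an ordering could be computed without listing the sets before it.

```python
from itertools import combinations
from math import comb


def powerset_size(alphabet_size: int) -> int:
    """Number of subsets of an alphabet with `alphabet_size` items (2**n)."""
    return 1 << alphabet_size


def itemset_rank(itemset, alphabet_size: int) -> int:
    """Position of `itemset` when all subsets of {0, ..., n-1} are ordered
    by cardinality first and lexicographically within each cardinality.

    Hypothetical ordering for illustration; the thesis may use another one.
    """
    items = sorted(itemset)
    k = len(items)
    # Skip every subset that has fewer items than this one.
    rank = sum(comb(alphabet_size, j) for j in range(k))
    prev = -1
    for i, x in enumerate(items):
        # Count k-subsets that share the first i items but continue with a
        # smaller item v; the remaining k-i-1 items come from those above v.
        for v in range(prev + 1, x):
            rank += comb(alphabet_size - v - 1, k - i - 1)
        prev = x
    return rank


if __name__ == "__main__":
    # Order for n = 3: {}, {0}, {1}, {2}, {0,1}, {0,2}, {1,2}, {0,1,2}
    print(itemset_rank({0, 2}, 3))             # 5
    # For a 40,000-item alphabet the powerset size needs 40,001 bits.
    print(powerset_size(40000).bit_length())   # 40001
```

The inner loop walks over every skipped item, which is fine for a demonstration but would need prefix sums or a similar shortcut at the alphabet sizes the abstract reports.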

Item Metadata

Title	Visual mining of powersets with large alphabets
Creator	Kong, Qiang
Publisher	University of British Columbia
Date Issued	2006
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2010-01-06
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0051718
URI	http://hdl.handle.net/2429/17553
Degree	Master of Science - MSc
Program	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2006-05
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

ubc_2006-0062.pdf -- 11.5MB
