Efficient data mining of constrained association rules

UBC Theses and Dissertations

Pang, Chiu Yan (Alex)

Abstract

With recent advances in information technology, companies are collecting more and more data related to their business, and they are very interested in decision support systems that can discover knowledge from that data and help them gain insight into it. Data mining, which aims to discover non-trivial information or patterns hidden in large databases, has therefore recently become one of the most active research areas in database technology. Association rules relate items that tend to occur together in a given event or record, and mining them is one of the most important problems in data mining. However, the current framework suffers seriously from a lack of user interaction and focus. In this thesis, we propose a new paradigm called Constrained Association Rules, in which (i) the mining of the rules is divided into two phases with various breakpoints for user feedback, and (ii) users can associate constraints with their queries. We analyze many SQL-style constraints and introduce the notions of succinctness and anti-monotonicity for their classification. We design a new algorithm called CAP for mining association rules that satisfy a set of given constraints. The idea is to check for satisfaction of the constraints as early as possible by exploiting their anti-monotonicity and succinctness. Several optimization techniques are developed. Our experimental evaluation indicates that CAP runs much faster than several basic algorithms, sometimes by as much as a factor of 80.
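The idea of checking constraints as early as possible can be illustrated with a small sketch. This is not the CAP algorithm from the thesis; it is a generic level-wise (Apriori-style) miner that assumes the constraint is anti-monotone in the usual sense, meaning that if an itemset violates it, every superset violates it as well. Under that assumption, violating candidates can be discarded before their support is ever counted. The function names, item names, prices, and budget below are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, constraint=None):
    """Level-wise mining with an optional anti-monotone constraint.

    `constraint` is a hypothetical predicate on a frozenset of items. It is
    assumed to be anti-monotone: once an itemset violates it, every superset
    does too, so violating candidates are dropped before support counting.
    """
    transactions = [frozenset(t) for t in transactions]
    ok = constraint or (lambda itemset: True)

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions)

    # Level 1: single items that satisfy the constraint and are frequent.
    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items if ok(frozenset([i]))}
    level = {c for c in level if support(c) >= min_support}

    result = set(level)
    k = 2
    while level:
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must be frequent, and the constraint
        # is checked here, before any pass over the transactions.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
            and ok(c)
        }
        level = {c for c in candidates if support(c) >= min_support}
        result |= level
        k += 1
    return result


# Hypothetical data: item prices and a handful of market baskets.
PRICES = {"bread": 2, "milk": 3, "cheese": 6, "wine": 12}
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "milk", "cheese"},
    {"bread", "cheese", "wine"},
    {"milk", "cheese", "wine"},
    {"bread", "milk", "cheese", "wine"},
]


def within_budget(itemset, budget=10):
    # With non-negative prices, "total price <= budget" is anti-monotone:
    # adding items can only increase the total.
    return sum(PRICES[i] for i in itemset) <= budget


unconstrained = frequent_itemsets(TRANSACTIONS, min_support=3)
constrained = frequent_itemsets(TRANSACTIONS, min_support=3,
                                constraint=within_budget)
```

For an anti-monotone constraint, pruning during candidate generation should return the same itemsets as mining without the constraint and filtering afterwards, while counting support for fewer candidates. In this toy data, for instance, no itemset containing "wine" is ever counted in the constrained run.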

Item Metadata

Title	Efficient data mining of constrained association rules
Creator	Pang, Chiu Yan (Alex)
Publisher	University of British Columbia
Date Issued	1998
Extent	3271577 bytes
Genre	Thesis/Dissertation
Type	Text
File Format	application/pdf
Language	eng
Date Available	2009-05-26
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0051263
URI	http://hdl.handle.net/2429/8197
Degree	Master of Science - MSc
Program	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	1998-11
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

ubc_1998-0571.pdf -- 3.12MB
