BIRS Workshop Lecture Videos

Banff International Research Station

Investigation of Code Tables to compress and describe the underlying characteristics of binary databases

Hess, Sibylle

Description

We survey the spectrum of methods, from frequent pattern mining to numerical optimization, for extracting the pattern set that best describes a binary database. Invoking the Minimum Description Length (MDL) principle, this objective can be stated as: find the code table that compresses the database the most. A particularly interesting interpretation of this task, relating it to biclustering, arises from its formulation as a matrix factorisation problem. Biclustering has a variety of applications in research fields such as collaborative filtering, gene expression analysis and text mining. The derived matrix factorisation analogy provides a new perspective on distinct data mining subfields, unifying biclustering with pattern mining concepts such as Krimp, and initiates a cross-over of their applications and of the interpretations of the derived models.
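To make the code table idea concrete, the following is a minimal sketch in Python and not the method presented in the lecture. It assumes a Krimp-style greedy cover (patterns are tried in table order and must not overlap), Shannon-optimal code lengths derived from usage counts, and it counts only the encoded data, ignoring the cost of the code table itself that a full MDL score would add. The toy database and the names `cover`, `encoded_size` and `reconstruct` are hypothetical. The last function illustrates the matrix factorisation view: the binary data matrix D equals the product of a usage matrix Y (transactions by patterns) and a pattern matrix X (patterns by items).

```python
from math import log2


def cover(transaction, code_table):
    """Greedily cover a transaction with non-overlapping patterns, in table order."""
    remaining = set(transaction)
    used = []
    for idx, pattern in enumerate(code_table):
        if pattern and pattern <= remaining:
            used.append(idx)
            remaining -= pattern
    # Assumption: the table contains every singleton, so a cover always exists.
    assert not remaining, "code table must contain all singleton patterns"
    return used


def encoded_size(database, code_table):
    """Bits for the data part only, using code length -log2(usage / total usage)."""
    covers = [cover(t, code_table) for t in database]
    usage = [0] * len(code_table)
    for c in covers:
        for idx in c:
            usage[idx] += 1
    total = sum(usage)
    bits = -sum(u * log2(u / total) for u in usage if u > 0)
    return bits, covers


def reconstruct(covers, code_table, items):
    """Matrix view: D = Y X, with Y the usage matrix and X the pattern matrix."""
    X = [[int(i in p) for i in items] for p in code_table]
    Y = [[int(k in c) for k in range(len(code_table))] for c in covers]
    return [
        [sum(Y[r][k] * X[k][j] for k in range(len(X))) for j in range(len(items))]
        for r in range(len(Y))
    ]


# Hypothetical toy database over four items.
items = ["a", "b", "c", "d"]
database = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"c", "d"}, {"a", "b", "c"}]
D = [[int(i in t) for i in items] for t in database]

# Baseline: a code table holding only singleton patterns.
singletons = [frozenset({i}) for i in items]
baseline_bits, _ = encoded_size(database, singletons)

# Candidate: add the frequent pattern {a, b} in front of the singletons.
table = [frozenset({"a", "b"})] + singletons
bits, covers = encoded_size(database, table)

print(f"singletons only: {baseline_bits:.2f} bits")
print(f"with pattern ab: {bits:.2f} bits")
print("D == Y X:", reconstruct(covers, table, items) == D)
```

Because the cover is non-overlapping, the ordinary and the Boolean matrix product coincide here; rows of Y that share a pattern, together with the items of that pattern in X, form one bicluster of ones in D.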

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International