Open Collections
UBC Faculty Research and Publications
diceR: an R package for class discovery using an ensemble driven approach
Chiu, Derek S; Talhouk, Aline
Abstract
Background: Given a set of features, researchers are often interested in partitioning objects into homogeneous clusters. In health research, and cancer research in particular, high-throughput data is collected with the aim of segmenting patients into sub-populations to aid in disease diagnosis, prognosis, or response to therapy. Cluster analysis, a class of unsupervised learning techniques, is often used for class discovery. It suffers from some limitations, including the need to select up front both the algorithm to be used and the number of clusters to generate. In addition, several groupings may be consistent with the data, making it very difficult to validate a final solution. Ensemble clustering is a technique used to mitigate these limitations and facilitate the generalization and reproducibility of findings in new cohorts of patients.

Results: We introduce diceR (diverse cluster ensemble in R), a software package available on CRAN: https://CRAN.R-project.org/package=diceR

Conclusions: diceR is designed to provide a set of tools to guide researchers through a general cluster analysis process that relies on minimizing subjective decision-making. Although developed in a biological context, the tools in diceR are data-agnostic and thus can be applied in different contexts.
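To make the idea of ensemble clustering concrete, the following is a minimal sketch in Python, not diceR's own R interface or its actual consensus methods. It assumes several clustering runs have already produced label vectors for the same objects, builds a co-association matrix (the fraction of runs in which two objects share a cluster), and then groups objects that co-cluster in more than half of the runs. The function names `co_association` and `consensus_labels`, the 0.5 threshold, and the toy partitions are all illustrative assumptions.

```python
def co_association(partitions):
    """Return an n x n matrix: fraction of partitions where objects i and j share a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    runs = len(partitions)
    return [[c / runs for c in row] for row in counts]


def consensus_labels(partitions, threshold=0.5):
    """Group objects whose co-association exceeds `threshold` (illustrative rule only)."""
    n = len(partitions[0])
    co = co_association(partitions)
    parent = list(range(n))  # union-find forest over the objects

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if co[i][j] > threshold:
                parent[find(j)] = find(i)

    # Relabel the roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical label vectors from three clustering runs on six objects.
# Label names differ between runs; only co-membership matters.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [2, 2, 2, 5, 5, 5],
]
print(consensus_labels(partitions))  # prints [0, 0, 0, 1, 1, 1]
```

In this toy input the second run places the third object in the other group, but the other two runs outvote it, which is the sense in which an ensemble reduces dependence on any single algorithm or parameter choice.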
Item Metadata
Title | diceR: an R package for class discovery using an ensemble driven approach
Creator | Chiu, Derek S; Talhouk, Aline
Publisher | BioMed Central
Date Issued | 2018-01-15
Language | eng
Date Available | 2018-05-14
Provider | Vancouver : University of British Columbia Library
Rights | Attribution 4.0 International (CC BY 4.0)
DOI | 10.14288/1.0366289
Citation | BMC Bioinformatics. 2018 Jan 15;19(1):11
Publisher DOI | 10.1186/s12859-017-1996-y
Peer Review Status | Reviewed
Scholarly Level | Faculty
Copyright Holder | The Author(s).
Aggregated Source Repository | DSpace