BIRS Workshop Lecture Videos
Adaptive choice of parameters in Robust Clustering for model based clustering
Garcia-Escudero, Luis Angel
Outliers can be extremely harmful when applying well-known cluster analysis methods. Moreover, clustered outliers can also be troublesome for traditional robust techniques. Therefore, developing appropriate robust clustering methods could help address both types of problems simultaneously. The TCLUST method is a flexible way of doing robust cluster analysis by resorting to trimming. This methodology can be implemented using the tclust package available at the CRAN repository. Its high flexibility makes it possible to deal with clusters that are not necessarily spherical and to cope with different amounts and types of contamination. However, because of this flexibility, using the methodology in real data applications is not completely straightforward and requires specifying some tuning parameters (the number of clusters, the trimming proportion, and a constant constraining the relative cluster shapes and sizes). A fully automatic way to choose all these parameters simultaneously is not feasible, given their dependence on the desired type of cluster partition; that is, the user of any clustering method must always play an active role by specifying the type of clusters that he/she is particularly interested in. We will present some new graphical and automated procedures that may help the user make this specification more easily when applying TCLUST. These procedures allow TCLUST to be initialized with less risky parameter configurations, which can later be adapted to the data set at hand.
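To make the role of the trimming proportion concrete, here is a minimal Python sketch of trimmed k-means, a simpler relative of TCLUST that assumes spherical clusters and has no shape or size constraint. It is not the TCLUST algorithm and does not reproduce the interface of the tclust package; the function name `trimmed_kmeans`, its arguments, and the toy data are all hypothetical and chosen only for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def _one_start(points, k, n_keep, n_iter, rng):
    """Run trimmed k-means from one random start (hypothetical helper)."""
    n = len(points)
    centers = rng.sample(points, k)
    labels = [0] * n
    for _ in range(n_iter):
        # Squared distance from each point to its nearest center.
        nearest = []
        for p in points:
            d = [sq_dist(p, c) for c in centers]
            j = min(range(k), key=d.__getitem__)
            nearest.append((d[j], j))
        # Trimming step: keep only the n_keep points closest to a center.
        order = sorted(range(n), key=lambda i: nearest[i][0])
        kept = set(order[:n_keep])
        new_labels = [nearest[i][1] if i in kept else -1 for i in range(n)]
        # Update each center as the mean of its untrimmed members.
        new_centers = []
        for j in range(k):
            members = [points[i] for i in range(n) if new_labels[i] == j]
            if members:
                dim = len(members[0])
                new_centers.append(tuple(
                    sum(m[t] for m in members) / len(members)
                    for t in range(dim)))
            else:
                new_centers.append(centers[j])  # empty cluster: keep old center
        converged = new_labels == labels and new_centers == centers
        labels, centers = new_labels, new_centers
        if converged:
            break
    objective = sum(sq_dist(points[i], centers[labels[i]])
                    for i in range(n) if labels[i] != -1)
    return objective, centers, labels


def trimmed_kmeans(points, k, alpha, n_starts=30, n_iter=100, seed=0):
    """Return (centers, labels); label -1 marks a trimmed observation.

    k is the number of clusters and alpha the trimming proportion.
    The best of n_starts random starts (lowest objective) is returned.
    """
    rng = random.Random(seed)
    n_keep = len(points) - int(alpha * len(points))
    best = min((_one_start(points, k, n_keep, n_iter, rng)
                for _ in range(n_starts)), key=lambda r: r[0])
    return best[1], best[2]


# Toy data (made up): two tight groups plus two isolated outliers.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0), (10.5, 10.5),
        (100.0, -100.0), (-80.0, 90.0)]
centers, labels = trimmed_kmeans(data, k=2, alpha=0.2)
```

With `alpha=0.2`, two of the twelve observations are discarded on every iteration, so the outliers cannot drag the centers. Changing `k` or `alpha` changes which partition counts as the best one, which is the dependence on user choices that the abstract describes. TCLUST additionally needs the third parameter, the constant constraining relative cluster shapes and sizes, which this sketch omits.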
Attribution-NonCommercial-NoDerivatives 4.0 International