BIRS Workshop Lecture Videos
Detecting cellwise outliers
Rousseeuw, Peter
A multivariate dataset consists of n observations in p dimensions, and is often stored in an n by p matrix X. Robust statistics has mostly focused on identifying and downweighting outlying rows of X, called rowwise or casewise outliers. However, downweighting an entire row when only one (or a few) of its cells deviate entails a huge loss of information. Also, in high-dimensional data the majority of the rows may contain a few contaminated cells, which yields a loss of robustness as well. Recently, new robust methods have been developed for datasets with missing values and with cellwise outliers, also called elementwise outliers. We will explore the detection of cellwise outliers, and compare the misclassification rates between methods by means of simulations in which the data contain cellwise outliers, rowwise outliers, or both simultaneously. The result of a cellwise outlier detection rule can also be used in a second step that robustly estimates the underlying location and scatter matrix. We will compare the accuracy of a few such two-step procedures.
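To make the two central points of the abstract concrete, here is a minimal Python sketch. It is not one of the methods compared in the talk. The first function assumes cells are contaminated independently with probability eps, in which case a row of p cells contains at least one contaminated cell with probability 1 - (1 - eps)^p (roughly 64% for eps = 0.05 and p = 20). The second function is a deliberately naive baseline that flags cells by a columnwise robust z-score (median and MAD); it ignores correlations between variables. The function names, the cutoff of 3, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def contaminated_row_fraction(eps, p):
    """Probability that a row of p cells has at least one contaminated cell,
    assuming each cell is contaminated independently with probability eps."""
    return 1.0 - (1.0 - eps) ** p

def flag_cells(X, cutoff=3.0):
    """Naive cellwise rule (illustrative only): flag a cell when its
    columnwise robust z-score exceeds `cutoff` in absolute value.
    Missing cells (NaN) are never flagged."""
    X = np.asarray(X, dtype=float)
    med = np.nanmedian(X, axis=0)
    # 1.4826 makes the MAD consistent for the standard deviation at the normal model
    mad = 1.4826 * np.nanmedian(np.abs(X - med), axis=0)
    z = (X - med) / mad
    return np.abs(z) > cutoff

# Simulated example (hypothetical settings): clean standard normal data
# with about 5% of the cells replaced by the value 10.
rng = np.random.default_rng(0)
n, p, eps = 200, 20, 0.05
X = rng.standard_normal((n, p))
mask = rng.random((n, p)) < eps   # which cells are contaminated
X[mask] = 10.0

flags = flag_cells(X)             # boolean n by p matrix of flagged cells
row_fraction = contaminated_row_fraction(eps, p)
```

A rule like this only catches cells that are extreme within their own column. Cells that are unremarkable marginally but inconsistent with the other cells in the same row need a method that uses the relations between variables.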
Attribution-NonCommercial-NoDerivatives 4.0 International