The detection and testing of multivariate outliers

UBC Theses and Dissertations

The detection and testing of multivariate outliers White, Richard Alan

Abstract

The classical estimators of multivariate location and scatter for the normal model are the sample mean and sample covariance. However, if outliers are present in the data, the classical estimates can be very inaccurate and robust estimates should be used in their place. Most multivariate robust estimators are very difficult, if not impossible, to compute, which limits their use. I will present some simple approximations that make these estimators computable. Robust estimation downweights or completely ignores the outliers. This is not always best, because the outliers can contain very important information about the population. If they can be identified, the outliers can be investigated further and appropriate action can be taken based on the results. To detect outliers, a sequential multivariate scale-ratio test is proposed. It is based on a non-robust estimate and a robust estimate of scatter and is applied in a forward fashion, removing the most extreme point at each step, until the test fails to indicate the presence of outliers. We will show that this procedure has level c when applied to an uncontaminated sample, is unaffected by swamping or masking, and is accurate in detecting outliers. Finally, we will apply the scale-ratio test to several data sets and compare it to the sequential Wilks' outlier test as proposed by C. Caroni and P. Prescott in 1992.
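The forward procedure described in the abstract can be illustrated with a minimal sketch. It is not the thesis's test: the abstract does not give the estimators, the statistic, or the critical values, so everything specific here is an assumption made for illustration. The sketch is restricted to bivariate data, uses the determinant of the sample covariance as the non-robust scatter measure, uses coordinate-wise normalized MADs as a crude robust stand-in (which ignores correlation), picks the "most extreme" point by MAD-scaled distance from the coordinate-wise median, and takes a hypothetical `threshold` in place of a properly calibrated critical value. The function names and the `min_keep` parameter are likewise hypothetical.

```python
from statistics import mean, median

MAD_TO_SD = 1.4826  # makes the MAD consistent with the standard deviation under normality


def classical_det(points):
    """Determinant of the classical 2x2 sample covariance matrix."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx * syy - sxy ** 2


def robust_center_and_scales(points):
    """Coordinate-wise medians and normalized MADs (assumed robust stand-in)."""
    centers, scales = [], []
    for j in (0, 1):
        col = [p[j] for p in points]
        med = median(col)
        mad = MAD_TO_SD * median(abs(v - med) for v in col)
        if mad == 0:
            raise ValueError("MAD is zero; this simple stand-in cannot be used")
        centers.append(med)
        scales.append(mad)
    return centers, scales


def sequential_scale_ratio(points, threshold, min_keep=4):
    """Return indices flagged as outliers, in order of removal.

    `threshold` is a hypothetical cutoff, not a calibrated critical value.
    """
    remaining = list(range(len(points)))
    flagged = []
    while len(remaining) > min_keep:
        current = [points[i] for i in remaining]
        centers, scales = robust_center_and_scales(current)
        # Determinant of the diagonal robust scatter matrix diag(s0^2, s1^2).
        robust_det = (scales[0] * scales[1]) ** 2
        ratio = classical_det(current) / robust_det
        if ratio <= threshold:
            break  # the ratio no longer indicates outliers: stop removing

        def dist(i):
            # Squared MAD-scaled distance from the coordinate-wise median.
            return sum(((points[i][j] - centers[j]) / scales[j]) ** 2 for j in (0, 1))

        worst = max(remaining, key=dist)  # most extreme remaining point
        remaining.remove(worst)
        flagged.append(worst)
    return flagged


# Nine clean points on a grid plus two planted outliers (indices 9 and 10).
grid = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
data = grid + [(10, 12), (12, 10)]
flagged = sequential_scale_ratio(data, threshold=2.0)
print(sorted(flagged))
```

Because the robust quantities are recomputed after every removal, a second outlier cannot hide behind the first in this toy example. The level of a real procedure would depend on critical values derived for the actual statistic, which this sketch does not attempt.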

Item Metadata

Title	The detection and testing of multivariate outliers
Creator	White, Richard Alan
Publisher	University of British Columbia
Date Issued	1992
Extent	1158229 bytes
Genre	Thesis/Dissertation
Type	Text
File Format	application/pdf
Language	eng
Date Available	2008-12-16
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0086547
URI	http://hdl.handle.net/2429/2952
Degree (Theses)	Master of Science - MSc
Program (Theses)	Statistics
Affiliation	Science, Faculty of; Statistics, Department of
Degree Grantor	University of British Columbia
Graduation Date	1992-11
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

ubc_1992_fall_white_richard_alan.pdf -- 1.1MB
