Bump hunting in regression revisited

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

Bump hunting in regression revisited Harezlak, Jaroslaw

Abstract

Suppose bivariate data [formula] are observed at times a ≤ t1 ≤ t2 ≤ ... ≤ tn ≤ b. Given a nonparametric regression model [formula] mean 0, variance ϭ² , for i = 1, 2 , . . . , n, we want to estimate the number of modes of the underlying regression function m(-) or its derivative. We use the penalized least squares technique to get an estimate of m(-), i.e. the function minimizing [formula] where [formula] dt is a penalty function and ƛ is a smoothing parameter. The estimate of the derivative, m’ is simply the derivative of the estimate, (m’). A new test of multimodality is introduced and its performance is studied. Our idea is motivated by the test proposed by Silverman (1981) concerning the number of modes in the density function. He used a "critical bandwidth" as a test statistic in a kernel smoothing context. He noted that if the data are strongly bimodal, we would need a large value of a bandwidth to obtain a unimodal density estimate. In our case we define the "critical smoothing parameter" Xcrit as the smallest ƛ giving an estimate with the specified number of modes. We use Ac r ; 4 as a test statistic in our new test CriSP. We use bootstrap techniques to assess the performance of our test. We study the effects of the penalty L on the quality of our test via simulation using different regression functions and we compare it with Bowman et.al.'s monotonicity test (1998). CriSP is also applied to the children's growth data in studying the number of bumps in the derivatives of the growth functions. In a sample of 43 boys and 50 girls, our test procedure gives an automatic classification rule in about 80% of the growth curves analyzed.

Item Metadata

Title	Bump hunting in regression revisited
Creator	Harezlak, Jaroslaw
Publisher	University of British Columbia
Date Issued	1998
Description	Suppose bivariate data [formula] are observed at times a ≤ t1 ≤ t2 ≤ ... ≤ tn ≤ b. Given a nonparametric regression model [formula] mean 0, variance ϭ² , for i = 1, 2 , . . . , n, we want to estimate the number of modes of the underlying regression function m(-) or its derivative. We use the penalized least squares technique to get an estimate of m(-), i.e. the function minimizing [formula] where [formula] dt is a penalty function and ƛ is a smoothing parameter. The estimate of the derivative, m’ is simply the derivative of the estimate, (m’). A new test of multimodality is introduced and its performance is studied. Our idea is motivated by the test proposed by Silverman (1981) concerning the number of modes in the density function. He used a "critical bandwidth" as a test statistic in a kernel smoothing context. He noted that if the data are strongly bimodal, we would need a large value of a bandwidth to obtain a unimodal density estimate. In our case we define the "critical smoothing parameter" Xcrit as the smallest ƛ giving an estimate with the specified number of modes. We use Ac r ; 4 as a test statistic in our new test CriSP. We use bootstrap techniques to assess the performance of our test. We study the effects of the penalty L on the quality of our test via simulation using different regression functions and we compare it with Bowman et.al.'s monotonicity test (1998). CriSP is also applied to the children's growth data in studying the number of bumps in the derivatives of the growth functions. In a sample of 43 boys and 50 girls, our test procedure gives an automatic classification rule in about 80% of the growth curves analyzed.
Extent	5206705 bytes
Genre	Thesis/Dissertation
Type	Text
File Format	application/pdf
Language	eng
Date Available	2009-05-25
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0088542
URI	http://hdl.handle.net/2429/8123
Degree	Master of Science - MSc
Program	Statistics
Affiliation	Science, Faculty of; Statistics, Department of
Degree Grantor	University of British Columbia
Graduation Date	1998-11
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

ubc_1998-0458.pdf -- 4.97MB

Item Citations and Data

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.

Open Collections

UBC Theses and Dissertations

Bump hunting in regression revisited Harezlak, Jaroslaw

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights