Modeling zero inflated count data

UBC Theses and Dissertations

Modeling zero inflated count data Garden, Cheryl Ellen

Abstract

A natural approach to analyzing the effect of covariates on a count response variable is to use a Poisson regression model. A complication is that the counts are often more variable than can be explained by a Poisson model. This problem, referred to as overdispersion, has received a great deal of attention in recent literature, and a number of variations on the Poisson regression model have been developed. As a result, statistical consultants face the difficult task of identifying which of these alternative models is best suited to their particular application. In this thesis, two applications where the data exhibit overdispersion are investigated. In the first application, two treatments for chronic urinary tract infections are compared. The response variable represents the number of resistant strains of bacteria cultured from rectal swabs. In the second application, the number of units sold of a product is modeled as depending on two factors representing the day of the week and the store. Two alternative models that allow for overdispersion are used in both applications: the negative binomial regression model and the zero inflated Poisson regression model, so named by Lambert (Lambert, 1992). Both provide improved fits. Further, the zero inflated Poisson regression model performs particularly well when the overdispersion is suspected to be due to a large number of zeroes occurring in the data. The zero inflated Poisson regression model allows one both to fit the data well and to make some inference regarding the nature of the overdispersion present. This little-known model may prove valuable, as there exist a number of applications where observed overdispersion in a count response variable is clearly due to an inflated number of zeroes.
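As an illustration of why excess zeroes produce overdispersion, the following is a minimal sketch of a zero inflated Poisson distribution without covariates. It is not taken from the thesis: the function names, the parameter names `lam` (Poisson mean) and `p` (probability of a structural zero), and the numeric values are assumptions chosen for the example. A Poisson variable has variance equal to its mean, whereas the mixture below has variance larger than its mean whenever `p` is greater than zero.

```python
import math


def poisson_pmf(k, lam):
    """Probability that a Poisson(lam) variable equals k."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, p):
    """Zero inflated Poisson probability of k.

    With probability p the count is a structural zero; otherwise it is
    drawn from Poisson(lam). Names and parametrization are illustrative.
    """
    base = (1.0 - p) * poisson_pmf(k, lam)
    return p + base if k == 0 else base


def zip_mean_var(lam, p):
    """Closed-form mean and variance of the mixture defined in zip_pmf."""
    mean = (1.0 - p) * lam
    var = (1.0 - p) * lam * (1.0 + p * lam)
    return mean, var


# Hypothetical parameter values, chosen only for illustration.
lam, p = 3.0, 0.4
mean, var = zip_mean_var(lam, p)
print("P(Y = 0) under Poisson:", poisson_pmf(0, lam))
print("P(Y = 0) under ZIP:    ", zip_pmf(0, lam, p))
print("ZIP mean, variance:    ", mean, var)
print("variance / mean:       ", var / mean)
```

In a regression setting, `lam` and `p` would each depend on covariates, for example through log and logit links; that step is omitted here to keep the sketch short.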

Item Metadata

Title	Modeling zero inflated count data
Creator	Garden, Cheryl Ellen
Publisher	University of British Columbia
Date Issued	1996
Extent	3575052 bytes
Genre	Thesis/Dissertation
Type	Text
File Format	application/pdf
Language	eng
Date Available	2009-02-11
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0099036
URI	http://hdl.handle.net/2429/4495
Degree (Theses)	Master of Science - MSc
Program (Theses)	Statistics
Affiliation	Science, Faculty of; Statistics, Department of
Degree Grantor	University of British Columbia
Graduation Date	1996-05
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

ubc_1996-0219.pdf -- 3.41MB
