TY - THES
AU - Garden, Cheryl Ellen
PY - 1996
TI - Modeling zero inflated count data
KW - Thesis/Dissertation
LA - eng
M3 - Text
AB - A natural approach to analyzing the effect of covariates on a count response variable is to
use a Poisson regression model. A complication is that the counts are often more variable than
can be explained by a Poisson model. This problem, referred to as overdispersion, has received a
great deal of attention in recent literature and a number of variations on the Poisson regression
model have been developed. As a result, statistical consultants are faced with the difficult task of
identifying which of these alternative models is best suited to their particular application. In this
thesis, two applications where the data exhibit overdispersion are investigated. In the first
application, two treatments for chronic urinary tract infections are compared. The response
variable represents the number of resistant strains of bacteria cultured from rectal swabs. In the
second application, the number of units of a product sold is modeled as depending on two
factors representing the day of the week and the store.
Two alternative models that allow for overdispersion are used in both applications. The
negative binomial regression model and the zero inflated Poisson regression model, so named by
Lambert (1992), provide improved fits. Further, the zero inflated Poisson regression
model performs particularly well when the overdispersion is suspected to be due
to a large number of zeroes occurring in the data. The zero inflated Poisson regression model
allows one to both fit the data well and make some inference regarding the nature of the
overdispersion present. This little-known model may prove valuable, as there exist a
number of applications where observed overdispersion in a count response variable is clearly due
to an inflated number of zeroes.
UR - https://open.library.ubc.ca/collections/831/items/1.0099036
ER - End of Reference