UBC Faculty Research and Publications

A proposed case-control framework to probabilistically classify individual deaths as expected or excess… Henderson, Sarah B; Gauld, Jillian S; Rauch, Stephen A; McLean, Kathleen E; Krstic, Nikolas; Hondula, David M; Kosatsky, Tom Nov 15, 2016

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata


52383-12940_2016_Article_195.pdf [ 690.87kB ]
JSON: 52383-1.0347313.json
JSON-LD: 52383-1.0347313-ld.json
RDF/XML (Pretty): 52383-1.0347313-rdf.xml
RDF/JSON: 52383-1.0347313-rdf.json
Turtle: 52383-1.0347313-turtle.txt
N-Triples: 52383-1.0347313-rdf-ntriples.txt
Original Record: 52383-1.0347313-source.json
Full Text

Full Text

METHODOLOGY Open AccessA proposed case-control framework toprobabilistically classify individual deathsas expected or excess during extreme hotweather eventsSarah B. Henderson1,2*, Jillian S. Gauld1, Stephen A. Rauch1, Kathleen E. McLean1, Nikolas Krstic1,David M. Hondula3 and Tom Kosatsky1AbstractBackground: Most excess deaths that occur during extreme hot weather events do not have natural heat recordedas an underlying or contributing cause. This study aims to identify the specific individuals who died because of hotweather using only secondary data. A novel approach was developed in which the expected number of deathswas repeatedly sampled from all deaths that occurred during a hot weather event, and compared with deathsduring a control period. The deaths were compared with respect to five factors known to be associated with hotweather mortality. Individuals were ranked by their presence in significant models over 100 trials of 10,000repetitions. Those with the highest rankings were identified as probable excess deaths. Sensitivity analyses wereperformed on a range of model combinations. These methods were applied to a 2009 hot weather event in greaterVancouver, Canada.Results: The excess deaths identified were sensitive to differences in model combinations, particularly betweenunivariate and multivariate approaches. One multivariate and one univariate combination were chosen as the bestmodels for further analyses. The individuals identified by multiple combinations suggest that marginalizedpopulations in greater Vancouver are at higher risk of death during hot weather.Conclusions: This study proposes novel methods for classifying specific deaths as expected or excess duringa hot weather event. Further work is needed to evaluate performance of the methods in simulation studiesand against clinically identified cases. If confirmed, these methods could be applied to a wide range of populationsand events of interest.Keywords: Extreme hot weather, Population mortality, Vulnerability, Case-control, Administrative data, Public healthBackgroundExtreme hot weather events have been associated withsharp increases in population mortality. The mostdramatic examples include Chicago in 1995 [1, 2],Western Europe in 2003 [3–7], and Moscow in 2010[8], but there are many examples of lesser impactsworldwide [9–14]. One hallmark of these events isthat very few of the deaths meet the criteria [15] forbeing certified with ambient heat as the underlyingcause [16–19], making it challenging to separate theexcess heat-related deaths from the expected deathsthat would have occurred regardless of temperature.A simple method for probabilistically identifying theseexcess deaths would be valuable for supporting epide-miologic analyses and improving public health out-reach during future hot weather.In the summer of 2009 the metropolitan area ofgreater Vancouver, Canada experienced an extremehot weather event that resulted in a 40% increase in* Correspondence: sarah.henderson@bccdc.ca1Environmental Health Services, British Columbia Centre for Disease Control,655 West 12th Avenue, Vancouver, BC V5Z 4R4, Canada2School of Population and Public Health, The University of British Columbia,2206 East Mall, 3rd Floor, Vancouver, BC V6T 1Z3, CanadaFull list of author information is available at the end of the article© The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, andreproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link tothe Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.Henderson et al. Environmental Health  (2016) 15:109 DOI 10.1186/s12940-016-0195-zmortality over a 7-day period. However, only one ofthe deaths during this period included an Inter-national Classification of Diseases (ICD) code for am-bient heat as the underlying cause. Rapid, case-onlyanalyses conducted with the administrative vital statis-tics data indicated that mortality during the hotweather event was shifted towards deaths in the com-munity, younger seniors (aged 65–74 years), denselypopulated areas, and more deprived areas [17]. Recentstudies conducted in other urban areas have also indi-cated that a lack of residential greenness contributesto an increased risk of mortality during extreme hotweather [19, 20].Here we leverage these known associations to prob-abilistically separate the excess deaths from the expecteddeaths using a novel resampling approach applied withina case-control framework. The reasoning is as follows:(1) some of the deaths that occur during an extreme hotweather event are expected, regardless of temperature;(2) those expected deaths are similar to deaths thatoccur during typical summer weather with respect tocharacteristics that describe the decedents, their resi-dential neighborhoods, and the circumstances of theirdeaths; (3) if the expected deaths alone were to be in-cluded in a case-control analysis with typical weatherdeaths, the effects of variables known to be associatedwith heat-related mortality should be null; (4) if theexpected number of deaths is repeatedly sampledfrom all deaths observed during the extreme hot wea-ther event, deaths that are consistently present innon-null subsets are more likely to have occurred be-cause of the heat.MethodsContextThe 2009 extreme hot weather event in greater Vancouverhas been described in detail elsewhere [17, 20]. In brief,Vancouver is a coastal city with a population of 2.8million residents and a mild climate. The average(standard deviation) high temperature on summer days(June through August) is 21 °C (3 °C) at VancouverInternational Airport (YVR). The hottest 7-day periodof 2009 occurred between July 27 and August 2, withhigh temperatures ranging from 26 °C to 34 °C and lowtemperatures ranging from 16 °C to 22 °C at YVR. Al-though these conditions may not seem extreme whencompared with those experienced elsewhere, they wereunprecedented in the history of the study area [20].There were 411 adult deaths from all causes during thisperiod, compared with an average (standard deviation)of 297 (22) during all other summer weeks of 2009–2012. Deaths started to increase above baseline on July26, and the increase was sustained until August 2 withno clear evidence of mortality displacement (Fig. 1).Here we define the extreme event as July 27 throughAugust 2 to be consistent with our previous case-onlyanalyses, and we do not consider the impacts of airpollution because we have already reported that theambient concentrations remained relatively low [17].Data sourcesAll summer deaths from 2009 to 2012 were extractedfrom the administrative vital statistics data received dailyby the British Columbia Centre for Disease Control(BCCDC) to support its public health surveillance andFig. 1 Daily time series of the greater Vancouver extreme hot weather event in the summer of 2009. The maximum temperatures measured atVancouver International Airport are shown on the right-hand axis, and the 411 deaths included in the pool of cases are shown as the darker barsHenderson et al. Environmental Health  (2016) 15:109 Page 2 of 10protection mandates. The database contains fields fordate of death, age, sex, location of death (includinghome, hospital, residential institution, or other), under-lying cause of death (ICD, 10th revision), and residential6-digit postal code. The latter was used to geolocateevery record and to assign variables for neighborhoodpopulation density, deprivation, and vegetative green-ness. Population density estimates for the year 2010 weretaken from the Gridded Population of the World(GWPv4) project, which models the global distributionof the human population at a spatial resolution of 30 arcseconds (~1km at the equator) using all available censusdata [21]. Deprivation quintiles were assigned from 2006values of the Vancouver Deprivation Index (VANDIX),which was developed with local public health practi-tioners to reflect the variables most relevant to health inthe city [22]. Neighborhood greenness was measured bythe normalized difference vegetation index (NDVI),which was taken from the summer of 2009 Landsat 7ETM+ composite for the region. The NDVI can takevalues from −1.0 to 1.0, with surfaces such as rock hav-ing values <0.1 and healthy forests having values >0.6[23]. Previous work has shown that lower NDVI valuesare associated with increased land surface temperatureand the urban heat island effect [24].Statistical approachAll 411 adult deaths that occurred from 27 July through2 August 2009 were included in the pool of cases (Fig. 1).All 11,632 adult deaths that occurred during thesummers (June, July and August) of 2010–2012 were in-cluded in the pool of controls. This control period waschosen (1) to ensure that all controls survived the 2009event and (2) because there were no heat health emer-gencies in the greater Vancouver during those summers.The regional heat health emergency warning system wasdeveloped in response to the excess mortality observedduring the 2009 event [20]. The case-control analyseswere conducted using five variables that have previouslybeen associated with mortality during extreme hot wea-ther events: age; location of death; population density;neighborhood deprivation quintile (VANDIX); andneighborhood greenness (NDVI). The continuous agevariable was categorized to <75 and ≥ 75 years becauseour previous work indicated that the lower age categorywas at higher risk during this event [17]. The NDVIvalues were normally distributed and used as a continu-ous variable in the analyses. The location of death was afactor variable with four categories, and death in hospitalwas used as the reference. Both VANDIX and populationdensity were ordered quintiles that we used as con-tinuous variables in the analyses. The populationdensity was converted to quintiles because it followedno clear distribution and its peak was at the very endof its right tail, reflecting the dense urban populationin some parts of the city.The basic methodological framework was a logisticregression model into which 297 deaths from the pool of411 cases and 1188 deaths from the pool of 11,632controls were randomly selected, creating a 4:1 ratio ofcontrols to cases. The expected number of 297 deathswas taken from the average mortality during typicalsummer weeks from 2009 to 2012. If the model was null,all of the cases were flagged as being selected into a nullmodel; if the model was significant, all of the cases wereflagged as being selected into a significant model. Amodel was defined as significant when any independentvariable had a p-value less than the specified alpha value,as described in the following section on the univariateand multivariate modelling combinations. This processwas repeated 10,000 times over 100 trials. We conducted100 trials to evaluate the consistency of the resultsbetween trials.Within each trial tallies were recorded reflecting (A)the number of times each case was randomly selectedinto a repetition, and (B) the number of times each caseappeared in a significant model. After all repetitionswere complete, the value for (B) was divided by thevalue for (A), resulting in a probability ratio of selectioninto significant models over all selections for each case.The cases were then ranked according to this value. Thetop 114 (411 total deaths – 297 expected deaths) wereclassified as the more probable excess deaths while thebottom 297 cases were classified as the more probableexpected deaths. Once the 100 trials were complete, thepercentage of trials in which a case was identified as amore probable excess death was calculated. The 114cases that had the highest values were identified as themost probable excess deaths. The standard deviation ofthe ranking for each case across trials was recorded toevaluate the consistency of the results. We chose 10,000repetitions to optimize the balance between the mean ofthese values and the computation time per trial.Modeling combinationsWithin this framework there are modeling choices thatcan impact the final results. We used a series of 12 com-binations to assess sensitivity to these choices, with vari-ations in the (1) type of model, (2) evaluation of variablesignificance, (3) importance of coefficient direction, and(4) alpha value (Table 1). The model type was either uni-variate or multivariate. In the univariate combinationseach of the five variables was evaluated separately over10,000 repetitions for the selected cases and controls;then the significance and selection tallies were summedacross variables before the probability ratio was calcu-lated and the case ranking was performed. In the multi-variate combinations all five variables were included in aHenderson et al. Environmental Health  (2016) 15:109 Page 3 of 10single model over 10,000 repetitions. Within the multi-variate models, the probability ratio was calculated usingeither (1) the tally of repetitions with at least one signifi-cant variable, or (2) the tally of the count of the numberof significant variables in each model repetition. The im-portance of coefficient direction was also tested for boththe univariate and multivariate combinations. The firstvariation counted any significant variable, regardless ofits direction. The second variation only tallied signifi-cance if the coefficient was in the expected directionbased on previous work, as follows: positive for theyounger age category compared with the older agecategory; negative for increasing NDVI; positive for anylocation of death compared with death in hospital;positive for increasing deprivation; and positive forincreasing population density. Finally, alpha values of0.10 and 0.05 were tested for the significance threshold.Assessment of the resultsOutcomes of the analyses were compared between the12 combinations. The standard deviation of rank wascalculated for each case across the 100 trials, and vari-ability in the standard deviations for the 411 cases wasused as an indicator of consistency across trials. Themost consistent univariate and multivariate combina-tions were selected for more detailed description of theanalytic results. We expected the probable excess deathsto be significantly different from the controls with re-spect to each of the five variables used in the analyses.Conversely, we expected the probable expected deathsnot to be significantly different from the controls withrespect to each variable. As such, statistical assessmentof these assumptions was tested for each combination. Atwo sample t-test was used for the NDVI variable, chi-square tests were used for the categorical age and locationof death variables, and the Wilcoxon Mann-Whitney testwas used for the ordinal deprivation and populationdensity variables. 7We also expected that the probableexcess deaths would occur more frequently on the hot-ter days of the study period (Fig. 1), which we assessedfor the most consistent combinations. Finally, the over-lap in probable excess deaths was compared betweenall combinations to evaluate consistency associatedwith the modeling decisions.ResultsComparison of combinationsThere were differences between the average standard devi-ation of ranks across trials for all 12 combinations (Table 1),and most pair-wise comparisons between combinationswere significantly different (not shown). Univariate com-bination #2 had the lowest variability in the univariategroup (mean standard deviation = 36.6 positions), whilemultivariate combination #1 had the lowest variability inthe multivariate group (mean standard deviation = 50.6 po-sitions). These were chosen as the most promising combi-nations for further analyses, as described in the followingsection. Univariate combination #2 counted a repetition assignificant at the 0.10 level without considering the direc-tion of the effect. Multivariate combination #1 counted thenumber of variables that were significant at the 0.10 levelwhen the effect was in the expected direction (Table 1).Table 1 Summary of variants for each combination and standard deviations (SD) for case ranks between trialsCombination Significance tally Coefficient in expecteddirectionAlpha Mean ofrank SDSD of rank SD Median ofrank SDInterquartile rangeof rank SDUnivariate Combinations#1 Summed across variables T 0.10 36.94 12.09 40.37 17.50#2 Summed across variables F 0.10 36.61 11.67 40.10 15.81#3 Summed across variables T 0.05 41.12 12.68 44.44 17.72#4 Summed across variables F 0.05 41.05 12.71 44.40 18.18Multivariate Combinations#1 Count of significant variables T 0.10 50.56 12.77 54.23 17.52#2 Count of significant variables F 0.10 50.56 12.92 54.43 17.76#3 Count of significant variables T 0.05 60.07 14.76 64.25 18.56#4 Count of significant variables F 0.05 60.31 14.81 63.89 22.29#5 At least one significant variable T 0.10 89.74 13.62 93.17 15.21#6 At least one significant variable F 0.10 89.83 13.44 94.09 14.82#7 At least one significant variable T 0.05 66.83 18.16 74.09 27.79#8 At least one significant variable F 0.05 67.07 17.95 74.75 23.72Bold highlighting indicates the univariate and multivariate combinations selected as the most promising combinations based on minimized variability in theranking resultsHenderson et al. Environmental Health  (2016) 15:109 Page 4 of 10The statistical differences between the probable excessdeaths and the pool of controls were assessed for eachof the five variables (Fig. 2). The probable excess deathsidentified by the univariate combinations were signifi-cantly different from the pool of controls with respect toall variables. Those identified by the multivariate combi-nations #1–4 (in which the number of significant vari-ables was tallied) were also significantly different fromthe controls for all variables, with the exception ofgreenness for combinations #3–4. Those identified bymultivariate combinations #5–8 (in which the presenceof at least one significant variable was tallied) did notdiffer from controls with respect to greenness or popula-tion density. Further, those identified by combination #7(in which significance was tallied at the 0.05 level) didnot differ from controls with respect to deprivation.Statistical differences between the probable expecteddeaths and the pool of controls were also assessed foreach of the five variables (Fig. 2), under the assumptionthat the differences would be small. The probable ex-pected deaths identified by the univariate combinationswere significantly different from the pool of controlswith respect to all variables except location of death,though many of the p-values were in the 0.01 to 0.05range. Those identified by the multivariate combinations#1–4 (in which the number of significant variables wastallied) were not different from the controls for thegreenness and age category variables, and combinations#3–4 were not different for the location of death vari-able. Finally, those identified by multivariate combina-tions #5–8 (in which the presence of at least onesignificant variable was tallied) were only different fromthe controls for the location of death variable. Overall, itappears that differences were driven by one or two vari-ables for each group of combinations.The percentage of overlap in the probable excessdeaths identified between each pair of combinations wasevaluated (Fig. 3). Univariate combinations #1–4 showedstrong internal consistency with overlap of >90% be-tween all pairs. Multivariate combinations #1–4 showedsimilar internal consistency (mean overlap 81.9%), andso did multivariate combinations #5–8 (mean overlap93.3%). The univariate combinations were more consist-ent with multivariate combinations #1–4 (mean overlap58.0%), than with multivariate combinations #5–8 (meanoverlap 45.6%). Multivariate combinations #3–4 wereFig. 2 Heatmap comparing probable excess deaths, probable expected deaths, and controls for the five variables. Statistical analysis was performedcomparing probable excess deaths (top) and probable expected deaths (bottom) from each combination with the pool of controls across the fivevariables. A t-test was used for greenness (NDVI), the Wilcoxon rank-sum test was used for the deprivation and population density variables,and chi-squared tests were used for the location of death and age category variables. Tests were repeated for all combinations. Color representsthe significance levelHenderson et al. Environmental Health  (2016) 15:109 Page 5 of 10also consistent with multivariate combinations #5–8(mean overlap 62.9%). Overall, there were 30 decedents(26.3% of the total 114) who were identified as probableexcess deaths by all 12 of the modelling combinations.Further analyses on most promising combinationsUnivariate combination #2 and multivariate combination#1 were selected for further analyses because they hadthe lowest variability in the standard deviations of rank(Table 1). Deaths from the pool of 411 cases were ran-domly selected into approximately 72% of the total fivemillion repetitions (10,000 repetitions × 5 univariatemodels × 100 trials) for univariate combination #2, andone million repetitions (10,000 repetitions × 1 multivari-ate model × 100 trials) for multivariate combination #1.Deaths from the pool of 11,632 controls were selectedan average of 10.2% of the time. In the univariate com-bination the mean ranks across 100 trials ranged from3.9 to 410.3 (out of a possible 1.0 to 411.0) for the poolof cases, compared with 20.39 to 396.70 for the multi-variate combination. The standard deviations of theranks ranged from 1.1 to 54.3 positions for the univari-ate combination, and from 13.5 to 75.3 positions for themultivariate combination. The greatest consistency inrank was found in the highest and lowest ranked cases.Distributions of the five variables were comparedbetween the 114 probable excess deaths from the twoselected approaches (Table 2). The deaths identified inthe univariate approach were significantly younger, had asignificantly lower residential greenness, and were sig-nificantly more deprived than those identified in themultivariate approach. No significant difference wasfound between approaches in the location of death orthe population density. We also summarized the under-lying causes of death for the probable excess deaths,according to categories of the ICD-10 (Table 2). Thedeaths identified by the univariate combination wereshifted away from cardiovascular and respiratory causescompared with the multivariate combination, but shiftedtoward deaths from cancer and external causes startingwith X in the ICD-10. The 17 probable excess deaths at-tributed to external causes in the univariate group in-cluded one highly-publicized death due to excessivenatural heat and 16 deaths due to accidental poisoningsby pharmaceutical or illicit agents and intentional self-harm. All but one of these deaths occurred at home or inthe community.There were 72 probable excess deaths that were identi-fied by both the univariate and multivariate combina-tions. Overall, 79% died at less than 75 years of ageFig. 3 Percentage of overlap of identified probable excess deaths between combinations. Probable excess deaths were compared between eachpair of combinations, and the percentage of overlap of individuals identified was plottedHenderson et al. Environmental Health  (2016) 15:109 Page 6 of 10Table 2 Summary statistics comparing the univariate #2 and multivariate #1 approaches, overlap cases, and the pool of controlsVariable Description Uni. #2 probableexcess deathsMulti. #1 probableexcess deathsp-value between uni. #2and multi. #1 resultsOverlap between uni.#2 and multi. #1Overlap betweenall combinationsPool ofcontrolsN Total number 114 114 72 30 11632Age (%) Age at death (%)< 75 years 83.3 52.6 <0.001 79.2 96.7 37.4> = 75 years 16.7 47.4 20.8 3.3 62.6Location of death (%) Location of death (%)In hospital 37.7 37.7 0.288 34.7 0 51.9Residential institution 17.5 27.2 19.4 0 30.9Home 36.0 28.9 37.5 80.0 14.5Other 8.8 6.1 8.3 20.0 2.7Neighborhood deprivation quintile (%) Neighborhood deprivation quintile (%)1 (least deprived) 6.1 0 0.049 0 0 20.72 4.4 0 0 0 20.73 20.2 14.0 15.3 20.0 19.24 21.9 33.3 26.4 13.3 19.15 (most deprived) 47.4 52.6 58.3 66.7 20.2Population density quintile (%) Population density quintile (%)0–181 persons/km2 1.7 0 0.829 0 0 20.1182–526 persons/km2 8.8 1.8 2.8 3.3 19.5527–681 persons/km2 10.5 18.4 8.3 16.7 19.9682–2356 persons/km2 28.9 30.7 33.3 20.0 19.82357–2573 persons/km2 50.0 49.1 55.6 60.0 20.7Greenness Mean NDVI measurement 0.252 0.290 0.008 0.256 0.254 0.329Selected underlyingcauses of death (%)Underlying cause of death (%)Cardiovascular (ICD-10 starting with I) 17.5 28.1 NA 22.2 36.7 27.1Respiratory (ICD-10 starting with J) 6.1 11.4 8.3 6.7 10.0Cancer (ICD-10 starting with C) 36.0 25.4 30.6 10.0 30.7External (ICD-10 starting with X) 14.9 8.8 13.9 26.7 3.1Identified cases from the univariate and multivariate combinations were compared with respect to the five variables used in the analyses, as well as with respect to selected underlying causes of death. The overlapcases identified in both combinations (N = 72) and by all combinations (N = 30) were compared with the pool of controlsHendersonetal.EnvironmentalHealth (2016) 15:109 Page7of10(compared with 37% among controls), 46% died in thecommunity (17% among controls), 58% lived in the mostdeprived neighborhoods (20% among controls), 56%lived in the most densely populated neighborhoods (20%among controls), and there was lower vegetative green-ness surrounding their residences. These differenceswere more pronounced when the probable excessdeaths were restricted to the 30 identified by all 12combinations (Table 2).Finally, daily deaths during the extreme hot weatherevent ranged from 46 to 76, and all days were elevatedabove the summer 2009 mean of 42 (Fig. 1). Given thatheat was assumed to be the underlying driver of in-creased mortality, we expected the probable excessdeaths to be shifted towards the hotter days in theperiod. We assessed this by comparing the proportion ofdeaths on each day for: (1) all 411 deaths; (2) the 114probable excess deaths identified by univariate combin-ation #2 and by multivariate combination #1; (3) the 72deaths identified by both combinations; and (4) the 30deaths identified by all 12 combinations. The probableexcess deaths were shifted towards the July 29–31 dates,and the shift was more pronounced among the 72and 30 deaths that were identified by multiple model-ling combinations (Table 3). Even so, all approacheshad some probable excess deaths occurring on everyday of the event.DiscussionRapid epidemiologic assessment of mortality duringextreme hot weather events is challenging because theexcess deaths are almost always indistinguishable fromthe expected deaths in administrative vital statisticsdatabases. Most excess deaths do not meet the criteriafor certification due to excess heat [15], and are there-fore attributed to the same ICD codes as the expecteddeaths. However, statistical tools that facilitate rapidepidemiologic assessment with these readily-availabledata could help generate timely evidence to protectvulnerable populations and to improve public healthoutreach during future events. The BCCDC has devel-oped the proposed methods to identify the most prob-able excess deaths during a 2009 hot weather event ingreater Vancouver, Canada. The approach is predicatedon the assumption that the expected deaths during a hotweather event are similar to deaths that occur duringnormal summer temperatures when compared withrespect to variables known to be associated with hotweather mortality. The framework for the approach is acase-control analysis into which hot weather cases andtypical weather controls were randomly sampled fromlarger pools. Over thousands of repetitions and 100trials, probable excess deaths were identified as cases thatwere most consistently found in non-null models.Several sensitivity analyses were conducted to explorethe behavior of the framework across variations in the in-put parameters including model type, significance weight-ing, coefficient directionality, and significance level. Theexcess deaths identified and the standard deviations oftheir ranking across 100 trials were most sensitive to themodel type, where univariate and multivariable combina-tions were tested. In comparison, the direction of the sig-nificant coefficients had little impact on the stability of theresults, suggesting that the effects were in the expecteddirection in the majority of repetitions. Similarly, therewas little difference between the 0.10 and 0.05 significancelevels, but the combinations with the 0.10 level showedmore internal consistency.All of the univariate combinations were considerablymore stable than the multivariate combinations whenassessed by the variability of the rankings. However,neither approach was completely consistent with our apriori assumption that the probable expected deathswould be similar to the pool of controls while theTable 3 Summary of deaths by day of the 2009 hot weather eventDate, 2009 Daily maximum temperatureat Vancouver InternationalAirport (°C)Total deaths(%)Uni. #2 probableexcess deaths (%)Multi. #1 probableexcess deaths (%)Overlap between uni.#2 and multi. #1 (%)Overlap between allcombinations (%)N = 411 N = 114 N = 114 N = 72 N = 30July 27 27.8 11.2 9.6 9.6 9.7 6.7July 28 30.9 12.7 7.9 10.5 8.3 10.0July 29 34.0 15.2 15.8 18.4 16.7 20.0July 30 34.4 18.4 18.4 18.4 22.2 23.3July 31 28.7 15.6 21.1 17.5 18.1 20.0August 1 26.7 11.2 11.4 11.4 9.7 10.0August 2 25.5 15.1 15.8 14.0 15.3 10.0Columns show how the total deaths during this period compare with the most probably excess deaths identified by univariate combination #2, multivariatecombination #1, the overlap between these combinations, and the overlap between all 12 combinations. The underlying assumption is that the most probableexcess deaths would occur on the hotter daysHenderson et al. Environmental Health  (2016) 15:109 Page 8 of 10probable excess deaths would not (Fig. 2). There wasalso a marked lack of stability in the results from themultivariate combinations that counted a repetition assignificant if any one of the five variables had a signifi-cant coefficient. This suggests that weighting the ranksfor the number of significant variables in each repetitionof the multivariable models produces more internal sta-bility. Given that the model type was the most importantchoice, it is possible that other modelling or machinelearning approaches would produce more robust results.Overall, the suite of analyses demonstrated that thespecification of the modeling framework affects the setof cases that will be identified as probable excess deathsand any subsequent conclusions drawn from themethods. Furthermore, we have used the ecologic time-varying deprivation, greenness, and population densityvariables as if their respective 2006, 2009, and 2010values were static across the region during the 2009–2012 period. As such, both cases and controls may havebeen misclassified, which could bias the results if theactual exposures were systematically different from theestimates. For example, the City of Vancouver hasplanted many new street trees in recent years, meaningthat the 2009 greenness estimates for the control popu-lation may be low when compared with the actual expo-sures. This could result in some cases with moremoderate greenness values not being included in thegroup of probable excess deaths.Another potential source of bias is the selection of thecase and control periods. We defined the extreme eventas July 27 through August 2 to be consistent with previ-ous work [17], but this period includes 2–3 days whenmortality was within the normal range of variability(Fig. 1). The results would have been considerably differ-ent if we had limited the pool of cases to the dates withsignificantly elevated mortality. Likewise, the resultsmight be different if we had restricted the pool of con-trols such that we were randomly sampling the sameproportion from the case group (297/411) and the con-trol group (1,188/1,644) for each repetition. Ultimately,however, it would be ideal if we could test and refine thisapproach in a context where other, more conventionalepidemiologic methods have been used to identify excessdeaths. For example, Price et al. identified 106 of the 304deaths as being heat-related via chart review after the2010 event in Montreal [25]. Alternatively, the methodscould be optimized using simulated data, in which theexcess and expected deaths have been clearly defined. Ei-ther way, a dataset that included known excess deathswould allow refinement of the methods proposed here.We have used deaths in the summers following the2009 event as the pool of controls to simulate a conven-tional case-control study. However, the proposed methodscould be most useful in the days and weeks following anextreme event associated with an obvious increase in mor-tality. In such cases the pool of controls would be drawnfrom a period before the event, as was done for our priorcase-only analyses with these data [17]. Indeed, the idealpublic health response could comprise the following steps:(1) rapid case-only analysis to ascertain variables moststrongly associated with death during the extreme event;(2) application of the proposed methods to identify themost probable excess deaths; (3) review of charts/deathcertificates to confirm the role of extreme hot weather;and (4) targeted case-control study using administrativedata or other methods, depending on the resources. How-ever, the proposed approach is unlikely to be informativefor short events with more subtle impacts, though it maybe even more informative as the magnitude and/orduration of the exposure increases and its public healthimpacts are even further elevated above baseline.The 30 cases identified by all 12 combinations wereclassified as the most probable excess deaths. These in-dividuals were significantly younger, more likely to die inthe community, and they lived in more densely popu-lated, more deprived, and less green neighborhoods.Combined with the increase in external causes of death,which included deaths such as poisoning and self-harm,these results suggest that a marginalized population maybe at highest risk in greater Vancouver. This is consistentwith recent findings from the 2010 event in Montreal,where mental health illness was an important risk factorfor death during the hot weather [25]. We cannot reliablyevaluate the role of specific comorbidities or pharmaceuti-cals without access to death certificates or linkage betweenthe vital statistics and other sources of administrativehealth data, but these results provide sufficient evidenceto help target public health interventions during futureepisodes of extreme hot weather in the region.ConclusionWe have proposed and applied a method to probabilis-tically separate excess deaths from expected deathsduring an extreme hot weather event. Many such eventsresult in excess mortality that is not easy to characterizefrom administrative data. Once validated on data withknown separation between excess and expected deaths,this method could be used to rapidly understand whichdeaths were most likely excess during any hot weatherevent and, therefore, to identify who is at most riskduring future events in the region. The method may alsobe applicable to other episodic environmental hazardsresulting in obvious excess morbidity or mortality. Al-though this method cannot replace conventional epide-miologic approaches, it may help to support publichealth protection when funding and/or access to moredetailed data (ie medical charts) are not rapidly availablefor more rigorous study.Henderson et al. Environmental Health  (2016) 15:109 Page 9 of 10AbbreviationsBC: British Columbia; BCCDC: British Columbia Centre for Disease Control;GWPv4: Gridded population of the world version 4; ICD: Internationalclassification of diseases; NDVI: Normalized difference vegetation index;VANDIX: Vancouver deprivation index; YVR: Vancouver International AirportAcknowledgementsThe authors would like to thank Ben Armstrong for reviewing the first draftof the manuscript and for providing suggestions about how to make thework stronger. This work was internally funded, though some salary supportfor JSG, SAR, KEM, and NK was provided by Health Canada. Finally, thanks toour reviewers whose insightful comments and suggestions helped us torefine the methods and the manuscript.FundingThis study was funded as part of core activities at the BC Centre for DiseaseControl (BCCDC).Availability of data and materialsThe authors cannot make the raw data available to other researchers.However, we can make the analytic code available, along with a mockdataset to demonstrate the nature and structure of the data used.Authors’ contributionSBH conceived of the study, designed and oversaw the coding and analyses,and led the draft manuscript. JSG led final development of the code, generatedall data visualizations, and helped to draft the manuscript. SAR led initialdevelopment of the code and contributed to the study design. KEM ledsecondary development of the code, code testing, and helped to draft themanuscript. NK assisted with final development of the code, conducted all finalanalyses, and helped with preparation of the manuscript. DMH helped toconceive of the study design and helped to draft the manuscript. TKoversaw the entire project and helped to draft that manuscript. All authorsread and approved the final manuscript.Competing interestsThe authors declare that they have no competing interests.Consent for publicationNot applicable.Ethics approval and consent to participateThe British Columbia Centre for Disease Control (BCCDC) receives dailyadministrative vital statistics data for the purposes of public health surveillanceand protection. Routine use of these data for such purposes is authorizedunder the provincial Public Health Act.Author details1Environmental Health Services, British Columbia Centre for Disease Control,655 West 12th Avenue, Vancouver, BC V5Z 4R4, Canada. 2School ofPopulation and Public Health, The University of British Columbia, 2206 EastMall, 3rd Floor, Vancouver, BC V6T 1Z3, Canada. 3Center for PolicyInformatics, School of Public Affairs, Arizona State University, Phoenix, AZ85004, USA.Received: 18 September 2015 Accepted: 10 November 2016References1. Whitman S, Good G, Donoghue E, Benbow N, Shou W, Mou S. Mortality inChicago attributed to the July 1995 heat wave. Am J Public Health. 1997;87(9):1515–8.2. Semenza JC, Rubin CH, Falter KH, Selanikio JD, Flanders WD, Howe HL,Wilhelm JL. Heat-Related Deaths during the July 1995 Heat Wave inChicago. New England J Med. 1996;335(2):84–90.3. Fouillet A, Rey G, Laurent F, Pavillon G, Bellec S, Guihenneuc-Jouyaux C,Clavel J, Jougla E, Hémon D. Excess mortality related to the August 2003heat wave in France. Int Arch Occup Environ Health. 2006;80(1):16–24.4. Vandentorren S, Suzan F, Medina S, Pascal M, Maulpoix A, Cohen J, LedransM. Mortality in 13 French cities during the August 2003 heat wave. Am JPublic Health. 2004;94(9):1518–20.5. Conti S, Meli P, Minelli G, Solimini R, Toccaceli V, Vichi M, Beltrano C, PeriniL. Epidemiologic study of mortality during the Summer 2003 heat wave inItaly. Environ Res. 2005;98(3):390–9.6. Grize L, Huss A, Thommen O, Schindler C, Braun-Fahrlander C. Heat wave2003 and mortality in Switzerland. Swiss Med Wkly. 2005;135(13-14):200–5.7. Johnson H, Kovats S, McGregor G, Stedman J, Gibbs M, Walton H. Theimpact of the 2003 heat wave on daily mortality in England and Wales andthe use of rapid weekly mortality estimates. Euro Surveill. 2003;10(7):558.8. Revich BA. Heat wave, air quality and mortality in European Russia in summer2010: preliminary assessment. Yekologiya Cheloveka/Human Ecology. 2011;7:3–9.9. Bustinza R, Lebel G, Gosselin P, Bélanger D, Chebana F. Health impacts ofthe July 2010 heat wave in Québec, Canada. BMC Public Health. 2013;13(56):1–7.10. Lan L, Cui G, Yang C, Wang J, Sui C, Xu G, Zhou D, Cheng Y, Guo Y, Li T.Increased Mortality During the 2010 Heat Wave in Harbin, China. Ecohealth.2012;9(3):310–4.11. Sartor F, Snacken R, Demuth C, Walckiers D. Temperature, ambient ozonelevels, and mortality during summer 1994, in Belgium. Environ Res. 1995;70(2):105–13.12. Tan J, Zheng Y, Song G, Kalkstein L, Kalkstein A, Tang X. Heat wave impactson mortality in Shanghai, 1998 and 2003. Int J Biometeorol. 2007;51(3):193–200.13. Ostro BD, Roth LA, Green RS, Basu R. Estimating the mortality effect of theJuly 2006 California heat wave. Environ Res. 2009;109(5):614–9.14. Tong S, Ren C, Becker N. Excess deaths during the 2004 heatwave in Brisbane,Australia. Int J Biometeorol. 2010;54(4):393–400.15. Donoghue ER, Graham MA, Jentzen JM, Lifschultz BD, Luke JL, MirchandaniHG. Criteria for the diagnosis of heat-related deaths: National Association ofMedical Examiners: position paper. Am J Forensic Med Pathol. 1997;18(1):11–4.16. Kovats S. Heatwave of August 2003 in Europe: provisional estimates of theimpact on mortality. Euro Surveill. 2004;8(11):2409.17. Kosatsky T, Henderson SB, Pollock SL. Shifts in Mortality During a Hot WeatherEvent in Vancouver, British Columbia: Rapid Assessment With Case-OnlyAnalysis. Am J Public Health. 2012;102(12):2367–71.18. Mirchandani HG, McDonald G, Hood IC, Fonseca C. Heat-related deaths inPhiladelphia-1993. Am J Forensic Med Pathol. 1996;17(2):106–8.19. Shen T, Howe HL, Alo C, Moolenaar RL. Toward a Broader definition of heat-related death: comparison of mortality estimates from medical examiners’classification with those from total death differentials during the July 1995 heatwave in Chicago, Illinois. Am J Forensic Med Pathol. 1998;19(2):113–118.20. Henderson SB, Kosatsky T. A Data-driven Approach to Setting TriggerTemperatures for Heat Health Emergencies. Can J Public Health. 2012;103(3):227–30.21. Doxsey-Whitfield E, MacManus K, Adamo SB, Pistolesi L, Squires J, BorkovskaO, Baptista SR. Taking Advantage of the Improved Availability of CensusData: A First Look at the Gridded Population of the World, Version 4. Papersin Applied Geography. 2015;1(3):1–9.22. Bell N, Schuurman N, Oliver L, Hayes MV. Towards the construction of place-specific measures of deprivation: a case study from the Vancouver metropolitanarea. Canadian Geographer/Le Géographe canadien. 2007;51(4):444–61.23. Pettorelli N, Vik JO, Mysterud A, Gaillard JM, Tucker CJ, Stenseth NC.Using the satellite-derived NDVI to assess ecological responses toenvironmental change. Trends Ecol Evol. 2005;20(9):503–510.24. Yuan F, Bauer ME. Comparison of impervious surface area and normalizeddifference vegetation index as indicators of surface urban heat island effectsin Landsat imagery. Remote Sens Environ. 2007;106(3):375–86.25. Price K, Perron S, King N. Implementation of the montreal heat responseplan during the 2010 heat wave. Can J Public Health. 2013;104(2):e96–e100.Henderson et al. Environmental Health  (2016) 15:109 Page 10 of 10


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items