Where are we? : individual geographic location using two different sources of postal code information… McGrail, Kimberlyn, 1966-; Fung, Patrick Wong 1999

Centre for Health Services and Policy Research
THE UNIVERSITY OF BRITISH COLUMBIA

Where are we? Individual geographic location using two different sources of postal code information in BC

Kimberlyn McGrail and Patrick Wong Fung

HIDU 99:01, December, 1999
Health Information Development Unit Research Reports

Introduction

Any regional analysis of health services utilisation is only as accurate as the patient geographic location information on which it is based. In BC, such analyses have until recently relied on the postal code(s) recorded on the MSP Registration and Premium Billing file (R&PB). In the past several years, Ministry of Health analyses have used postal codes from the Client Registry (CR), but these data have not been available to the research community outside the Ministry.

The CR began as a simple copy of the R&PB. This means that when the CR was first established, the postal codes recorded in it were exactly the same as those recorded in the R&PB. As time proceeded, the two information sources were expected to diverge because of different procedures for updating the postal codes recorded in them. The same postal code is recorded in both the CR and the R&PB for new Medical Services Plan registrants, but address changes are handled differently. Any changes made to the R&PB are reported to the CR, but the CR also draws on address changes reported from hospital separations, from the PharmaNet system, and from a number of other sources. The R&PB is, however, the single most important source of change in address information, accounting for more than half of all changes made to the CR.

The reason for the difference in updating procedures is the administrative responsibility for each of the systems. Billing and collection of premiums is the main function that the R&PB data system supports, so it is contract address (billing address) rather than home address that is of concern in that system.
Home address is recorded at initial enrollment, and may be updated when there is a contract change (e.g. a change in employment), but otherwise, there is no motivation (or necessity) for the R&PB to keep up-to-date home address information. The CR, on the other hand, was designed to be a registry of all people who have had contact with the BC health care system, and was intended, in part, to create an accurate account of geographic location.

As a result, to the extent the CR and the R&PB are now different, the CR is assumed to be more accurate because of the multiple sources for address updating. In addition, the discrepancy between the two is commonly assumed to result in relatively large inconsistencies in regional analyses. Ultimately, the validity of research with a geographic (or socioeconomic) component using R&PB postal codes is questioned.

The purpose of this project was to compare, on an age-specific basis, the postal codes from these two sources as recorded in 1999. They have been compared at three levels: six-digit postal code, local health area (LHA) and Health Region (HR). The results should provide researchers with information on the reliability of geographic location based on the different sources and therefore about potential difficulties with regional analyses. The results will also provide an empirical basis for the decision about which source of postal codes ought to be used for research purposes.

Methods

The Ministry of Health provides the Centre for Health Services and Policy Research (CHSPR) with an annual ‘snapshot’ of the R&PB. This file is used to update the ‘Linkage Coordinating File’ (LCF), an amalgamation of yearly snapshots from 1986 through 1999 which forms the backbone of the BC Linked Health Data set (BCLHD). The LCF is the registry to which all administrative data sets are probabilistically linked (see: Chamberlayne R et al., (1998) Creating a Population-based Linked Health Database: A New Resource for Health Services Research.
Canadian Journal of Public Health July/Aug 89(4):270-73). A file of PHNs [1] and postal codes as recorded in the June 1999 ‘snapshot’ was created for this project.

_________________________
[1] All PHNs in the BCLHD are scrambled. For ease of reading, however, they are referred to as ‘PHNs’ rather than ‘scrambled PHNs’.

In October 1999, the Ministry of Health provided CHSPR with a file derived from the CR that contains PHN and (then) current postal code for all ‘active’ (i.e. non-retired) PHNs. [2] PHNs are only ‘retired’ when the individuals to whom they are assigned die. Thus, active implies neither residence in the province nor registration with MSP. We know, in fact, that the list provided by the CR will contain many people not in the LCF. For example, residents of Alberta or other provinces who fill a prescription in BC are assigned a PHN through the CR, based on the policy that everyone who receives any kind of health care service in BC be given one.

The first step was to merge the CR data with the extract of 1999 postal codes from the LCF. The second was to retain all PHNs that appeared in both, along with LCF postal code and CR postal code. Next, postal codes from both sources were converted, using the latest version of the BC Stats Translation Master File (TMF), to LHA and HR. Finally, age was calculated as of 30 June 1999, based on the birthdate recorded in the LCF. Individuals were assigned to age groups 0-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, or 80+.

There were 209,077 PHNs with postal codes that were not in the TMF, and thus could not be assigned a LHA or HR. This can occur when a postal code has been introduced subsequent to the yearly publication of the TMF. Affected postal codes were included in the analyses, since comparison at the 6-digit level was clearly possible. In all cases when 6-digit postal code agreed, LHA and HR were automatically set to agreement, minimising the effect of postal codes missing from the TMF.
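The merge-and-compare steps described above can be sketched as follows. This is a minimal illustration in Python, not the code used for the project: the function names, the dictionary layouts for the LCF, CR and TMF, and every postal code, LHA and HR value in the toy data are hypothetical. Counting unconvertible codes as non-agreement in the percentages is also an assumption.

```python
from collections import defaultdict
from datetime import date

AS_OF = date(1999, 6, 30)  # age reference date given in the Methods


def age_group(birthdate, as_of=AS_OF):
    """Return the report's age band for a birthdate."""
    had_birthday = (as_of.month, as_of.day) >= (birthdate.month, birthdate.day)
    age = as_of.year - birthdate.year - (0 if had_birthday else 1)
    if age < 20:
        return "0-19"
    if age >= 80:
        return "80+"
    low = age // 10 * 10
    return f"{low}-{low + 9}"


def compare_sources(lcf, cr, tmf):
    """Compare postal codes for PHNs present in both sources.

    lcf: {phn: (postal_code, birthdate)}   (hypothetical layout)
    cr:  {phn: postal_code}                (hypothetical layout)
    tmf: {postal_code: (lha, hr)}          (hypothetical layout)
    """
    rows = []
    for phn in sorted(lcf.keys() & cr.keys()):  # keep PHNs found in both
        lcf_pc, birthdate = lcf[phn]
        cr_pc = cr[phn]
        lcf_area, cr_area = tmf.get(lcf_pc), tmf.get(cr_pc)
        if lcf_pc == cr_pc:
            # Identical 6-digit codes imply agreement at LHA and HR,
            # even when the code is missing from the TMF.
            pc = lha = hr = True
        elif lcf_area is None or cr_area is None:
            # At least one code cannot be converted: LHA/HR unknown.
            pc, lha, hr = False, None, None
        else:
            pc = False
            lha = lcf_area[0] == cr_area[0]
            hr = lcf_area[1] == cr_area[1]
        rows.append({"phn": phn, "age_group": age_group(birthdate),
                     "postal_code": pc, "lha": lha, "hr": hr})
    return rows


def percent_agreement(rows, level):
    """Percent of rows agreeing at `level`, by age group.

    Unknown (None) values count as non-agreement here (an assumption).
    """
    agree, total = defaultdict(int), defaultdict(int)
    for row in rows:
        total[row["age_group"]] += 1
        agree[row["age_group"]] += row[level] is True
    return {group: 100.0 * agree[group] / total[group] for group in total}


# Toy data: every identifier and code below is made up.
tmf = {"V1A1A1": (1, 10), "V1A2B2": (1, 10), "V9Z9Z9": (2, 20)}
lcf = {
    "p1": ("V1A1A1", date(1975, 1, 1)),  # unchanged address
    "p2": ("V1A1A1", date(1974, 5, 5)),  # moved within the same LHA
    "p3": ("V1A1A1", date(1930, 1, 1)),  # moved to another region
    "p4": ("V1A1A1", date(1990, 1, 1)),  # CR code absent from the TMF
    "p5": ("V1A1A1", date(1960, 1, 1)),  # appears in the LCF only
}
cr = {"p1": "V1A1A1", "p2": "V1A2B2", "p3": "V9Z9Z9",
      "p4": "V5N0X0", "p6": "V1A1A1"}

rows = compare_sources(lcf, cr, tmf)
for level in ("postal_code", "lha", "hr"):
    print(level, percent_agreement(rows, level))
```

In the toy data, the person who moved within one LHA disagrees at the 6-digit level but agrees at LHA and HR, which is the pattern the comparison at three levels is designed to expose.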
Missing LHA and HR values will still occur, however, when the two 6-digit codes differ and either the LCF or CR postal code cannot be converted; this may result in a slight underestimate of agreement at these levels.

The following information was produced:
1. A frequency of agreement (and disagreement) between the two six-digit postal codes, by age group.
2. A frequency of agreement (and disagreement) between the two LHA codes, by age group.
3. A frequency of agreement (and disagreement) between the two HR codes, by age group.
4. Items 1, 2 and 3 above, excluding blank and non-BC postal codes.

Results

The 1999 LCF contains 5,632,184 unique PHNs, and the CR contains 4,885,782 (Figure 1). There are 1,181,100 unique PHNs on the LCF that are not on the CR. This occurs both because the LCF has records of people registered as early as 1986, even if that registration is no longer valid (compared to the early 1990s for the CR), and because it includes people who have died. The CR, on the other hand, contains 434,698 unique PHNs not included in the LCF. This occurs because of out-of-province residents who receive health services in BC (emergency hospital use, filling of a prescription) and people who are BC residents and received services, but who are not registered with the province (either because of eligibility status or because of non-payment of premiums).

_______________________
[2] The time delay between the R&PB snapshot (created in June) and the CR extract (created in October) may explain a portion of any discrepancy in postal codes. The effect of this, however, is assumed to be small.

Figure 1

In LCF: 5,632,184        In CR: 4,885,782
  In both LCF and CR: 4,451,084
    Valid postal code: 4,335,074
      Translated by TMF: 4,125,997
      Not translated by TMF: 209,077
    Missing or non-BC postal code: 116,010

There are 4,451,084 unique PHNs that appear on both the LCF and the CR. There was no attempt made to see whether these are current registrants, but we know some must not be because that number is about 10% larger than the estimated BC population in 1999.
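The counts reported in the Results and in Figure 1 can be cross-checked with simple arithmetic. The short Python sketch below uses only numbers stated in this report; the variable names are illustrative, and the two derived shares are computed here and are not quoted from the report.

```python
# Counts as reported in the Results text and Figure 1.
in_lcf = 5_632_184
in_cr = 4_885_782
lcf_only = 1_181_100
cr_only = 434_698
in_both = 4_451_084
valid_postal_code = 4_335_074
missing_or_non_bc = 116_010
translated_by_tmf = 4_125_997
not_translated_by_tmf = 209_077

# Each branch of Figure 1 should sum to its parent.
checks = {
    "LCF total minus LCF-only equals overlap": in_lcf - lcf_only == in_both,
    "CR total minus CR-only equals overlap": in_cr - cr_only == in_both,
    "valid plus missing/non-BC equals overlap":
        valid_postal_code + missing_or_non_bc == in_both,
    "translated plus untranslated equals valid":
        translated_by_tmf + not_translated_by_tmf == valid_postal_code,
}
for label, ok in checks.items():
    print(f"{label}: {'ok' if ok else 'MISMATCH'}")

# Derived shares (computed here, not reported in the source).
share_excluded = missing_or_non_bc / in_both
share_untranslated = not_translated_by_tmf / valid_postal_code
print(f"excluded for blank or non-BC codes: {share_excluded:.1%}")
print(f"valid codes not in the TMF: {share_untranslated:.1%}")
```

All four identities hold, so the figure and the text are internally consistent.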
Of these 4,451,084, there were 116,010 that did not have a valid BC postal code on either the CR or the LCF (blank or beginning with something other than ‘V’). We calculated comparison percentages including and excluding these PHNs. The differences were less than 1% in all cases, so the numbers reported here exclude these, leaving 4,335,074 unique scrambled PHNs with postal codes to be compared.

As expected, there is an increase in agreement as the level of aggregation increases from 6-digit postal code to LHA to HR (Figure 2). There is also a trend with age, with the 20-39 age groups showing the lowest level of agreement. This is expected because of higher intra-provincial migration in these age groups. The level of agreement even in these age groups, however, is still relatively high: above 75% at the 6-digit postal code level, and 90% or higher for LHA and HR. At the highest level of aggregation, the Health Region, agreement in the other age groups ranges from about 95 to 97%.

Figure 2: Percent agreement between postal code sources, excluding blank and invalid codes, BC 1999
[Chart: % agreement (0% to 100%) by age group (0-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80+, TOTAL); series: 6-digit postal code, LHA, HR]

Discussion

The CR and R&PB have behaved as expected, and in 1999 contain different postal codes for about one in five registrants. Because the CR started as a copy of the R&PB, we can infer that if the same comparison were made between the 1994 CR and 1994 LCF (for example), there would be a higher level of agreement than in 1999. Nevertheless, these results indicate that even several years from the start of the CR, there is still a high degree of concordance between the two sources, and that agreement is increased further when the postal codes are aggregated to LHA and HR. The most likely explanation for this is that people often change residences within the same community.
In this case, if the address was updated on the CR and not the R&PB, the postal code would disagree, but the LHA and HR would be unaffected.

The preceding analysis does not answer all questions that might be of interest. For example, there are other geographic boundaries used for research and planning purposes, such as census tract or census sub-division. Such comparisons between the CR and LCF are possible, but the result would likely fall somewhere between the 6-digit postal code and LHA comparisons.

We also did not address the ecologic construction of individual or family socioeconomic status. Socioeconomic status might, for example, be calculated based on enumeration area (a very small geographic location where difference in six-digit postal code might have a large impact) or census tract (a larger area where differences in 6-digit postal code might matter less). Because of the small number of categories (either five for SES quintiles or 10 for deciles), we would expect some agreement based on chance alone. [3] Our best guess is that SES comparisons would probably have agreement in the same range as HR comparisons.

Even though the difference between the CR and LCF, especially at the LHA and HR levels, is relatively small, it still may be worth addressing. There is, for example, differential agreement among the age groups, probably because of patterns of mobility. This mobility may be of some interest, so capturing as much as possible of it (by using a full ‘string’ of postal codes from the CR) is potentially important. For example, if ‘high mobility’ people use health care services more frequently than the average similarly-aged BC resident, the impact on geographic analyses may be greater than expected based on the magnitude of disagreement.

Another consideration is the difference in trends or projections that might occur in Ministry of Health reports vs. reports produced by external researchers.
If these differences (whether they are in fact large, or are only perceived to be large) decrease the credibility of analyses using geographic location drawn from the LCF, then they might be of concern to the research community, or to the Ministry. There is something to be said for consistent results emerging from similar analyses of common databases. This will become even more important as the Ministry proceeds with building a Health Data Warehouse and increases the number of people accessing data that way.

For these reasons, it makes sense to consider providing historical postal code information from the CR to researchers accessing data through the BCLHD. These postal codes would be considered additional to those recorded in the LCF, and would provide researchers the opportunity to do ‘sensitivity analyses’ comparing the results of one source vs. the other. Since neither can truly be considered a ‘gold standard’, it is most likely that the ‘truth’ would be found in the range produced by such analyses.

Acknowledgments:

Thanks to Morris Barer and Clyde Hertzman for comments on an earlier draft. Funding and data for this work were provided by the Ministry of Health and Ministry Responsible for Seniors.

_______________________
[3] A smaller number of possible values implies a higher level of assignment of the ‘correct’ value purely due to luck.
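The point made in footnote 3 can be illustrated with a small calculation. The Python sketch below assumes, purely for illustration, that the two sources assign categories independently of one another with the same category sizes; real CR and LCF assignments are strongly correlated, so this gives only a floor for agreement by luck, not an estimate of actual SES agreement. The function name is hypothetical.

```python
from fractions import Fraction


def chance_agreement(category_sizes):
    """Probability that two independent assignments land in the same category.

    category_sizes: integer counts (or weights) per category. Under the
    independence assumption, the probability is the sum of squared shares.
    """
    total = sum(category_sizes)
    return sum(Fraction(size, total) ** 2 for size in category_sizes)


quintiles = chance_agreement([1] * 5)   # five equal-sized SES categories
deciles = chance_agreement([1] * 10)    # ten equal-sized SES categories

print(f"quintiles: {float(quintiles):.0%} agreement expected by chance")
print(f"deciles:   {float(deciles):.0%} agreement expected by chance")
```

With equal-sized categories the expected chance agreement is one divided by the number of categories, so quintiles carry twice the chance agreement of deciles.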


