Open Collections

UBC Theses and Dissertations

Screening intervals and the risk of carcinoma in situ of the cervix Phillips, Norman 1994-12-31

SCREENING INTERVALS AND THE RISK OF CARCINOMA IN SITU OF THE CERVIX

by

NORMAN PHILLIPS

B.A., McGill University, 1978
M.A., The University of British Columbia, 1984

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (Department of Statistics)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
September 1994
© Norman Phillips, 1994

In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
Vancouver, Canada

Abstract

This study examines the effect of length of interval between routine tests on the risk of carcinoma in situ (CIS) of the uterine cervix using cohort data from the B.C. population. CIS is a symptomless disease only detected by screening. Because of this, special methods are required for estimating incidence rates. Some case-control studies have used prevalence odds ratios to estimate the relative risk of disease, usually invasive cancer, from length of screening interval. But duration of disease is related to interval length and hence prevalence rates cannot be used to estimate relative risk. A multivariate model is fit to the incidence data using Poisson regression, and prevalence rates are fit with a logistic regression model. The results for prevalence odds ratios indicate a positive association between screening interval length and risk of disease whereas the results for relative risk indicate a negative relationship.
Theoretical screening models are considered to examine the consequences of a case-control paradigm in which controls are matched with cases on the basis of having had a screen near the date of diagnosis of the case, the matching period. As the matching period shortens, the distribution of interval lengths for controls converges to the underlying distribution, whereas the distribution of interval lengths for cases equals the distribution of lengths of intervals which span a point in time. The latter distribution favours longer intervals. The difference is not due to the sampling of controls but, rather, to the relation between interval length and duration of disease. A matched case-control study is simulated with the cohort data, and a conditional likelihood logistic regression model is fit. The results agree with those of a logistic regression analysis of prevalence rates indicating a positive relation between interval length and risk of disease. When the sampling of controls is weighted by interval length the odds ratios approximate the relative risk. A possible explanation of the surprising result that screening interval length is inversely related to risk of diagnosis of CIS is that more cases are cured with time by the natural regression of disease than by treatment of earlier stages of disease. On the other hand, incidence rate is negatively related to recency and frequency of prior negative screens, possibly because of the occurrence of false negative tests. However, the effect of regression predominates and the unavoidable conclusion is that less frequent screening decreases the risk of diagnosis of CIS.

Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgement
Introduction
Review of Literature
Epidemiological Methods
Screening Models
    Fixed Interval Model
    Uniformity Model
    Poisson Model
Data Analysis
Conclusion
Bibliography
Appendix 1
Appendix 2

List of Tables

Table 1. Corrected incidence rate of CIS or worse per 1000 women years
Table 2. Odds ratios for risk factors comparing CIN cases with controls
Table 3. Hypothetical example of sampling matched controls under fixed interval screening
Table 4. Odds ratio for annual vs bi-annual screeners
Table 5. The distribution of screen intervals of lengths 1-5 years which span December, 1979 by month of start of interval
Table 6. Theoretical odds ratio of interval length relative to > 120 months, cases vs. controls, by amount of inter-subject screening intensity variability in Poisson model
Table 7. Poisson regression of incidence rates on length of “last” interval, combined length of two preceding intervals, and abnormality in gap preceding the “last” interval
Table 8. Logistic regression of prevalence cases on length of “last” interval, combined length of two preceding intervals, and abnormality in gap preceding the “last” interval
Table 9. Model estimated incidence rates, prevalence odds and the ratio of prevalence odds to incidence rates by “last” interval length
Table 10. Conditional likelihood logistic regression of a simulated matched case control study
Table 11. Conditional likelihood logistic regression of a simulated matched case control study with sampling of controls weighted by length of “last” interval

List of Figures

Figure 1. Distribution of intervals spanning 1980 for All intervals and intervals ending before 1981, Unweighted and Weighted by interval length

Acknowledgement

I would like to thank Andy Coldman for all the valuable time and excellent supervision he has provided, Nhu Le for helping me out in a “crunch”, Brenda Morrison for her willingness to offer advice and assistance at all times, the B.C. Cancer Agency for supplying the data, and finally, my wife, Gloria, for taking on my share of the childcare, laundry, and dishes while I finished this project.

Chapter 1
Introduction

Cervical cancer, i.e. cancer of the uterine cervix, appears to develop in stages.
“There is a continuous spectrum of histological [cell] abnormalities of the squamous epithelium [type of cell] of the cervix [the opening of the uterus] from dysplasia through carcinoma in situ (CIS) to micro-invasive and invasive lesions [cancer]”. It is generally believed that every case of invasive cancer of the cervix originated in dysplasia and then progressed to carcinoma in situ before developing into cervical cancer. Not all cases of dysplasia develop into invasive cancer, and the time to cancer development in those which do progress is variable.

It is generally believed that the earlier cancer is detected and treated, the better the prognosis. Since the stages of dysplasia and CIS are non-symptomatic they are only discovered by accident or by screening. Screening is the practice of testing for the presence of disease in the absence of outward signs of disease. The rationale behind screening is to enable the treatment of the disease at a stage when prognosis is better and treatment may be less invasive. The Papanicolaou (PAP) test is a procedure for detecting cell abnormalities which are believed to be precursors of cervical cancer and is the screening method of choice for cervical cancer. A representative cell sample is obtained using an Ayre’s spatula from the transformation zone of the uterine cervix. The sample is placed on a slide to be examined by microscope. Test results rate the degree of cell abnormality on a scale from one to four, with “class” ones being normal. Abnormal results may be followed up with more PAP tests or the individual may be referred for further examination.

Prior to the introduction of colposcopy in the mid 1970s, treatment generally consisted of cone biopsy or hysterectomy. Both procedures are relatively serious and so treatment was not recommended unless there was evidence of persistent abnormality or the abnormality was sufficiently serious. Colposcopy [2] is a method of visually inspecting the cervix.
This allows the localization of the treatment with such methods as cryotherapy (freezing) and laser therapy. Since these methods are relatively non-aversive and have fewer complications, they are applied quite soon after signs of abnormality.

The effectiveness of the PAP test as a screen for cervical cancer is unknown since randomized trials have never been carried out. Screening was implemented before its effectiveness was known [3]. The indirect evidence is sufficiently compelling as to render future randomized controlled trials unethical. “The most persuasive evidence that screening for cervical cancer is effective comes from comparisons of cervical cancer in populations which introduced mass screening with different intensities and at different times” [3]. For example, in B.C., the incidence of clinically invasive cancer and associated morbidity have decreased by about 75% since the introduction of cervical screening [4].

However, inferences must be drawn from observational studies which have various potential sources of bias [5]. Parkins et al. [6] report risk factors such as SES (socioeconomic status) to be related to the probability of screening. However, adjusting for SES does not seem to materially affect the relation between screening patterns and risk of cervical neoplasia [7]. Knox [8] argues that case-control studies are invalid if the factors which predispose to disease also affect the likelihood of screening. The same criticism also applies to cohort studies. In spite of the inherent flaws of observational studies, the significant reduction in deaths from cervical cancer has been attributed to screening using the PAP test [9, 3].

One of the risks associated with screening is that individuals might undergo unnecessary treatment which in itself carries some risk.
There is some evidence that stages even as far along as carcinoma in situ (CIS) have significant rates of regression, i.e., return to normal spontaneously, especially among younger individuals [1]. Since it is considered unethical to withhold treatment from patients with cell abnormalities, it is impossible to estimate regression and progression rates directly [10], except in groups which decline treatment. So indirect methods have been employed to try to estimate regression rates.

One method which has been employed for this purpose compares the estimates of incidence (the rate of new cases in a time interval) and prevalence (the proportion of population who are cases at a given point in time) at different stages of disease on the assumption that if all cases of disease progress, then current prevalence cases at one stage would eventually become incidence cases at the next [1]. We should find that,

\[
\begin{aligned}
&\text{Prevalence of carcinoma in situ or worse at time } t_1 \\
&\quad + \text{Total incidence of carcinoma in situ or worse during the interval } t_1\text{-}t_2 \\
&= \text{Prevalence of carcinoma in situ or worse at time } t_2 \\
&\quad + \text{Total incidence of clinical cancer during the interval } t_1\text{-}t_2
\end{aligned}
\]

(from [1]). Typically such calculations indicate a lower than expected number of cases of more severe disease. Such shortfalls are attributed to regression. But the results could also be due to false negative test results, i.e. results which indicate absence of disease when, in fact, disease is present. Incidence rates of earlier stages of disease may be inflated and the prevalence rates of the same may be deflated as a result of false negative test results [1]. False negatives, along with inadequate screening, have been blamed for some of the fatalities observed [11].
They may arise from “inadequate cell collection, smear preparations or smear interpretation” [1]. Estimates of false negative rates vary from 6% to 55% [11].

Current research attempts to model the risk of disease, generally invasive cancer, as a function of screening patterns using data from observational studies and keeping in mind the potential impact of disease regression and false negative test results. Screen-detected diseases require special methods for estimating incidence rates because the time of onset is unknown. Although multivariate models are not typically employed with incidence rate data, such data are well suited to Poisson regression models. Case-control methods have been used to approximate risk of disease by prevalence odds ratios. But this is not appropriate when the exposure variable is related to duration of disease. Some case-control studies select controls on the basis of screening times, which would seem to introduce a dependency between exposure variable status and sampling probabilities. This issue is explored with theoretical screening models. The distributions of interval lengths for cases and controls are derived under the null hypothesis that there is no relation between interval length and risk of disease. Finally these issues will be illustrated with analyses using data from the B.C. screening program.

Chapter 2
Review of Literature

D.A. Boyes, B. Morrison, and colleagues [1] undertook a cohort study with data from the British Columbia Cervical Cytology Screening Program covering the years 1949-1969. The data consisted of two cohorts of individuals who were born between 1914 and 1918 or between 1929 and 1933 and who had been screened at least once prior to Jan. 1, 1970. The objective of the study was to provide estimates of prevalence and incidence rates of dysplasia or worse and carcinoma in situ or worse. Incidence rates of carcinoma in situ or worse were estimated from the number of cases developing while under surveillance relative to the total accumulated time at risk in the sample.
Prevalence rates were estimated from “abnormalities discovered at first contact”. An alternative estimate of prevalence is also given by the proportion of the population who are cases at any given point in time. These estimates took into account losses due to death and population variation due to immigration and emigration. The initial estimates of cumulative incidence of carcinoma in situ or worse were 17.9 per 1000 for ages 18 to 38 (Cohort 2), 19.3 per 1000 for ages 33 to 53 (Cohort 1) and 29.8 per 1000 for ages 18 to 53 (both cohorts).

Because this was an observational study, there were concerns about the possibility of sample biases. In the beginning years of the program, testing was done primarily as a support service for the diagnosis of symptomatic women. Thus the estimates of incidence and prevalence tend to be inflated. There is a tendency for higher SES females to be screened more readily although an effort was made to recruit women with low SES backgrounds. It is unclear what effect this bias might have on the estimates. Third, those who enter the program at older ages are different from those who enter at younger ages. A variety of factors may be involved in this selection bias.

Boyes et al. examined two potential sources of false negatives, namely errors in the lab and sampling (literally) errors. The rate of lab errors was determined to be about 8.5% for Cohort 1 and 15.7% for Cohort 2. These estimates were based on review of tests which were originally coded negative but where the patient subsequently developed the disease, and were interpolated to women who only presented once for examination. This translates into a dating error of about 26-38.6 months based on the average interval between the date of the actual previous negative and the original date of diagnosis.

Boyes et al.
refer to the remaining false negatives as “residual” false negatives and attempt to infer their rate from “the difference between apparent incidence rates derived from short intervals between examinations and the rates derived from long intervals between examinations”. They also considered the patient’s history of tests on the assumption that “after several smears have been taken any positive case is likely to be a genuinely new case since, although a case may have been missed at one examination, it is unlikely to have been missed at two or three successive examinations”. Incidence rates were calculated for various combinations of interval length and numbers of previous negative tests. The authors observed that short intervals with few preceding negative tests tend to have the highest rates. So they based their estimates of rates on those obtained from long intervals with many preceding negative tests.

    The rationale of the method is straightforward. First, cases discovered after a short interval from a previous smear are likely to be due to a classification error on the earlier smear. Secondly, after several smears have been taken any positive case is likely to be a genuinely new case since, although a case may have been missed at examination, it is unlikely to have been missed at 2 or 3 successive examinations.

Table 1 presents the corrected incidence rate estimates based on Table 18 in Boyes et al. Using these corrected incidence rates, prevalence estimates were revised to adjust for cases “who were not new ‘incidence’ cases but were missed ‘prevalence’ cases”. The discrepancy between observed and corrected incidence rates provided an estimate of the false negative rate for various age groups. These ranged from 4.1% in the 45-49 age group to 12.9% in the 25-29 age group. Finally, the data were retabulated, correcting for estimated false negatives and adjusting the date of onset for such cases by the amount observed for lab false negatives.

Next, Boyes et al.
compared prevalence and incidence estimates in an effort to estimate the rate of disease regression. They reasoned that

    if pre-clinical cancer progresses to clinical cancer, then at the end of any time period the accumulated incidence of pre-clinical cancer should be equal to the prevalence of pre-clinical cancer plus the accumulated incidence of clinical cancer. If all the possible sources of error ... have been taken into account, a ratio of prevalence of pre-clinical cancer plus accumulated incidence of clinical cancer to accumulated incidence of pre-clinical cancer substantially lower than unity must imply an excess incidence of pre-clinical cancer and indicate that regression is part of the natural history. [1]

Combining the two cohorts, Boyes et al. estimate this ratio to be 0.59-0.96 for carcinoma in situ or worse. Thus there is some evidence that regression occurs.

Table 1. Corrected incidence rate of CIS or worse per 1000 women years

Age range    Cohort    Incidence rate
25-29        2         0.98
30-34        2         1.04
35-39        2         0.36
40-44        1         0.46
45-49        1         0.60
50-54        1         0.32

The method of estimating incidence rates for varying numbers of previous negative tests and for varying intervals since the last negative tests has become a research paradigm. Several studies utilizing this paradigm in the investigation of incidence rates for invasive squamous-cell cancer in women who had at least one negative test have been assembled in Hakama et al. [12]. The followers of this paradigm frequently speak of “the protective effect” of an interval of length x following y negative tests. The studies include cohort and case-control designs, and include data from Scotland, Iceland, Denmark, Norway, Sweden, Switzerland, Italy, and Canada. The present study will concentrate on screen-detected cases of CIS rather than incidence cases of invasive cancer since there are few cases of invasive cancer in the data available.
However, analysis of CIS is of interest in its own right in that one of the postulated benefits of PAP test screening is the prevention of CIS with the attendant morbidity associated with its treatment.

As previously mentioned, the Pap test is scored on a scale from Class 1 to Class 4, with Class 1 representing no evidence of abnormality, i.e. a negative test result. However, there are discrepancies in the literature concerning the definition of “negative” tests. Boyes et al. [1] used a complicated algorithm to identify negative tests. The test result had to be either a Class 1 or a Class 2 which did not meet any of the following conditions: (a) one of three successive Class 2’s; (b) one of a pair of Class 2’s separated by an interval of ten months or more; or (c) followed directly by “a test of Class 3 or higher, or by a histological demonstration of dysplasia, carcinoma in situ, or invasive carcinoma”. The majority of studies collected in Hakama adapted the criteria used by Boyes et al. [1]: “A negative smear is either class I; class II followed by a class-I smear; or class II followed by a class-II smear within ten months, followed by a class-I smear”. Some of the studies reported in Hakama adopted a liberal definition of “negative” in that “a negative smear is recorded when neither the cytological nor the clinical examination leads to further cytological or gynaecological examination apart from succeeding screenings” [13].

Most cases of CIS are discovered as a result of follow-up tests to abnormal screen results. PAP tests taken in this context are called “diagnostic” as opposed to “screens”. The frequency of such diagnostic tests is a different issue from screening frequency. According to Berrino et al. [14],

    A usual way of coping with this problem has been to exclude diagnostic smears from the analysis (Clarke & Anderson [15]; La Vecchia et al. [7]) or to exclude all the positive smears from both the series of cases and of controls (MacGregor et al. [17]).

Berrino et al.
include all tests taken “before the onset of symptoms”. Boyes et al. collapsed any such series with intervals less than 10 months into a single diagnostic test. Thus the results of different studies must be considered in light of the operational definitions of basic constructs such as “negative” tests and screening intervals, as the outcomes may hinge on these.

One of the case-control studies included in the Hakama collection, conducted by La Vecchia et al. [7], included a group of 145 women with cervical intra-epithelial neoplasia (CIN), (a new classification category that combines dysplasia and CIS stages), who were recruited from women referred to a university gynaecology clinic or the National Cancer Institute of Milan “for routine cervical screening”. That is, they were detected by routine screening. The authors refer to a “diagnostic” test for the CIN cases. This presumably means an abnormal test result which leads to a positive diagnosis. Twenty three percent of the cases were classified “histologically” as “mild dysplasia” (CIN I), 26% as moderate dysplasia (CIN II), and 51% as severe dysplasia or carcinoma in situ (CIN III). The control group consisted of “women found to have normal cervical smears at the same screening clinics where CIN subjects had been identified. They were also matched for age by 5-year intervals”. It is not clear what definition of “normal” was being used, although one of the tables suggests that class 2 tests were considered “normal”. Since the authors were interested in the effectiveness of different patterns of routine screening, any tests obtained for the purposes of diagnosis “because of bleeding or other symptoms suggestive of cervical neoplasia” were excluded.
“Subjects were specifically asked whether they had been screened by Pap tests, the number of times they had been screened, and their age at the first, last, and any abnormal test.” The reference point for timing appears to be date of diagnosis for the cases and date of interview for the controls (i.e., the exact definitions are not clear to me). The odds ratios (and 95% confidence intervals) obtained for various risk factors comparing CIN cases and controls are presented in Table 2. (Evidently the index category is no previous screens.)

The results of logistic regression analyses controlling for “age, social class, number of visits to doctor or clinic, number of sexual partners, age at first intercourse, education, cigarette smoking, number of previous hospital admission, oral contraceptive use, and history of cervical ectopia” are comparable to the univariate results presented in the table. Controlling for the total number of tests resulted in an odds ratio of 0.37 (95% CI 0.13-1.03) for last tests ≤5 years versus >5 years, ensuring that the effect was not entirely due to the total number of tests.

Table 2. Odds ratios for risk factors comparing CIN cases with controls.

Risk factor (relative to no previous smear)                              Odds ratio   95% CI
1 previous smear                                                         0.27         0.10-0.71
≥2 previous smears                                                       0.12         0.06-0.25
<3 years since last smear*                                               0.09         0.04-0.20
≥3, <5 years since last smear                                            0.31         0.11-0.85
≥5 years since last smear                                                0.45         0.20-0.70

Risk factor (vs class I/II smears only)                                  Odds ratio   95% CI
No previous smear                                                        11.76        5.59-24.75
One or more abnormal smears within one year of diagnosis/interview***    ∞            -
One or more abnormal smears outside of a year prior to
diagnosis/interview                                                      7.18         2.83-18.19

*Excluding cases with a positive smear less than one year prior to diagnosis (not including “diagnostic smear”) results in an odds ratio of 0.07
***Relative to “normal smears only (class I/II)”

In summary,

    Screening on one occasion, irrespective of woman’s age and time since the smear was taken, reduced the risk of ...
CIN to about a quarter (RR=0.27). The degree of protection increased with increasing number of previous smears, and with decreasing interval since last smear, both trends in risk being highly significant. [And] women with previous positive smears remain at increased risk of cervical neoplasia.

The authors offer the following explanation for the findings with respect to CIN:

    The finding that screening reduces the risk of CIN may seem surprising, because strictly speaking Pap smears do not protect against the development of CIN. They are used to detect CIN, which if destroyed, may not develop into invasive cancer. The explanation for reduction in risk with increasing number of smears could be that women with a healthy cervix have more opportunity during their lifetime than do those with disease to accumulate multiple tests.

Berrino [16] offers an alternative explanation.

    Since CIN is a long-lasting disorder it is unlikely that women with newly diagnosed CIN would have been screened recently; if so, their CIN would have been detected then. Thus, previously screened women are bound to be under-represented among CIN cases detected in any given period. This bias may easily explain the observed association and its quite strong temporal trend without postulating any protective effect of screening.

Berrino seems to be referring to the effect of length of screening interval on prevalence under constant incidence. The groups defined by interval length are being compared on the odds of disease which reflects prevalence rather than incidence of disease. The odds ratios which La Vecchia et al. report are “prevalence” odds ratios as opposed to incidence rate ratios which are the usual indices of relative risk. In some situations prevalence odds ratios approximate relative risk, but not when the factor under consideration is interval length. This subject will be elaborated below.

It should also be observed that PAP tests could serve to prevent the development of CIN, at least some forms of CIN.
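
Berrino's argument about prevalence under constant incidence, discussed above, can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from any of the studies cited. It assumes a constant hazard of onset after a negative screen, no regression or progression, and a perfectly sensitive test. The rate of 1 per 1000 woman-years and the 1- and 5-year intervals are hypothetical values, and the function names are invented for this example. Under those assumptions the incidence rate is essentially the same for both interval lengths, yet the prevalence odds ratio for short versus long intervals falls to about one fifth.

```python
import math


def prevalence_at_screen(rate_per_year, interval_years):
    """Probability that disease is present at the screen that closes an interval.

    Assumes a constant hazard of onset, no regression or progression, and a
    perfectly sensitive test (simplifying assumptions for this sketch).
    """
    return 1.0 - math.exp(-rate_per_year * interval_years)


def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


# Hypothetical inputs: the same incidence applies to both groups.
RATE = 0.001            # 1 new case per 1000 woman-years
SHORT, LONG = 1.0, 5.0  # interval lengths in years

p_short = prevalence_at_screen(RATE, SHORT)
p_long = prevalence_at_screen(RATE, LONG)

# Prevalence odds ratio: compares the odds of being a case at the screen.
prevalence_odds_ratio = odds(p_short) / odds(p_long)

# Approximate incidence rate ratio: cases per woman-year of interval length.
incidence_rate_ratio = (p_short / SHORT) / (p_long / LONG)

print(f"prevalence odds ratio (1 vs 5 years): {prevalence_odds_ratio:.3f}")
print(f"incidence rate ratio  (1 vs 5 years): {incidence_rate_ratio:.3f}")
```

The odds ratio well below one arises purely because longer intervals leave more time for disease to accumulate before detection, which is the bias Berrino describes.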
The diagnostic category “CIN” covers a variety of stages of disease, and although all of them are without symptoms and thus can only be screen detected, protection against advanced stages such as carcinoma in situ can be achieved by detection of earlier stages through screening and subsequent treatment.

Another case-control study reported in the Hakama collection was carried out by MacGregor et al. [17] following the same paradigm of assessing the relative risk of disease as a function of time since last negative test and the number of previous negative tests. They recruited all cases of invasive squamous carcinoma of the cervix which appeared on the cancer registry in the Grampian region of Scotland between 1968 and 1982 who had attended for screening at least once. Eighty of the cases so obtained were identified as having been “detected by routine screening”. It should be emphasized that the cases were invasive cases and thus probably quite distinct from CIS cases with respect to properties like duration of disease. The following method was used to sample controls:

    Five random controls, matched for year of birth and with the additional constraint that each must have entered the study (at the time of her first negative smear) before the date of diagnosis of cancer in the patient [and have] been screened within six months either side of the date of the screening test at which the cancer was diagnosed. This was to ensure that both patients and controls were drawn from the same population of women - namely, those attending for routine (asymptomatic) screening. [17]

The results are reported in terms of “relative protection”, “the inverses of the relative risks”.
“The relative protection decreased progressively with increasing time since last negative smear” [17]. This study would seem to suffer from the same problem of interpretation as the preceding one, at least for the screen-detected cases, namely, since the “exposure” variable, “time since last negative test”, is related to the duration of disease, the prevalence odds ratio is not equal to the incidence odds.

One of the requirements of a valid case-control study is that selection of controls be independent of the exposure variable status [18]. The MacGregor et al. study uses different criteria to select cases and controls. The cases are required to have had one “diagnostic” screen within a 15 year period and at least one screen prior to that, whereas the controls for a given case are required to have had one screen prior to and another screen within a year of the case’s “diagnostic” screen. The issue is whether these procedures for selecting cases and controls have the same, if any, sampling biases with respect to screening frequency, which is one of the “exposure” variables under consideration. We will subsequently examine the implications of these methods of selecting cases and controls under three theoretical models.

Not all studies have reported the same inverse relationship between risk of disease (invasive or otherwise) and screening frequency. For example, van Oortmarssen & Habbema [19] explain the high incidence rate in the first two years following a negative test for the B.C. data as possibly being due to the testing of symptomatic individuals because the “screening program” started as a diagnostic support service.
The Manitoba study, also reported in Hakama [20], proposed different explanations for different age groups for “the lack of a trend towards increasing risk of developing invasive disease with increasing time since a negative smear”.

    The lack of a trend towards increasing risk of developing invasive disease with increasing time since a negative smear may be attributable to different reasons depending on the age during which the woman-years occurred. In the younger women (<35 years of age), much of the long-term follow-up, especially after only one negative smear, is in error because many women may have changed their names, and subsequent smears were recorded under new names. This would account for the low incidence after only one smear among women under 35 years of age.

    Among women under 40 years of age (after a five-year interval, 45 years of age) in particular, there is a relatively high rate of migration, and the calculation of woman-years at risk does not take this into account. Thus, the woman-years will be overestimated, and the incidence rates underestimated, by progressively greater amounts with increasing time interval; and there will be a significant effect after longer time intervals, when the woman-years at risk are already small.

    A high rate of hysterectomies was experienced in Manitoba among women aged between 40 and 50 years of age, in particular during the years 1969-1975. Hysterectomy would have the same effect as migration of underestimating incidence in a progressive manner with time.
Among screened women over 55 years of age, a large proportion of cases may be due to false negatives, since the incidence of in-situ cervical cancer in women at these ages is low. A smaller proportion of women of these ages has been screened, and the low sensitivity of the test for detecting invasive cases among women who were screened because of symptoms would have a considerable effect in masking a trend. [20]

Some screening programs are more organized than others in the sense that there is an effort to screen the entire population at regular intervals. Others tend to rely more on individual choice and thus are more prone to selection biases. Some of the Scandinavian countries appear to have implemented relatively more organized screening programs where an attempt is made to screen all individuals at risk at regular intervals. The results from the “organized” screening programs tend to indicate increasing incidence rates of invasive cancer with increasing intervals and the increase is more gradual with more previous negative tests. The “opportunistic” screening programs like B.C.’s or Manitoba’s do not show the same pattern. In fact the B.C. data suggests higher incidence rates of CIS are associated with shorter screening intervals [21], which is contrary to the hypothesis that frequent screening reduces the risk of developing carcinoma in situ.

The cohort studies reported in Hakama et al.
estimated incidence rates of invasive cancer by age, number of previous negative tests, and time since last negative test by dividing the number of cases observed in each category by the total number of woman-years at risk observed for each category [22]. Apparently no attempts have been made to fit multivariate models to incidence data. The present study will fit a Poisson regression model, also called a log-linear model, which is a particular case of generalized linear models [30]. The Poisson model is appropriate for modelling counts of independent events under a Poisson-like process with a constant rate. The canonical model, to be used here, assumes that the covariate effects are multiplicative with respect to the expected number of counts. The logarithm of the expected count in each covariate class is fit to a linear function of the covariates with an optional "offset" which is given a coefficient of one. If, for this example, the offset is taken to be the log of the time-at-risk accumulated in a covariate class, then the log-linear model effectively models the log of the estimated incidence rate as a linear function of the covariates. If $Y_j$ represents the number of incident cases in a covariate class defined by a covariate vector $x_j$, then the log-linear model can be written in the form [30]

$$\log\{E(Y_j)\} = \text{offset} + \beta' x_j.$$

The method of analysis used in the case-control studies consisted primarily of logistic regression with conditional likelihood functions. Logistic regression is also a special case of generalized linear models. If $\pi_j$ represents the probability of disease for the $j$th covariate class, then $\pi_j/(1-\pi_j)$ is the odds of disease. Logistic regression models the log of the odds of disease as a linear function of the covariates. Using the same notation for covariates as in the case of Poisson regression, this is expressed symbolically as

$$\log\left(\frac{\pi_j}{1-\pi_j}\right) = \beta' x_j.$$

The adaptation of the logistic model to case-control studies involves conditioning on an individual being sampled for the study.
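The role of the offset in the log-linear model above can be illustrated with a small sketch. It assumes a hypothetical data set of two covariate classes (exposed and unexposed) with invented case counts and woman-years, and fits $\log\{E(Y_j)\} = \log(T_j) + \beta_0 + \beta_1 x_j$ by Newton-Raphson using only the Python standard library. The function name and the numbers are illustrative assumptions, not taken from the B.C. data or from any analysis in this study.

```python
import math

def fit_poisson_two_param(cases, person_years, exposed, iters=25):
    """Newton-Raphson fit of log E(Y_j) = log(T_j) + b0 + b1 * x_j.

    cases, person_years and exposed are equal-length sequences; log(T_j)
    plays the role of the offset (coefficient fixed at one).
    """
    # Start from the pooled rate with no exposure effect.
    b0 = math.log(sum(cases) / sum(person_years))
    b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for y, t, x in zip(cases, person_years, exposed):
            mu = t * math.exp(b0 + b1 * x)   # expected count in this class
            g0 += y - mu                     # score for b0
            g1 += (y - mu) * x               # score for b1
            h00 += mu                        # information matrix entries
            h01 += mu * x
            h11 += mu * x * x
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (information) * step = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: 30 cases in 10,000 woman-years (exposed),
# 12 cases in 8,000 woman-years (unexposed).
b0, b1 = fit_poisson_two_param([30, 12], [10000.0, 8000.0], [1, 0])
baseline_rate = math.exp(b0)   # fitted incidence rate in the unexposed class
rate_ratio = math.exp(b1)      # fitted incidence rate ratio
```

Because this two-class model is saturated, the fitted baseline rate and rate ratio coincide with the crude values obtained by dividing cases by woman-years, which is the sense in which the offset turns a model for counts into a model for rates.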
Breslow and Day [26] demonstrate, using Bayes' theorem, that if sampling is independent of exposure status then the model coefficients for the exposure variables are the same as for the population as a whole. When controls are matched with a particular case, the method of conditional likelihoods can be used to eliminate nuisance parameters. This general principle of statistical inference is discussed, for example, in Cox and Hinkley, 1974 [23]. The unconditioned likelihood function may involve parameters, such as age, which are known to be important covariates, but which are not of interest in the given study, i.e., they are nuisance parameters. The method of conditional likelihood replaces the unconditional likelihood with a conditional likelihood which conditions on the nuisance parameters. Estimates of the parameters of interest are obtained from the maximum likelihood solution of the conditional likelihood equation. With one-to-many case-control matched designs, a likelihood function for each matched set is constructed which conditions on the fact that exactly one of the members of the set is a case. Following the presentation in Armitage [24], if the probability of disease of an individual in a matched set, $s$, is given by $\exp(\alpha_s + \beta x_i)$, $i = 0, 1, \ldots, c$, where $c$ is the number of controls in each matched set, then the probability that the observed case is diseased given that exactly one member of the set is diseased is given by

$$\frac{\exp(\alpha_s + \beta x_0)}{\sum_{i=0}^{c} \exp(\alpha_s + \beta x_i)}.$$

The term $\exp(\alpha_s)$ factors out of the numerator and denominator giving

$$\frac{\exp(\beta x_0)}{\sum_{i=0}^{c} \exp(\beta x_i)}.$$

The conditional likelihood is then the product of such terms from each matched set.

Chapter 3

Epidemiological Methods

The following is a review of the basic principles of sampling in case-control studies [18]. One first selects a case series. Often these comprise all cases reported in a given period. It is recommended, however, that the cases form an "etiologically" homogeneous group, i.e. they probably have a common causal history of disease development. Controls are selected from a pool of eligible controls.
The crucial criterion for eligibility is that the individual would have been included as a case had they developed the disease. Controls can be matched with cases or not. Rothman [18] demonstrates how the odds ratio obtained from a case-control study provides an estimate of relative risk as follows:

  The relevant data on disease incidence for a time period of length $t$ might be summarized as

  $$I_1 = \frac{a}{P_1 t}$$

  and

  $$I_0 = \frac{b}{P_0 t},$$

  where $I_1$ and $I_0$ are the incidence rates among exposed and unexposed, respectively, $a$ and $b$ are the respective numbers of individuals who developed disease during time interval $t$, and $P_1$ and $P_0$ are the respective population sizes. ... The cases in a case-control study are the individuals who became ill during the time period, that is a total of $(a + b)$ individuals. ... If a proportion, $k$, of the combined exposed and unexposed cohorts is taken as controls, and the number of such controls is $c$ for exposed and $d$ for unexposed, then the incidence rates among exposed and unexposed could be estimated as

  $$I_1 = k\,\frac{a}{ct},$$

  [Actually this is an approximation which assumes that the risk of disease is small. The correct estimate is $a/\{t(c/k + a)\}$.]

  and [continuing with Rothman's presentation]

  $$I_0 = k\,\frac{b}{dt},$$

  the relative incidence, or rate ratio (RR, often referred to as relative risk), is obtained as

  $$RR = \frac{I_1}{I_0} = \frac{ad}{bc}.$$

  Since the sampling fraction, $k$, is identical for both exposed and unexposed, it divides out, as does $t$. The resulting quantity, $ad/bc$, is the exposure odds ratio (ratio of exposure odds among cases to exposure odds among controls), often referred to simply as the odds ratio.
  This cancelation of the sampling fraction for controls in the odds ratio thus provides an unbiased estimate of the incidence rate ratio from case-control data [Sheehe [25]; Miettinen [26]]. The central condition for conducting valid case control studies is that controls be selected independently of exposure status to guarantee that the sampling fraction can be removed from the odds ratio calculation. [18]

Breslow and Day [26] give the same caution.

  One fundamental sampling requirement to which attention is drawn is that the sampling fractions for cases and controls must be the same regardless of exposure category. If exposed subjects are more or less likely to be included in the sample than are the unexposed, serious bias can result.

The ultimate aim of many epidemiological investigations is to estimate the relative risk of contracting a disease in a given time period for those exposed to some condition in comparison to those not exposed. Relative risk expresses the ratio of risk rates for two groups, i.e., relative risk is the ratio of the probabilities of developing disease for the exposed and unexposed groups. The odds ratio is the ratio of odds of contracting disease for those exposed relative to those not exposed, where the odds of contracting disease is defined as the probability of contracting disease divided by the probability of not contracting disease. For rare diseases, the probability of not contracting disease is close to unity, so the odds ratio reduces to the relative risk.

The probability of contracting disease in a given time period, $P(t)$, can be approximated by the cumulative incidence rate. This is demonstrated by Breslow and Day [27] who credit Elandt-Johnson [28] with the following expression for the instantaneous incidence rate, $\lambda(t)$,

$$\lambda(t) = \frac{1}{1 - P(t)}\,\frac{dP(t)}{dt},$$

and hence,

$$1 - P(t) = \exp\{-\Lambda(t)\}, \qquad (3.1)$$

where

$$\Lambda(t) = \int_0^t \lambda(u)\,du,$$

the cumulative incidence rate. Taking logarithms gives

$$\Lambda(t) = -\log\{1 - P(t)\} \approx P(t),$$

when the disease is rare or $t$ is small. Relative risk is defined as the ratio of incidence rates in exposed versus non-exposed individuals.
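The two approximations reviewed above can be checked numerically with a short sketch. All numbers below (population sizes, case counts, sampling fraction, incidence rate) are invented for illustration, and the function names are assumptions rather than anything defined in Rothman or in Breslow and Day.

```python
import math

def cohort_rate_ratio(a, b, p1, p0, t):
    """Rate ratio I1 / I0 with I1 = a / (P1 * t) and I0 = b / (P0 * t)."""
    return (a / (p1 * t)) / (b / (p0 * t))

def exposure_odds_ratio(a, b, c, d):
    """Exposure odds ratio ad / bc from case counts a, b and control counts c, d."""
    return (a * d) / (b * c)

# Hypothetical cohort: 20,000 exposed and 80,000 unexposed followed for 5 years.
p1, p0, t = 20000, 80000, 5.0
a, b = 60, 120                 # cases among exposed and unexposed
k = 0.01                       # same sampling fraction for both exposure groups
c, d = k * p1, k * p0          # expected numbers of exposed and unexposed controls

rr = cohort_rate_ratio(a, b, p1, p0, t)
odds_ratio = exposure_odds_ratio(a, b, c, d)

# Rare-disease approximation from (3.1): P(t) = 1 - exp(-Lambda(t)) is close to Lambda(t).
cumulative_incidence = 0.003 * 5.0          # constant rate of 3 per 1000 per year
risk = 1.0 - math.exp(-cumulative_incidence)
```

With a common sampling fraction the odds ratio reproduces the rate ratio, and for a cumulative incidence of 0.015 the risk differs from it by less than one percent. If `k` differed between exposed and unexposed controls, the first equality would fail, which is the sampling caution quoted above.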
Following the presentation in Breslow and Day, if $r = \lambda_1/\lambda_2$, the ratio of two incidence rates, which is the definition of relative risk adopted by Breslow and Day, is constant over a period of time $t$, then from (3.1) we have

$$P_1(t) = 1 - \exp\{-\Lambda_1(t)\} = 1 - \exp\left\{-\int_0^t \lambda_1(u)\,du\right\} = 1 - \exp\left\{-r\int_0^t \lambda_2(u)\,du\right\} = 1 - [\exp\{-\Lambda_2(t)\}]^r = 1 - \{1 - P_2(t)\}^r \approx r P_2(t),$$

providing the disease is rare or the time period is short. Breslow and Day observe that "In general, the ratio of disease risks is slightly less extreme, i.e., closer to unity, than is the ratio of the corresponding rates" [26].

The usual way of estimating incidence rates in a cohort study is to divide the number of cases occurring in a given time period by the number of "person-years" at-risk contributed by the cohort population. Person-years is the total of all time at-risk from all individuals. To estimate incidence rates for strata which are time dependent, i.e., an individual might belong to one stratum at one point in time and to another stratum at another time, the usual procedure is to assign the case to the stratum in which the disease "occurs", and to partition the individual's total time at risk among the strata according to time spent in each. The principle at work is stated by Breslow and Day as follows,

  The correct assignment of each increment in person-years of follow-up is to that same exposure category to which a death would be assigned should it occur at that time.

This procedure is difficult to apply for screen-detected diseases since the exact time of disease incidence is unknown. All that is known is that the disease occurred at some time between the start and the end of the screen interval. If the disease is rare, this fact will not have much impact on the denominator estimates since they are comprised mostly of intervals which do not result in diagnosis. The problem is in how to classify the cases, i.e., which numerators to increment, with respect to time dependent covariates, e.g.
age, date, etc.

One method which has been used is to date the onset at the midpoint between the date of diagnosis and the last negative screen prior to that. The time at risk is partitioned over time dependent categories as if the case were incident at the midpoint. However this procedure violates the principle that time at-risk should be applied to that category to which the case would have been assigned had it occurred at that time. If the case is assigned to the category containing the midpoint and if portions of time-at-risk are assigned to categories according to overlap, then portions of the interval which overlap categories other than the one containing the midpoint will be assigned to categories other than the one to which the case would have been assigned had disease occurred at that time. Whatever method one chooses to locate the time of disease (e.g. interval midpoint), the principle would seem to imply that the at-risk portion of the interval should be applied to the person-years of the same category as the case. How is the principle to be applied to control intervals? The time at risk should be applied to the same category that the case would have been assigned had disease occurred. This would seem to imply that the entire interval should be assigned to one category, namely the one to which the case would have been assigned had disease occurred during the interval. This is perhaps the alternative method employed in some analyses by Boyes et al., namely,

  The denominators are obtained by calculating the number of years at risk between pairs of successive smears and allocating this number to the age groups within which the mid-point occurs. These years of risk are then aggregated over all pairs of smears and all women to produce appropriate denominators, expressed as person-years of risk which can be used in the calculation of the event rates. [1]

This description is ambiguous in that reference is made to "the age groups", i.e. more than one age group, within which the mid-point occurs.
But assuming that the entire interval is assigned to the unique age group containing the midpoint, this will mean contributions being made to one category from times when individuals actually belong to another category. The categories will tend to be blended. But at least the method does not violate the principle of assigning time-at-risk to the category to which the case would have been assigned had disease occurred.

If instead of counting the number of incident cases in a given time period one were to count the number of individuals with disease at a given point in time and divide this by the number without disease, we would be estimating the prevalence odds of disease [18]. The prevalence odds is a function of the average duration of disease, since the number of individuals found with disease at a given point in time is equal to the number who contracted the disease in the past and still have it. Under certain circumstances the prevalence odds approximately equals the incidence rate multiplied by the average duration of disease [18]. Rothman gives the following presentation. First, the incidence rate equals the inverse of the average time until incidence. To see this, imagine following a finite population until everyone gets disease. At that point the incidence can be calculated by the number of people divided by the sum of their waiting times. Invert this and you get the average waiting time.
Assume the population is in a steady state, i.e., the number of people entering the disease pool equals the number of people leaving it, and let $N$ be the total number of people, $P$ the number with disease, $I$ the incidence rate, and $I'$ the incidence rate of exiting from the disease pool. Then for any time interval $\Delta t$ we have, by the steady state assumption, that

$$I\,\Delta t\,(N - P) = I'\,\Delta t\,P.$$

And hence, since the mean duration in a state equals the reciprocal of the incidence rate for exiting from that state, i.e., either going from a state of health to a state of disease or vice versa, if $\bar{D}$ represents the average duration of disease, then $\bar{D} = 1/I'$, and hence,

$$I\,\Delta t\,(N - P) = (1/\bar{D})\,\Delta t\,P,$$

$$\frac{P}{N - P} = I\bar{D}. \qquad (3.2)$$

That is, the prevalence odds equals the product of the incidence rate and the average duration of disease.

If the prevalence rate is small then $N - P \approx N$ and the prevalence odds is approximately equal to the prevalence rate, which in turn will be equal to the incidence rate (for a steady state population) when the average duration of disease is one unit of time. Prevalence odds ratios equal incidence rate ratios, i.e. relative risk, when the average duration of disease is equal in the two exposure groups under comparison. Or, alternatively, if the average duration of disease is estimable for the groups then the prevalence odds ratio could be adjusted accordingly to provide an estimate for the incidence ratios, i.e. relative risk.

That case-control studies of screen-detected diseases involve prevalence odds ratios rather than incidence odds ratios has been reported by Sasco et al. [29], but the implications in terms of the findings of such studies as MacGregor et al. and La Vecchia et al. have not been articulated except, perhaps, in Berrino's interpretation of La Vecchia's findings.

Chapter 4

Screening Models

This study addresses two issues raised by case-control studies involving screen-detected disease, namely the effects of proposed methods of sampling controls and the use of prevalence odds ratios to estimate relative risk.
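As background for the second of these issues, the steady-state relation (3.2) between prevalence odds, incidence rate and mean duration can be checked with a small deterministic sketch. The two-compartment model, the step size and the parameter values below are assumptions chosen for illustration; they are not part of Rothman's presentation.

```python
def steady_state_prevalence_odds(n, incidence, mean_duration, dt=0.01, steps=200_000):
    """Iterate a two-state (healthy/diseased) flow model to its steady state.

    Each step moves incidence * (n - p) * dt people into the disease pool and
    (1 / mean_duration) * p * dt people out of it, then returns P / (N - P).
    """
    exit_rate = 1.0 / mean_duration
    p = 0.0  # start with nobody diseased
    for _ in range(steps):
        p += dt * (incidence * (n - p) - exit_rate * p)
    return p / (n - p)

# Hypothetical values: incidence 2 per 1000 per year, mean duration 3 years.
odds_short = steady_state_prevalence_odds(100_000, 0.002, 3.0)
# Same incidence, but disease lasts twice as long before it is removed.
odds_long = steady_state_prevalence_odds(100_000, 0.002, 6.0)
```

The two groups have identical incidence, so their relative risk is one, yet the prevalence odds ratio `odds_long / odds_short` is two because the mean durations differ. This is the mechanism by which longer screening intervals, which allow disease to persist undetected for longer, can inflate a prevalence odds ratio.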
We examine these issues in the context of three screening models: a fixed interval model which is oversimplified but which serves to illustrate some of the issues involved; a uniformity model which is less restrictive than the fixed interval model; and a Poisson model. The primary objective is to see whether the proposed method of matching controls, based on having a screen within six months of the "date of diagnosis" of a case, affects the resulting odds ratio. Since the independent variable, interval length, may be related to the probability of being sampled as a control, there is a possibility of sampling bias. Interval length is also related to duration of disease, so by the theory developed around equation (3.2), the odds ratio, which is a prevalence odds ratio, may not approximate relative risk.

We will assume that the cases consist of all cases of disease diagnosed within a study period of length $L$. The controls will be assumed to be matched with the cases on the basis of having a screen within a matching period of length $l$ centred around the date of the screen which led to diagnosis for the matched case. In the study by MacGregor [17], $L$ was 15 years and $l$ was one year. The requirements for controls must be modified at the two ends of the study period, since a prerequisite for eligibility as a case or a control is that the screening interval ends within the study period. Controls must also be at-risk for disease, i.e. currently free of disease and without prior treatment that would preclude disease. To simplify matters we will assume that the tests are error-free and that the disease under consideration does not regress. The null hypothesis is that screening does not have any effect other than identifying the presence of disease.

Fixed Interval Model

Consider a population which consists of two groups in equal numbers, say 100,000 each, one group of individuals who are screened annually and another who are screened bi-annually (i.e., every two years), with 50,000 screened in each year.
Assume also that the incidence rates are constant, small, and equal in the two groups, say 1 per 1000 per year, and that any diagnosed individuals are immediately replaced by others from an external source. Suppose each case occurring within a 2 year period is selected and matched with a control drawn randomly from those tested in the same year the case is diagnosed. Assume diagnosis only occurs as a result of a screen and is immediate. The situation would be as depicted in Table 3.

The reason 100 diagnosed cases occur in the bi-annual screening groups on each testing occasion is that 50 were incident during the past year while another fifty were left over from the previous year since they were not screened and therefore not diagnosed at the time. Two hundred controls are sampled randomly from the 150,000 who were screened in a given year, resulting in a 2-1 ratio of annual to bi-annual screeners among the controls. If we now compute the odds ratio for screening frequency we have the setup shown in Table 4.

Table 3. Hypothetical example of sampling matched controls under fixed interval screening.

                     odd year                                      even year
screening pattern    screened  incidence  diagnosed  sampled       screened  incidence  diagnosed  sampled
                               cases      cases      controls                cases      cases      controls
annually             100,000   100        100        133.5         100,000   100        100        133.5
even years           0         50         0          0             50,000    50         100        66.5
odd years            50,000    50         100        66.5          0         50         0          0

Table 4. Odds ratio for annual vs bi-annual screeners.

            annual   bi-annual
cases       200      200
controls    267      133

Odds ratio = (200 × 133)/(200 × 267) = 133/267 ≈ 1/2

The incidence risk ratio is unity, but the case-control methodology employed results in an odds ratio of 0.5. The inference we would draw from the results of this study, under normal circumstances, would be that annual screening reduces the risk of disease.

Uniformity model

The preceding example assumed fixed screening intervals. This is probably not realistic since interval lengths vary within individuals.
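Before leaving the fixed interval example, the arithmetic behind Tables 3 and 4 can be reproduced with a short sketch. The function and argument names are illustrative assumptions; the inputs are the values stated in the example (two groups of 100,000, incidence of 1 per 1000 per year, one control per case drawn at random from those screened in the same year).

```python
def fixed_interval_odds_ratio(group_size=100_000, incidence=0.001, years=2):
    """Expected case/control counts and odds ratio, annual vs two-yearly screeners."""
    annual_screened = group_size          # the whole annual group is screened each year
    biennial_screened = group_size // 2   # half of the two-yearly group is screened each year

    # Diagnosed cases per year: disease accumulates for one interval before detection.
    annual_cases = annual_screened * incidence * 1
    biennial_cases = biennial_screened * incidence * 2

    # One control per case, drawn at random from everyone screened that year.
    controls_per_year = annual_cases + biennial_cases
    total_screened = annual_screened + biennial_screened
    annual_controls = controls_per_year * annual_screened / total_screened
    biennial_controls = controls_per_year * biennial_screened / total_screened

    table = {
        "cases": (years * annual_cases, years * biennial_cases),
        "controls": (years * annual_controls, years * biennial_controls),
    }
    a, b = table["cases"]
    c, d = table["controls"]
    return table, (a * d) / (b * c)

table, odds_ratio = fixed_interval_odds_ratio()
```

The expected control counts come out as 266.7 and 133.3 rather than the rounded 267 and 133 of Table 4, and the odds ratio is one half even though the incidence rates in the two groups are equal by construction.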
In the uniformity model we do not assume any connections between intervals within an individual; however, we do not rule them out either. We only assume that the distribution of screening intervals is stationary, i.e., the same at any point in time, and "uniform", i.e., the starting times for intervals of a given length are uniformly distributed over a period of time which encompasses all possible starting times of intervals that end in the study period. Thus in the uniformity model, the units of analysis are intervals with lengths and starting times which satisfy the stationarity and uniformity conditions. This model, like the fixed interval model, is not intended to be realistic but, rather, hypothetical for the purposes of illustration. We do not attempt to develop a realistic model of screening patterns.

Assuming a constant incidence rate of disease $I$, the probability that an individual develops disease during an interval of length $U = u$ is approximately $Iu$, for if $D$ is a random variable which takes the value 1 when disease occurs during an interval and 0 otherwise, then

$$P(D=1 \mid U=u) = 1 - \exp\{-\Lambda(u)\} \approx Iu. \qquad (4.1)$$

Assume that $U$ is a continuous random variable representing interval length with density $f_U$; then by Bayes' theorem and (4.1), the conditional density of $U$ given $D = 1$, $f_{U|D=1}$, is given by

$$f_{U|D=1}(u) = \frac{P(D=1 \mid U=u)\,f_U(u)}{\int P(D=1 \mid U=u)\,f_U(u)\,du} \approx \frac{Iu\,f_U(u)}{\int Iu\,f_U(u)\,du} = \frac{u\,f_U(u)}{E(U)}. \qquad (4.2)$$

This gives the distribution of interval lengths for cases. Hence, the odds for intervals of length $u$ conditional on intervals having length $u$ or $v$ and disease occurring is given by

$$\frac{f_{U|D=1}(u)}{f_{U|D=1}(v)} = \frac{u\,f_U(u)}{v\,f_U(v)}. \qquad (4.3)$$

Next we consider screening histories containing intervals that end within a matching period of length $l$ centred at the time of diagnosis of a case. The matching period is centred at the screening date which "led" to the diagnosis of a case. The case-control methodology which is under examination requires cases to be matched with controls who have a screen within specified limits of the "diagnostic" screen.
Since an individual may have more than one interval ending in the matching period, for mathematical convenience and to ensure uniqueness we will require control intervals to span the start of the matching period, i.e. begin before and end after the start of the matching period. The event of an interval spanning a point in time $t$ will be represented by a random variable $S$ which takes the value 1 if an interval spans $t$, and 0 otherwise.

Let the matching period for the selection of controls for a particular case begin at time $t$ and have length $l$. The probability that the endpoint of an interval which spans $t$ lands in the matching period follows from the uniformity assumption. Since the starting point of a spanning interval of length $u$ is uniformly distributed over $(t-u, t)$ and the sub-interval of starting points which land the endpoint in $(t, t+l)$ has length $l$ while the entire interval of possible starting points has length $u$, the probability of the endpoint landing in $(t, t+l)$ is $\min(1, l/u)$. The event of ending in the matching period is represented by the random variable $M$ which takes the value 1 when an interval ends in the matching period and 0 otherwise. Then we have

$$P(M=1 \mid U=u, S=1) = \min(1, l/u). \qquad (4.4)$$

The probability of an interval having length $u$ given that it spans $t$ ($S=1$), ends in the matching period ($M=1$), and doesn't lead to diagnosis ($D=0$) is given by

$$f_{U|M=1,S=1,D=0}(u) = \frac{P(M=1, S=1 \mid U=u, D=0)\,f_{U|D=0}(u)}{P(M=1, S=1 \mid D=0)} = \frac{P(M=1, S=1 \mid U=u)\,f_{U|D=0}(u)}{P(M=1, S=1 \mid D=0)} \qquad (4.5)$$

by Bayes' theorem and the independence of interval span and endpoint location from disease status. The conditional probability of spanning $t$ and ending in the matching period, given that the interval has length $u$, is

$$P(M=1, S=1 \mid U=u) = P(M=1 \mid S=1, U=u)\,P(S=1 \mid U=u). \qquad (4.6)$$

We already have an expression for the first factor, namely $\min(1, l/u)$.
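The first factor, equation (4.4), can be checked by simulation under the uniformity assumption. The sketch below is a Monte Carlo illustration only: the function name, the seed, and the values of $u$, $l$ and $t$ are arbitrary assumptions.

```python
import random

def prob_end_in_matching_period(u, l, t=10.0, n=200_000, seed=1):
    """Estimate P(M=1 | U=u, S=1) for an interval of length u spanning time t.

    The start of a spanning interval is drawn uniformly on (t - u, t); the
    interval counts as a hit when its endpoint falls in the matching period
    (t, t + l).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        start = rng.uniform(t - u, t)
        end = start + u
        if end <= t + l:
            hits += 1
    return hits / n

long_interval = prob_end_in_matching_period(u=2.0, l=0.5)    # theory: l/u = 0.25
short_interval = prob_end_in_matching_period(u=0.3, l=0.5)   # theory: 1
```

Intervals longer than the matching period are selected with probability roughly proportional to $1/u$, which is the inverse weighting by interval length that the control distribution inherits in the derivation that follows.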
For the probability that an interval of length $u$ spans a time $t$ we can assume that $t \in (0, L-l)$ if cases diagnosed within $l/2$ of the end of the study period are matched with controls with intervals ending within $l$ of the end of the study period. Also note that the interval endpoint can range from 0 to $L$, but not beyond, for otherwise if disease were detected at an endpoint beyond $L$ the individual would not qualify as a case in the study. By the assumption of uniform starting times for intervals of length $u$ over any specified period, it follows that the endpoints of intervals of length $u$ are uniformly distributed over $(0, L)$. Therefore, the probability of spanning a point $t$ in $(0, L-l)$ is given by the ratio of the length of the region containing favourable endpoints to the length of all endpoints ($L$). The length of favourable endpoints depends on the relative sizes of $u$ and $l$, for if $u > l$ then for $t \in (L-u, L-l)$ the range of endpoints for which the interval spans $t$ will be limited to $(t, L)$, which has length $L-t$. Otherwise the range of favourable endpoints will have length $u$. These results are summarized as follows,

$$P(S=1 \mid U=u) = \begin{cases} \dfrac{u}{L}, & 0 \le t < L-u,\ u > l \\[1ex] \dfrac{L-t}{L}, & L-u \le t \le L-l,\ u > l \\[1ex] \dfrac{u}{L}, & u \le l. \end{cases} \qquad (4.7)$$

Substituting $\min(1, l/u)$ for $P(M=1 \mid S=1, U=u)$ and (4.7) in (4.6) gives,

$$P(M=1, S=1 \mid U=u) = \begin{cases} \dfrac{l}{L}, & 0 \le t < L-u,\ l < u \\[1ex] \dfrac{l(L-t)}{uL}, & L-u \le t < L-l,\ l < u \\[1ex] \dfrac{u}{L}, & u \le l. \end{cases} \qquad (4.8)$$

Next we compute

$$f_{U|D=0}(u) = \frac{P(D=0 \mid U=u)\,f_U(u)}{\int P(D=0 \mid U=u)\,f_U(u)\,du} \approx f_U(u), \qquad (4.9)$$

since $P(D=0 \mid U=u) \approx 1 - Iu \approx 1$. This assumes that if $[a,b]$ is the support of $f_U$ then $P(D=0 \mid U=u) \approx 1 - Iu \approx 1$ is true for all $u \in [a,b]$. Substituting (4.8) and (4.9) in (4.5) gives,

$$f_{U|M=1,S=1,D=0}(u) \propto f_U(u) \times \begin{cases} \dfrac{l}{L}, & 0 \le t < L-u,\ l < u \\[1ex] \dfrac{l(L-t)}{uL}, & L-u \le t < L-l,\ l < u \\[1ex] \dfrac{u}{L}, & u \le l \end{cases}$$

which shows that the distribution of intervals which span $t$ and land in the matching period approaches the distribution of intervals starting from a point $t$ as $l \to 0$.
Now dividing by the same expression for $v$ in place of $u$, and assuming without loss of generality that $u < v$ gives,

$$\frac{f_{U|M=1,S=1,D=0}(u)}{f_{U|M=1,S=1,D=0}(v)} = \frac{f_U(u)}{f_U(v)} \times \begin{cases} 1, & 0 \le t < L-v,\ l < u \\[1ex] \dfrac{u}{l}, & 0 \le t < L-v,\ u \le l < v \\[1ex] \dfrac{v}{L-t}, & L-v \le t < L-u,\ l < u \\[1ex] \dfrac{uv}{l(L-t)}, & L-v \le t < L-l,\ u \le l < v \\[1ex] \dfrac{v}{u}, & L-u \le t < L-l,\ l < u \\[1ex] \dfrac{u}{v}, & v \le l. \end{cases} \qquad (4.10)$$

This expression gives the odds for intervals of length $u$ which meet the selection criteria for controls when attention is restricted to intervals of length $u$ or $v$, assuming the regularity conditions hold. Hence under these assumptions the ratio of the odds for cases (4.3) to the odds for matched controls (4.10) gives the (prevalence) odds ratio, $\Psi$,

$$\Psi = \frac{u}{v} \times \begin{cases} 1, & 0 \le t < L-v,\ l < u \\[1ex] \dfrac{l}{u}, & 0 \le t < L-v,\ u \le l < v \\[1ex] \dfrac{L-t}{v}, & L-v \le t < L-u,\ l < u \\[1ex] \dfrac{l(L-t)}{uv}, & L-v \le t < L-l,\ u \le l < v \\[1ex] \dfrac{u}{v}, & L-u \le t < L-l,\ l < u \\[1ex] \dfrac{v}{u}, & v \le l. \end{cases}$$

Thus under the given regularity assumptions, the prevalence odds ratio will equal the ratio of interval lengths, ignoring complications that arise within an interval length of the end of the study period, which may be a sizable period depending on the intervals under comparison. It might be advisable to stratify the analyses in comparing different interval lengths by the locations of the case diagnosis dates.

Ignoring complications arising near the end of the study period, the distribution of interval lengths given that they span a point in time $t$ is given by,

$$f_{U|S=1}(u) = \frac{P(S=1 \mid U=u)\,f_U(u)}{\int P(S=1 \mid U=u)\,f_U(u)\,du} \approx \frac{u\,f_U(u)}{E(U)},$$

since $P(S=1 \mid U=u) \approx u/L$. This is the same as the distribution derived earlier for case intervals (equation 4.2). But the controls are required to satisfy the additional constraint of landing in an interval of length $l$ which we have seen effectively weights the distribution of controls inversely by interval length.

Poisson Model

PAP tests may be considered rare events, and if screening intervals are independent of past intervals, then the screening process may be modelled by a Poisson process. We will assume the Poisson process is time homogeneous with parameter $\mu$. This implies that interval lengths have an exponential distribution with mean $1/\mu$.
Once the process has been going for a while the distribution of starting times of intervals relative to the beginning of the process will be distributed as the sum of independent exponentials. Hence the distribution of starting times within a given small interval will be approximately uniform. Thus the Poisson model approximately satisfies the assumptions of the uniformity model in the previous section. However, the Poisson model specifies the distribution of the interval lengths. Once again, the objective of this exercise is to derive the odds ratio which results from selecting controls to match cases on the basis of having a screen within a specified time of the date of diagnosis under the null hypothesis.

I. Distribution of Spanning Intervals

We will first calculate some useful distributions which will be needed for the calculation of case and control screen interval distributions. As before, let $t$ be the start of a matching period. We now define two random variables. Let $T_0$ be the time of the last test before the start of the matching period, and let $T_1$ be the time of the first test after the start of the matching period. Let $N(t)$ be the number of tests during time $t$. The probability that there are $n$ tests before $t$ is given by,

$$P(N(t)=n) = \frac{e^{-\mu t}(\mu t)^n}{n!}. \qquad (4.11)$$

For $n \ge 1$,

$$P(T_0 \in [t_0, t_0+dt_0],\ T_1 \in [t_1, t_1+dt_1] \mid N(t)=n) = \frac{\dfrac{e^{-\mu t_0}(\mu t_0)^{n-1}}{(n-1)!}\cdot \mu\,dt_0 \cdot e^{-\mu(t_1-t_0)} \cdot \mu\,dt_1}{\dfrac{e^{-\mu t}(\mu t)^n}{n!}} = n\mu\,\frac{t_0^{\,n-1}}{t^n}\,e^{-\mu(t_1-t)}\,dt_0\,dt_1.$$

The four factors in the numerator represent the probability of $n-1$ tests before $t_0$, one test in $dt_0$, no tests between $t_0$ and $t_1$, and finally, one test in $dt_1$. We will not consider $n=0$ since there is then no screen prior to $t$.
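Thus multiplying by $P(N(t)=n)$, for $n \ge 1$, and summing over $n$ gives,

$$P(T_0 \in [t_0, t_0+dt_0],\ T_1 \in [t_1, t_1+dt_1],\ N(t) \ge 1) = \sum_{n \ge 1} \frac{(\mu t_0)^{n-1}}{(n-1)!}\,\mu^2 e^{-\mu t_1}\,dt_0\,dt_1 = \mu^2 e^{\mu t_0} e^{-\mu t_1}\,dt_0\,dt_1 = \mu^2 e^{-\mu(t_1-t_0)}\,dt_0\,dt_1. \qquad (4.12)$$

Let $U = T_1 - T_0$, the length of the smear interval spanning $t$; then the Jacobian associated with the parameter transformation $(t_1, t_0) \to (u, t_0)$ is

$$J = \left|\frac{\partial t_1}{\partial u}\frac{\partial t_0}{\partial t_0} - \frac{\partial t_1}{\partial t_0}\frac{\partial t_0}{\partial u}\right| = |1 \cdot 1 - 1 \cdot 0| = 1.$$

The marginal distribution for $U$ is obtained by integrating (4.12) over $t_0$ where $t_0$ ranges from $\max\{(t-u), 0\}$ to $t$, giving,

$$f_{U,N(t)\ge 1}(u) = \begin{cases} \displaystyle\int_0^t \mu^2 e^{-\mu u}\,dt_0, & u \ge t \\[2ex] \displaystyle\int_{t-u}^t \mu^2 e^{-\mu u}\,dt_0, & u < t \end{cases} = \begin{cases} \mu^2 t\,e^{-\mu u}, & u \ge t \\ \mu^2 u\,e^{-\mu u}, & u < t. \end{cases} \qquad (4.13)$$

Now as $t \to \infty$ (i.e., the interval spans a point distant from the start of the process) and $P(N(t) \ge 1) \to 1$, then,

$$f_{U,N(t)\ge 1}(u) \to f_U(u) = \mu^2 u\,e^{-\mu u}. \qquad (4.14)$$

The interval length thus has a Gamma distribution (equal to the sum of two independent exponentials with parameter $\mu$).

II. Distribution of Spanning Intervals Ending in the Matching Period.

We will now calculate the same distribution, i.e., the distribution of the interval from the last smear ($T_0$) prior to $t$ to the time of the next smear, $T_1$, subject to the condition that $T_1 \le t+l$, i.e., the endpoint of the spanning interval lands in the matching period. For $N(t) \ge 1$,

$$P(T_0 \in [t_0, t_0+dt_0],\ T_1 \in [t_1, t_1+dt_1] \mid N(t)=n,\ T_1 < t+l) = \frac{\dfrac{e^{-\mu t_0}(\mu t_0)^{n-1}}{(n-1)!}\cdot \mu\,dt_0 \cdot e^{-\mu(t_1-t_0)} \cdot \mu\,dt_1}{\dfrac{e^{-\mu t}(\mu t)^n}{n!}\cdot (1 - e^{-\mu l})} = n\mu\,\frac{t_0^{\,n-1}}{t^n}\,\frac{e^{-\mu(t_1-t)}}{1 - e^{-\mu l}}\,dt_0\,dt_1.$$

Multiplying the terms by $P(N(t)=n)$ and summing gives,

$$P(T_0 \in [t_0, t_0+dt_0],\ T_1 \in [t_1, t_1+dt_1],\ N(t) \ge 1 \mid T_1 < t+l) = \frac{\mu^2 e^{-\mu(t_1-t_0)}}{1 - e^{-\mu l}}\,dt_0\,dt_1. \qquad (4.15)$$

Again, let $U = T_1 - T_0$, and integrate over $T_0$ to get the marginal distribution of $U$. The range of integration is determined by the possible values of $t_0$ under the constraint on $t_0$ and $t_1$ with respect to $t$. The integral may appear in two different forms depending on whether $l < t$ or $l > t$. We are typically only interested in the case $t > l$, i.e., where the length of the interval for having a smear is small compared to the time since the process started.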
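Thus, we have

$$f_{U,N(t)\ge 1|T_1<t+l}(u) = \begin{cases} \displaystyle\int_{t-u}^{t} \frac{\mu^2 e^{-\mu u}}{1 - e^{-\mu l}}\,dt_0, & 0 \le u < l \\[2ex] \displaystyle\int_{t-u}^{t+l-u} \frac{\mu^2 e^{-\mu u}}{1 - e^{-\mu l}}\,dt_0, & l \le u < t \\[2ex] \displaystyle\int_{0}^{t+l-u} \frac{\mu^2 e^{-\mu u}}{1 - e^{-\mu l}}\,dt_0, & t \le u < t+l \\[2ex] 0, & u \ge t+l. \end{cases} \qquad (4.16)$$

Hence,

$$f_{U,N(t)\ge 1|T_1<t+l}(u) = \begin{cases} \dfrac{\mu^2 u\,e^{-\mu u}}{1 - e^{-\mu l}}, & 0 \le u < l \\[2ex] \dfrac{\mu^2 l\,e^{-\mu u}}{1 - e^{-\mu l}}, & l \le u < t \\[2ex] \dfrac{\mu^2 (t+l-u)\,e^{-\mu u}}{1 - e^{-\mu l}}, & t \le u < t+l \\[2ex] 0, & u \ge t+l. \end{cases} \qquad (4.17)$$

We may now use this expression to derive the distribution of the screen interval length for spanning intervals required to fall within a matching interval. Let $t \to \infty$, then $P(N(t) \ge 1) \to 1$ and,

$$f_{U,N(t)\ge 1|T_1<t+l}(u) \to f_{U|T_1<t+l}(u) = \begin{cases} \dfrac{\mu^2 u\,e^{-\mu u}}{1 - e^{-\mu l}}, & u < l \\[2ex] \dfrac{\mu^2 l\,e^{-\mu u}}{1 - e^{-\mu l}}, & u \ge l. \end{cases} \qquad (4.18)$$

Note that if $l$ is small in (4.18) above (i.e., $l \ll \mu^{-1}$) then for practical purposes we have

$$f_{U|T_1<t+l}(u) = \mu e^{-\mu u} \qquad (4.19)$$

and the distribution of intervals which terminate at $t$ is the same as the overall distribution of intervals. This is the same result that was found for the uniformity model, and suggests that the method of sampling controls based on the criterion of having a screen close to the date of diagnosis of a case does not affect the distribution of interval lengths.

III. Risk of Disease Over the Study Period.

The preceding distributions are not yet directly applicable to case-control studies since they have not been calculated separately for cases and controls. We will now consider the derivation of appropriate distributions for cases and controls. We will shift our attention from matching intervals of length $l$ to the entire study period of length $L$. Let $t$ be the start of the study period. Then the study period consists of the time period $(t, t+L)$. Any individual screened positive in $(t, t+L)$ will be a case in the study. Let $T_0$ be the last screen prior to $t$, $M = m$ be the number of screens in the study period, and $T_i$ be the $i$th screen after $t$ but before $t+L$, $i = 1, \ldots, m$ (see diagram below).

[Diagram not reproduced: timeline of the study period $(t, t+L)$ showing the screen times $T_0, T_1, \ldots, T_m$.]

Define $U_1 = T_1 - T_0$, $U_{i+1} = T_{i+1} - T_i$, $i = 1, \ldots, m-1$. Thus the $U_i$'s are the intervals between screens ending in the study period. Note that $U_1$ is a spanning interval and thus we already have its distribution.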
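The distribution of the spanning interval in equation (4.14), a Gamma density with mean $2/\mu$ rather than the mean $1/\mu$ of an ordinary inter-screen interval, can be illustrated by simulating a homogeneous Poisson screening process. The sketch below is a Monte Carlo illustration under assumed values ($\mu = 1$ per year, $t = 20$ years); the function name and seed are arbitrary, and an interval with no screen before $t$ is measured from time 0, which matters only with negligible probability here.

```python
import random

def mean_spanning_interval(mu, t=20.0, n=20_000, seed=2):
    """Average length of the inter-screen interval that spans time t.

    Screens follow a homogeneous Poisson process of rate mu started at time 0;
    for each simulated history the interval (T0, T1) containing t is recorded.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        prev = 0.0
        while True:
            nxt = prev + rng.expovariate(mu)  # next screen time
            if nxt > t:                        # (prev, nxt) is the spanning interval
                break
            prev = nxt
        total += nxt - prev
    return total / n

mu = 1.0
spanning_mean = mean_spanning_interval(mu)   # theory: about 2 / mu
ordinary_mean = 1.0 / mu                      # mean of an exponential interval
```

The simulated spanning interval is on average about twice as long as an ordinary interval, which is the length-biased sampling effect that the case distribution inherits later in the derivation.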
All we need to do is weight this distribution by $u_1 I$ (where $I$ is the incidence) to obtain the probability of being diagnosed as a case in a spanning interval of length $U_1 = u_1$. Thus we will now consider the situations of screen intervals which begin and end within the study period.

IV. Risk of Disease from Intervals Contained in the Study Period.

Consider the joint distribution of two screening times $T_i$, $T_{i+1}$ in the interval $(t, t+L)$ during which there are $M = m$ screens. For $m \ge 2$,

$$P(T_i \in [t_i, t_i+dt_i],\ T_{i+1} \in [t_{i+1}, t_{i+1}+dt_{i+1}],\ M=m) = \frac{\{\mu(t_i-t)\}^{i-1} e^{-\mu(t_i-t)}}{(i-1)!}\cdot \mu\,dt_i \cdot e^{-\mu(t_{i+1}-t_i)} \cdot \mu\,dt_{i+1} \cdot \frac{\{\mu(t+L-t_{i+1})\}^{m-i-1} e^{-\mu(t+L-t_{i+1})}}{(m-i-1)!}$$

$$= \mu^m e^{-\mu L}\,\frac{(t_i-t)^{i-1}(t+L-t_{i+1})^{m-i-1}}{(i-1)!\,(m-i-1)!}\,dt_i\,dt_{i+1}, \qquad i = 1, 2, \ldots, m-1.$$

We wish to obtain the distribution of $U_{i+1} = T_{i+1} - T_i$. Now consider the transformation $(t_i, t_{i+1}) \to (t_i, u_{i+1})$. The marginal distribution of $U_{i+1}$, $f_{U_{i+1},M}(u_{i+1}, m)$, is obtained by integrating over $t_i$. Now $t \le T_i \le t+L-U_{i+1}$, thus, for $i = 1, 2, \ldots, m-1$,

$$f_{U_{i+1},M}(u_{i+1}, m) = \mu^m e^{-\mu L} \int_t^{t+L-u_{i+1}} \frac{(t_i-t)^{i-1}(t+L-t_i-u_{i+1})^{m-i-1}}{(i-1)!\,(m-i-1)!}\,dt_i.$$

But,

$$\int_t^{t+L-u_{i+1}} \frac{(t_i-t)^{i-1}(t+L-t_i-u_{i+1})^{m-i-1}}{(i-1)!\,(m-i-1)!}\,dt_i = \frac{(L-u_{i+1})^{m-1}}{(m-1)!},$$

so that,

$$f_{U_{i+1},M}(u_{i+1}, m) = \frac{\mu^m (L-u_{i+1})^{m-1}}{(m-1)!}\,e^{-\mu L}, \qquad i = 1, 2, \ldots, m-1.$$

The above is true for $0 \le u_{i+1} \le L$. For $u_{i+1} > L$, $f_{U_{i+1},M}(u_{i+1}, m) = 0$.

First notice that the formula for $u_{i+1}$ does not depend explicitly upon $i$ so that the same distribution holds for all $u_i$; thus for all $m \ge 2$, we may write,

$$f_{U,M}(u, m) = \frac{\mu^m (L-u)^{m-1}}{(m-1)!}\,e^{-\mu L}.$$

This (not unexpected) result says that conditional on the number of tests in a diagnostic interval the distribution of each inter-screen interval is the same. Now we must calculate what the probability is that an individual with a screen interval $U = u$ will develop cancer in the study period. Under the null hypothesis this probability is approximately $uI$, where $I$ is the incidence rate. Conditional on there being $M = m$ tests in the diagnostic interval there will be $m-1$ test intervals so that the probability that an individual will have cancer diagnosed at the end of an interval of length $u$ in a study period containing $m$ screens is approximately given by

$$P(D=1, U=u, M=m) \approx (m-1)\,uI\,f_{U,M}(u, m)$$

where the factor $(m-1)$ is included since there are $(m-1)$ intervals.
This calculation is approximate because we are ignoring the possibility that disease develops in earlier intervals when calculating the contribution of later intervals. Thus, for $m\ge 2$,

$$
P(D=1, U=u, M=m) \approx uI\,\frac{\mu^{m}(L-u)^{m-1}}{(m-2)!}\,e^{-\mu L}.
$$

In order to obtain $P(D=1, U=u)$ we must now sum over $m$:

$$
P(D=1, U=u) \approx uI\sum_{m=2}^{\infty}\frac{\mu^{m}(L-u)^{m-1}}{(m-2)!}\,e^{-\mu L}
=
\begin{cases}
\mu^{2}u(L-u)\,I\,e^{-\mu u}, & 0\le u<L,\ m\ge 2,\\
0, & \text{otherwise.}
\end{cases}
\tag{4.20}
$$

(Notice that the sum starts from $m=2$, since there are no screening intervals within the study period otherwise.)

V. Risk of Disease from the Spanning Interval.

We must now calculate the contribution of spanning intervals in which the person is found to have disease at their first test after $t$. We have calculated the distribution for such tests conditional on one existing (equation 4.18). In this case we are interested in the unconditional probabilities

$$
f(u) =
\begin{cases}
\mu^{2}u\,e^{-\mu u}, & 0\le u<L,\ m\ge 1,\\
\mu^{2}L\,e^{-\mu u}, & u\ge L,\ m\ge 1,\\
0, & \text{otherwise.}
\end{cases}
\tag{4.21}
$$

VI. Overall Risk of Disease

In order to calculate the overall distribution of cases detected at screening in the study period, $(t, t+L)$, we need only multiply (4.21) by $uI$ and add to (4.20), giving,

$$
P(D=1, U=u) = \mu^{2}u\,I\,L\,e^{-\mu u}, \qquad u\ge 0.
\tag{4.22}
$$

But,

$$
P(D=1) = \int_{0}^{\infty} P(D=1, U=u)\,du = IL,
$$

so that the distribution of screening intervals amongst the cases, $P(U=u\mid D=1)$, is

$$
P(U=u\mid D=1) = \mu^{2}u\,e^{-\mu u}, \qquad u\ge 0,
$$

i.e., the distribution of the spanning interval (equation 4.14). Surprisingly, the result does not depend on $L$. Under the null hypothesis one would expect the distribution of interval lengths to be the same for cases and controls. But we have just found that the distribution of interval lengths for cases is the same as the distribution of spanning intervals, while the distribution of interval lengths for controls is the underlying distribution. The Poisson and uniformity models both agree on these two results.

VII. Inter-subject Variability.

The fixed interval model considered previously has no within-subject variation, while the Poisson model just examined has no between-subject variation. In the case-control framework both models lead to biased estimates of relative risk in that cases tend to have longer intervals than controls under $H_0$.
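The two Poisson-model results above (spanning intervals have density $\mu^{2}ue^{-\mu u}$, while spanning intervals that end within a short matching window behave like the underlying exponential intervals) can be checked with a small simulation. This is only an illustrative sketch: the function name, the choice $\mu = 1/27.1$ per month, the window of one month, and the assumption that every screening history starts with a screen at time 0 are ours and are not part of the analysis in the text.

```python
import random

def spanning_intervals(mu, t, window, n_subjects, seed=1):
    """Simulate Poisson screening histories and collect the interval spanning t.

    Returns (all spanning intervals, spanning intervals ending in (t, t + window]).
    Assumption for illustration: each history starts with a screen at time 0.
    """
    rng = random.Random(seed)
    spanning, ending_soon = [], []
    for _ in range(n_subjects):
        start = 0.0
        while True:
            length = rng.expovariate(mu)   # exponential gap between screens
            end = start + length
            if end > t:                    # this interval spans the time point t
                spanning.append(length)
                if end <= t + window:      # and ends inside the matching window
                    ending_soon.append(length)
                break
            start = end
    return spanning, ending_soon

def mean(values):
    return sum(values) / len(values)

mu = 1 / 27.1                              # hypothetical intensity (screens per month)
spanning, ending_soon = spanning_intervals(mu, t=10 / mu, window=1.0,
                                           n_subjects=100_000)

# Means expressed in units of the underlying mean interval 1/mu.
ratio_spanning = mean(spanning) * mu       # theory: close to 2 (length-biased)
ratio_ending = mean(ending_soon) * mu      # theory: close to 1 for a short window
print(round(ratio_spanning, 2), round(ratio_ending, 2))
```

Under these assumptions the first ratio should sit near 2 and the second near 1, which is the sense in which cases (spanning intervals) tend to have longer intervals than sampled controls under the null hypothesis.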
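We can expand the Poisson model to include between-subject effects by mixing the "$\mu$" parameter. That is, assume that there is a distribution $g(\mu)$ of screening intensities within the population. For the cases we have,

$$
P(D=1, U=u \mid M=\mu) = \mu^{2}u\,I\,L\,e^{-\mu u}
$$

from equation (4.22), where $M$ here denotes an individual's screening intensity. Thus,

$$
\begin{aligned}
P(D=1, U=u, M=\mu) &= \mu^{2}u\,I\,L\,e^{-\mu u}\,g(\mu),\\
P(D=1, U=u) &= \int \mu^{2}u\,I\,L\,e^{-\mu u}\,g(\mu)\,d\mu,\\
P(D=1) &= \int\!\!\int \mu^{2}u\,I\,L\,e^{-\mu u}\,g(\mu)\,d\mu\,du,
\end{aligned}
$$

and thus the conditional distribution is

$$
f_{U\mid D=1}(u) = \frac{\displaystyle\int \mu^{2}u\,e^{-\mu u}\,g(\mu)\,d\mu}{\displaystyle\int\!\!\int \mu^{2}u\,e^{-\mu u}\,g(\mu)\,d\mu\,du}.
\tag{4.23}
$$

If $g(\mu)$ is chosen to be conjugate to $\mu^{2}u\,e^{-\mu u}$ then $P(u\mid D)$ will have a simple form. The density can be written in exponential form as,

$$
\exp\bigl[(-\mu)u + \log(u) + \log(\mu^{2})\bigr].
$$

Hence the conjugate prior is a gamma distribution (Cox & Hinkley²²), say,

$$
g(\mu) = \frac{\beta^{\alpha}\mu^{\alpha-1}e^{-\beta\mu}}{\Gamma(\alpha)}.
$$

Evaluating the numerator of (4.23),

$$
\begin{aligned}
\int \mu^{2}u\,e^{-\mu u}\cdot\frac{\beta^{\alpha}\mu^{\alpha-1}e^{-\beta\mu}}{\Gamma(\alpha)}\,d\mu
&= \frac{u\,\beta^{\alpha}}{\Gamma(\alpha)}\int \mu^{\alpha+1}e^{-\mu(u+\beta)}\,d\mu\\
&= \frac{u\,\beta^{\alpha}\,\Gamma(\alpha+2)}{\Gamma(\alpha)\,(u+\beta)^{\alpha+2}}\\
&= \frac{\alpha(\alpha+1)\,\beta^{\alpha}\,u}{(u+\beta)^{\alpha+2}}.
\end{aligned}
\tag{4.24}
$$

The denominator evaluates to unity; hence (4.24) is the conditional density, $f_{U\mid D=1}(u)$.

Next we must consider the controls. We found that for short matching intervals the distribution of screen intervals is approximately $\mu e^{-\mu u}$ (equation 4.19), that is,

$$
P(D=0, U=u \mid M=\mu) = \mu e^{-\mu u}.
$$

Then,

$$
\begin{aligned}
P(D=0, U=u, M=\mu) &= \mu e^{-\mu u}\,g(\mu),\\
P(D=0, U=u) &= \int \mu e^{-\mu u}\,g(\mu)\,d\mu,\\
P(D=0) &= \int\!\!\int \mu e^{-\mu u}\,g(\mu)\,d\mu\,du,\\
f_{U\mid D=0}(u) &= \frac{\displaystyle\int \mu e^{-\mu u}\,g(\mu)\,d\mu}{\displaystyle\int\!\!\int \mu e^{-\mu u}\,g(\mu)\,d\mu\,du}.
\end{aligned}
$$

This gives,

$$
f_{U\mid D=0}(u) = \frac{\alpha\,\beta^{\alpha}}{(u+\beta)^{\alpha+1}}.
\tag{4.25}
$$

Summary and Results

The uniformity and Poisson models give very similar results. First, the distribution of intervals which span a point in time converges to the distribution of intervals starting at any point if the spanning intervals are required to end in a decreasing matching period. Second, the distribution of intervals for cases equals the distribution of spanning intervals under the null hypothesis.

Under the fixed interval model each individual's screening history can be assumed to have an interval which spans a given point in time $t$. Hence the distribution of spanning intervals is equivalent to the distribution of screening frequencies for sequences of fixed length intervals. In the example, it was assumed that half the population were screened annually and the other half were screened every two years.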
Thus we would find the distribution of spanning intervals to be a half each for intervals of length one and two years, the same as the distribution of intervals for the cases. This differs from the distribution of intervals that end at a given time, which favoured one-year intervals by a ratio of two to one.

We used the B.C. screening program data to investigate the reasonableness of the assumption of uniform starting times. Table 5 presents the distribution of screen intervals which span Dec. 1979 by the starting month for intervals of length 1-5 years.

Table 5. The distribution of screen intervals of lengths 1-5 years which span December, 1979 by month of start of interval.

Years  Start year   Dec  Nov  Oct  Sept  Aug  Jul  Jun  May  Apr  Mar  Feb  Jan
1      1979          52  153  200   122  121  120  175  246  180  169  153  169
2      1979          45   68   49    39   23   27   38   46   32   56   53   58
       1978          33   48   48    51   29   28   40   45   46   46   54   49
3      1979          16   24   22    18   12   11   17   17    9   29    9   18
       1978          11   24   17    16   15    9   17   14   19   33   18   16
       1977           7   16   21    12    9   11   12   17   16   16   17   20
4      1979           0    6    4     9    5    8    8   17   13    9    5    8
       1978           5   11    3     3   11    4    7    9   10   17   11    6
       1977           1    8    9     6    3    1    8    5   10   12    9    5
       1976           7    8   14     8    3    3   12    6    4   16    8    6
5      1979           2    9    7     3    5    2    2    2    7    5    2    4
       1978           0    5    5     5    0    1    9    7    6    5    7    2
       1977           2    2    5     3    3    4    7    5    2    8    5    4
       1976           8    4    6     2    2    8    2    2    5    5    6    5
       1975           0    8    7     3    3    3    5    2    9    6    6   12

Except for a slight decrease in screening during vacation periods, the distribution of starting times appears to be quite uniform.

Figure 1 displays the distributions of interval lengths of all intervals which span 1980 (Dec. 1979) as well as the distribution of intervals which span 1980 and which end prior to 1981. The third distribution depicted is that of intervals which span 1980 and end prior to 1981, but the frequencies of the intervals have been weighted by the interval lengths.

[Figure 1: plot of the three distributions against interval length (months).]

Figure 1. Distribution of intervals spanning 1980 for all intervals and for intervals ending before 1981, unweighted and weighted by interval length.

The weighted distribution approximates closely the distribution of all spanning intervals. The distribution of unrestricted spanning intervals corresponds to the distribution of "in progress" intervals at a given point in time. If each individual had tests repeated at regular intervals but these intervals varied between individuals, then this distribution would correspond to the distribution of screening frequencies in the population. The distribution of intervals which end in a short time range approximates the distribution of intervals that end at a given point in time. The empirical results displayed in Figure 1 support the relation between the two distributions as predicted under the uniformity assumptions. That is, the probability of an interval ending in a narrow range equals the probability of spanning a point in time weighted inversely by the length of the interval.

Implications of the Poisson model with inter-subject variability were explored for different parameter values of the conjugate prior representing different degrees of variability. The mean and variance of a gamma distribution, in terms of the parameters $\alpha$ and $\beta$, are $\alpha/\beta$ and $\alpha/\beta^{2}$, respectively. We chose to fix the mean screening intensity to be the inverse of the unweighted sample mean of the screen interval lengths within the B.C. cervical screening program data set (to be described below). The mean interval length for 668,751 intervals was 27.10 months with a standard deviation of 27.39 months. Values for $\alpha$ and $\beta$ were selected to give a mean of 1/27.1 and coefficients of variation of 30%, 60%, and 100% to represent low, medium, and high variability respectively. The principal outcome of interest is the resulting odds ratio for various categories of interval length. To parallel the analyses to follow, we chose interval categories: 10-18 months, 19-30 months, 31-42 months, 43-54 months, 55-120 months, and > 120 months.
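The calculation for these categories can be sketched as follows, using the case density (4.24) and the control density (4.25). The gamma parameters are obtained from the stated mean and coefficient of variation ($\alpha = 1/\mathrm{CV}^{2}$, $\beta = \alpha/\text{mean}$). The closed-form tail probabilities in the code were obtained here by integrating (4.24) and (4.25); they and the function names are our own illustration and do not appear in the text.

```python
MEAN_INTENSITY = 1 / 27.1   # mean screening intensity (screens per month)

def gamma_params(mean, cv):
    """Gamma(alpha, beta) with the given mean and coefficient of variation."""
    alpha = 1.0 / cv ** 2          # CV of a gamma distribution is 1/sqrt(alpha)
    beta = alpha / mean            # mean of a gamma distribution is alpha/beta
    return alpha, beta

def tail_controls(u, alpha, beta):
    """P(U > u | D=0): integral of (4.25) from u to infinity."""
    return (beta / (u + beta)) ** alpha

def tail_cases(u, alpha, beta):
    """P(U > u | D=1): integral of (4.24) from u to infinity."""
    return (beta / (u + beta)) ** alpha * ((alpha + 1) * u + beta) / (u + beta)

def odds_ratio(lo, hi, cv, index_lo=120.0, mean=MEAN_INTENSITY):
    """Odds ratio of interval category [lo, hi] relative to (index_lo, infinity)."""
    alpha, beta = gamma_params(mean, cv)
    p_case = tail_cases(lo, alpha, beta) - tail_cases(hi, alpha, beta)
    p_ctrl = tail_controls(lo, alpha, beta) - tail_controls(hi, alpha, beta)
    ref_case = tail_cases(index_lo, alpha, beta)
    ref_ctrl = tail_controls(index_lo, alpha, beta)
    return (p_case / ref_case) / (p_ctrl / ref_ctrl)

for cv in (0.3, 0.6, 1.0):
    print(cv, round(odds_ratio(10, 18, cv), 3))
```

With a coefficient of variation of 100% this sketch gives roughly 0.368 for the 10-18 month category and 0.516 for the 19-30 month category. Results for the other settings depend on the exact parameter values and category boundaries used, which are assumed here.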
The odds ratios of interval category "exposure" (e.g. exposure to a 10-18 month screen interval), relative to the index category, > 120 months, are given by,

$$
\frac{P(U\in[10,18]\mid D=1)\;/\;P(U\in(120,\infty)\mid D=1)}{P(U\in[10,18]\mid D=0)\;/\;P(U\in(120,\infty)\mid D=0)}.
$$

The probabilities were obtained by integrating the appropriate conditional density over the specified range. The results are presented in Table 6.

Table 6. Theoretical odds ratio of interval length relative to > 120 months, cases vs. controls, by amount of inter-subject screening intensity variability in Poisson model.

                     Variability (coefficient of variation)
Interval length      low (30%)   medium (60%)   high (100%)
10-18 months           0.125        0.280          0.368
19-30 months           0.212        0.418          0.516
31-42 months           0.307        0.535          0.629
43-54 months           0.396        0.622          0.705
55-120 months          0.583        0.763          0.818

Evidently lower inter-subject variability in screening intensity results in more pronounced effects of interval length on prevalence odds ratios.

Chapter 5

Data Analysis

B.C. has a centralised cytology screening program which has been in operation since 1949. Until the early sixties, it was more of a diagnostic support service than a screening program. The proportion of women over 20 ever screened was about 3% in 1955¹. With the widespread use of oral contraceptives in the early sixties, the Pap test, which was applied in conjunction with the dispensing of oral contraceptives, became increasingly less of a diagnostic tool and more of a screening test. By 1962 it was estimated that 53% of women over 20 had ever been screened, and by 1969 this figure rose to 78%¹. The samples are obtained by general practitioners and gynaecologists for the most part, and then sent to a central laboratory where they are interpreted. Patients are assigned identification numbers which their physicians are supposed to provide along with the sample for patient identification, but name and date of birth are also recorded. The test results and other information are entered into a centralized cytology computer file.
The result, along with recommendations for further care, is returned to the referring physician, who is responsible for the care of the patient. Physicians are responsible for advising appropriate screening intervals for their patients. If, for example, a test is abnormal and not merely benign atypia, physicians will be sent a reminder for a repeat test if one is not obtained within four months. Also, for cases with histories of severe dysplasia or carcinoma in situ, reminders are automatically sent annually.

D.A. Boyes, B. Morrison, and colleagues¹ undertook a cohort study with data from the British Columbia Screening Program covering the years 1949-1969. Their objective was to provide estimates of prevalence and incidence rates of dysplasia or worse and carcinoma in situ or worse. Two cohorts were selected to cover as wide a range of age as possible as well as providing some overlap. "The records of all women who had been born in the years 1914-1918, and 1929-1933, and who had had 1 or more cervical tests [prior to 1969], were pulled from the identity files of the British Columbia Central Cytology Laboratory"¹ (52,452 and 66,701 women respectively). Extensive record linkage procedures were carried out in order to minimize duplications. The following information was extracted from the data base for every qualifying woman:

(1) Identifying information, i.e. the patient's surname (first 12 characters), first name (first 8 characters), second initial, month and year of birth, and husband's first name (first 4 characters);

(2) The month, year and cytological class of every smear taken up to the end of 1969 and, for women with subsequent positive histological findings, the original cytological class as assigned when the smear was first read, as well as a reviewed class based upon a re-examination of the specimens;

(3) All consequent radiation and surgical procedures, including cervical biopsy and hysterectomy.
Hysterectomies performed for reasons that were not the consequence of findings obtained from screening could not be completely documented unless a woman came for a repeat smear following hysterectomy.

(4) Histological diagnoses based on biopsies or other surgical procedures.

Follow-up on the original two cohorts has been updated to 1992 by Morrison²¹. This provides a longer period of observation for each woman as well as a greater overlap in age for the two cohorts (30 years instead of 5). The original study involved 121,722 women. Successful linkage was achieved for 43% providing updated data for 71,236 women. There were two reasons why linkage failed in about half the women. (1) Files which had a history of either all negative, no histology, no test in past 7 years, or death between 1976 and 1985 were removed from the archives. (2) The linkage method was conservative.

The "raw" data were prepared for analysis as follows. First, the data consisting of Pap test, histology, and death records were sorted chronologically within subjects. The records were then processed sequentially. A pair of records was defined to constitute a screen interval if the starting record had class 1, the ending record was a Pap test record, and the interval was at least 10 months. This operational definition of screening intervals was designed to achieve two objectives. First, we wanted to examine the risk of disease associated with routine screening, as opposed to diagnostic testing, which is often done when the presence of disease is suspected. Since we did not have any information other than the times and results of tests, we had to resort to the method of excluding intervals under 10 months. We decided to exclude intervals that started with an abnormal test result for the same reason. The excluded intervals create "gaps" in an individual's screening history.
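The operational definition above might be sketched as follows. The record layout (a month index, a record kind, and a cytological class) and the restriction to consecutive records are assumptions made purely for illustration; the actual file format and processing program are not described in the text.

```python
from typing import List, NamedTuple, Tuple

class Record(NamedTuple):
    month: int   # months since an arbitrary origin (hypothetical encoding)
    kind: str    # "pap", "histology", or "death" (hypothetical labels)
    cls: int     # cytological class for Pap records; 1 = negative

def screen_intervals(records: List[Record],
                     min_months: int = 10) -> List[Tuple[int, int]]:
    """Return (start_month, end_month) pairs that qualify as screen intervals.

    A consecutive pair qualifies if the starting record is a class 1 Pap test,
    the ending record is a Pap test, and the two are at least `min_months` apart.
    Pairs that fail these conditions form the "gaps" in the screening history.
    """
    ordered = sorted(records, key=lambda r: r.month)   # chronological order
    intervals = []
    for start, end in zip(ordered, ordered[1:]):
        if (start.kind == "pap" and start.cls == 1
                and end.kind == "pap"
                and end.month - start.month >= min_months):
            intervals.append((start.month, end.month))
    return intervals

# Made-up history: the 3-month repeat and the interval starting with a
# class 2 result are excluded.
history = [Record(0, "pap", 1), Record(12, "pap", 1), Record(15, "pap", 2),
           Record(30, "pap", 1), Record(45, "pap", 1)]
example_intervals = screen_intervals(history)
print(example_intervals)
```

In this made-up history only the 0-12 and 30-45 month pairs qualify; the short repeat test and the pair starting with an abnormal result fall into a gap.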
Although the gaps are not studied as screen intervals, they do play a role in the estimation of incidence rates and in the formation of covariate classes based on conditions preceding screen intervals.

Any records following either diagnosis of CIS or worse or a hysterectomy were excluded.

Each screen interval was assigned to a covariate class to contribute towards the comparison of cases and non-cases. For the cases this interval was the last interval before diagnosis. So each valid screen interval is included as a "last" interval and contributes to a covariate class defined by the following factors:

(1) Cohort: born 1914-1918 or 1929-1933.

(2) Period: the time period when the interval ended was categorized as: pre-1963, 1963-1975, 1976-1992. These cutpoints were chosen because the three periods were believed to differ with respect to sample characteristics and/or screening practices. Prior to 1963 screening tended to serve primarily as a diagnostic tool. At about 1963 the use of oral contraceptives became more prevalent, and since Pap tests were frequently performed in conjunction with the dispensing of oral contraceptives the test became more of a screening instrument. Prior to the mid 70s the diagnosis and treatment of abnormalities was quite a serious procedure, so it was common practice to wait until the disease had reached an advanced stage before taking action. However, with the introduction of the colposcope (an instrument which allows the visual inspection of the tissue) in the mid 70s, treatment was applied more readily.

(3) Interval: screen interval lengths were broken into categories: 10-18, 19-30, 31-42, 43-54, 55-119, 120+ months, the idea being to approximate 1, 2, 3, 4, and 5-10 year intervals.
This category is referred to as the "last interval" category since cases were assigned to the "last" interval.

(4) Previous: the combined lengths of the two screen intervals preceding the "last" interval were grouped in the following categories: 10-36, 37-60, 61-84, 85-119, 120+ months, no preceding intervals, and one preceding interval. The reason for looking at preceding intervals is to address the fact of false negative tests, i.e. tests which report no abnormalities when in fact disease is present. Thus an individual who tests negative twice is less likely to have disease than one who has tested negative only once. It may be useful to be able to estimate the risk of disease associated with various patterns of screening intervals. The choice of categories for the combined length of the two preceding intervals is intended to group individuals who were tested at fixed intervals of one, two, three, and four years. It also identifies individuals who had one or no preceding screens.

(5) Preceding gap abnormality: three categories of abnormality occurring in the "gap", if any, preceding the "last" interval: one for intervals with no preceding gap (either the "last" interval began with the termination of a screen interval with a class 1 test result or it was the individual's first record); a second category for "minor" signs of abnormality including class 2 or class 9 (inadequate sample) test results or class 1, 2, or 9 less than 10 months apart; and a third category for "major" signs of abnormality including class 3 or 4 test results or any histology record. The inclusion of this factor was motivated by an attempt to control what is probably a very important predictor of disease, namely previous abnormalities.

The date of diagnosis for cases with CIS or worse was defined to be the endpoint of the last screen interval preceding diagnosis.
The rationale for this definition was that tests done subsequent to the last screen interval were probably done for diagnostic purposes rather than routine screening. The exact date of onset of disease cannot be identified with much accuracy, so the decision is somewhat arbitrary. The total time-at-risk contributed by an individual extends from the start of the first screen interval to the end of the last record or until diagnosis or hysterectomy. It is partitioned into segments according to starting points of screen intervals. Gaps are assigned to the preceding screen interval. The segments are then assigned to covariate classes according to the covariate pattern of the screen interval defining the segment. Period is defined by the endpoints of the screen intervals. This method of assigning time at risk is method B of Boyes et al. and follows the principle that time at risk is assigned to the category to which the case would have been assigned had disease developed at that time.

There were a total of 1198 cases of CIS or worse among 1.1 million records from roughly 120,000 subjects. The numbers of cases and incidence rate estimates by covariate class are given in Appendix 1. There appears to be a trend for incidence rates to decrease with increasing interval length.

A Poisson regression analysis³⁰ was performed on the incidence data, predicting the log of the number of cases from cohort, period, length of last interval, combined length of the two intervals preceding the "last" interval, and the degree of abnormality in the gap preceding the "last" interval. Log time-at-risk was included as the offset. Poisson regression fits the log of the estimated incidence rates to a linear combination of the covariates, which has the effect of treating the covariates as having multiplicative effects on incidence rates. The covariates, being categorical, are represented by terms for all but one of the category levels, the omitted level being the index category.
Treatment contrasts were used to compare the effect of each level with the effect of the index category. The combination of all index categories is represented by the intercept. Let $n$ represent the number of cases, $t$ the aggregated time-at-risk and $x$ the vector of factor categories and an intercept; then the model can be expressed as,

$$
\log\{E(n)\} = \log(t) + \beta' x .
$$

The results of the model fit are presented in Table 7.

Table 7. Poisson regression of incidence rates on length of "last" interval, combined length of two preceding intervals, and abnormality in gap preceding the "last" interval.

Parameter                               Coef     s.e.    t value   risk
Intercept                             -11.1     0.330   -33.7
Cohort           born 1929-33           0.543   0.256     2.12
Period           1963-1975              0.192   0.200     0.958
                 1976-1992              0.074   0.221     0.336
Period*Cohort    period*cohort1         0.094   0.268     0.352
                 period*cohort2        -0.426   0.281    -1.52
Length of        55-120 months          0.273   0.175     1.56     1.31
last interval    43-54 months           0.581   0.168     3.46     1.79
                 31-42 months           0.890   0.166     5.35     2.44
                 19-30 months           0.966   0.161     6.02     2.63
                 10-18 months           1.31    0.159     8.22     3.71
Combined         85-120 months          0.217   0.235     0.921    1.24
length of        61-84 months          -0.051   0.223    -0.230    0.950
two intervals    37-60 months          -0.478   0.211    -2.26     0.620
preceding        10-36 months          -0.605   0.210    -2.88     0.546
the last         one previous           0.259   0.214     1.21     1.30
                 no previous            0.620   0.212     2.92     1.86
Preceding gap    minor                  0.507   0.067     7.59     1.66
abnormality      major                  1.23    0.217     5.70     3.42

Index categories: >120 months for last and preceding two intervals and no preceding gap for preceding gap abnormality.

The inclusion of an interaction term for cohort and period significantly improved the fit and thus was included in the model. No other interactions were significant. The model was also fit without an offset, and including log of the time at risk, whose coefficient was found to be 0.943 with a standard error of 0.052. This implies that the data are consistent with the assumed relation between incidence and time-at-risk. Visual inspection of residuals plotted against fitted values and by factor did not indicate any blatant signs of misfit.
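As a small illustration of how the offset enters the model, the expected number of cases in a covariate class is the time-at-risk multiplied by the exponentiated linear predictor. The sketch below uses the Table 7 coefficients for the intercept, the "last" interval factor and the preceding gap factor only; holding cohort, period and the preceding-interval factor at their index levels, and the dictionary and function names, are assumptions made for the example.

```python
import math

# Coefficients copied from Table 7 (log scale; time-at-risk in person-months).
INTERCEPT = -11.1
LAST_INTERVAL = {">120": 0.0, "55-120": 0.273, "43-54": 0.581,
                 "31-42": 0.890, "19-30": 0.966, "10-18": 1.31}
GAP_ABNORMALITY = {"none": 0.0, "minor": 0.507, "major": 1.23}

def expected_cases(person_months, last_interval=">120", gap="none"):
    """E(n) = t * exp(beta'x), i.e. log E(n) = log(t) + beta'x.

    Assumption: cohort, period and preceding-interval factors are held at
    their index levels, so their coefficients contribute zero.
    """
    linear_predictor = (INTERCEPT
                        + LAST_INTERVAL[last_interval]
                        + GAP_ABNORMALITY[gap])
    return person_months * math.exp(linear_predictor)

# Because log(t) is an offset, doubling the time-at-risk doubles the expected count.
print(expected_cases(12_000), expected_cases(24_000))
print(expected_cases(12_000, last_interval="10-18", gap="minor"))
```

The multiplicative structure is visible directly: changing a factor level multiplies the expected count by the exponent of its coefficient, whatever the time-at-risk.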
The dispersion parameter was estimated to be 1.1 using formula (6.4) from McCullagh & Nelder³⁰, which is given by,

$$
\hat\sigma^{2} = \frac{X^{2}}{n-p} = \frac{1}{n-p}\sum_{i}\frac{(y_i-\hat\mu_i)^{2}}{\hat\mu_i},
$$

where $n$ is the number of cells with non-zero time-at-risk, $p$ is the number of parameters in the model, $y_i$ is the observed count in cell $i$, and $\hat\mu_i$ is the model predicted count for cell $i$. A dispersion value greater than one suggests that there is "over-dispersion" in the data³⁰. The residual deviance was 406.3 on 552 degrees of freedom.

The rates presented in Table 7 are derived by taking exponents of the coefficients. Since "treatment" contrasts were used (i.e., the coefficient of the index factor category was set equal to zero), taking the exponents of other categories provides estimates of incidence rates relative to the index category. The index category was taken to be > 120 months for both interval length factors and no preceding abnormality for the preceding gap abnormality factor. The baseline incidence rate for the covariate class defined by all index categories is given by the exponent of the intercept, which equals 0.181 cases per 1000 person-years (note that the exponent of the intercept must be multiplied by 12,000 to transform the units from person-months to 1000 person-years). The degrees of freedom can be taken to be 552. Thus the t values can be compared with 1.96 for significance at the .05 level. All but one of the interval length levels were significant, as were the two shortest combined preceding interval lengths, the level representing no previous screens, and both levels of the preceding condition factor. It would seem that longer screen intervals are associated with reduced risk of disease, but that having had a couple of screens in the recent past is beneficial if the outcome is negative.

The Poisson model can be used to estimate incidence rates for joint combinations of factors.
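The conversions described above can be sketched in a few lines. The coefficients are copied from Table 7; the dictionary names, the helper function, and the particular pairing of "last" and preceding-interval levels used in the example (10-18 months with 10-36 months as an annual pattern, 19-30 months with 37-60 months as a two-yearly pattern) are illustrative assumptions.

```python
import math

# Coefficients copied from Table 7 (log scale; time-at-risk in person-months).
INTERCEPT = -11.1
LAST = {"10-18": 1.31, "19-30": 0.966, "31-42": 0.890,
        "43-54": 0.581, "55-120": 0.273, ">120": 0.0}
PRECEDING = {"10-36": -0.605, "37-60": -0.478, "61-84": -0.051,
             "85-120": 0.217, ">120": 0.0}

# Baseline rate: exp(intercept) is per person-month, so multiply by 12,000
# to express it per 1000 person-years.
baseline_per_1000_py = math.exp(INTERCEPT) * 12_000

def relative_rate(last_level, preceding_level):
    """Rate relative to baseline: product of the exponents of the coefficients."""
    return math.exp(LAST[last_level]) * math.exp(PRECEDING[preceding_level])

annual = relative_rate("10-18", "10-36")        # roughly annual screening pattern
two_yearly = relative_rate("19-30", "37-60")    # roughly two-yearly pattern
print(round(baseline_per_1000_py, 3), round(annual, 2), round(two_yearly, 2))
```

Multiplying a relative rate by the baseline gives an absolute rate per 1000 person-years for that combination of levels, again assuming all other factors sit at their index categories.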
If we match up categories from the combined lengths of the preceding two intervals with the length of the current interval, we can get an approximation of the risk of disease with annual, bi-annual (two-yearly), tri-annual (three-yearly), and quadra-annual (four-yearly) screening patterns. This pairing is appropriate because, if an interval length is to be recommended, it is assumed that it will be followed on a regular basis. Thus, for example, the interval length category of 19-30 months will be combined with the category of 37-60 months for the combined length of the two previous intervals. Together they represent the risk of a bi-annual screening pattern. The incidence rate relative to the combination of baseline categories is estimated by the product of the exponents of the respective coefficients. The estimated rates relative to baseline are: 2.02 for annual screeners, 1.63 for bi-annuals, 2.31 for tri-annuals, and 2.22 for individuals who are screened every four years. When the effect of previous screens is added to the effect of the "last" screen the two effects tend to dilute each other. But the effect of the "last" interval evidently wins out, since the incidence rate is still twice the rate of the reference group, which consists of individuals whose "last" interval was greater than 10 years and whose combined length of the previous two intervals is also greater than 10 years.

Incidence rates are based on numbers of cases occurring in units of time. We also examined the proportion of intervals in each covariate class which resulted in diagnosis of disease. This gives an index of the prevalence of disease in each covariate class. The prevalence rates are given in Appendix 2. Here the reverse trend of rate with respect to interval length is observed. We modelled the prevalence rates with logistic regression, using the same set of covariates as in the Poisson model. The results are summarized in Table 8.

Treatment contrasts were again used, so the exponents of the coefficients give estimates for the odds ratios of each of the factor levels with respect to the index levels.
The results are much the same as the Poisson regression results except for the effect of interval length, which in this case indicates decreasing risk with decreasing length rather than the other way around. But this does not necessarily imply that individuals are better off with shorter intervals, because an individual has to make it through relatively more intervals when they are shorter. The proper index for comparison is the risk per unit time. However, this does not quite equal incidence rates as they have been computed here, since simply dividing the prevalence rate for an interval by the length of the interval does not account for the time between intervals, the "gaps" during which diagnostic testing is presumably occurring.

Table 8. Logistic regression of prevalence cases on length of "last" interval, combined length of two preceding intervals, and abnormality in gap preceding the "last" interval.

Parameter                               Coef     s.e.    t value   risk
Intercept                              -5.98    0.329   -18.2
Cohort           born 1929-33           0.526   0.257     2.05
Period           1963-1975              0.210   0.201     1.04
                 1976-1992              0.120   0.222     0.539
Cohort*Period    cohort*period1         0.131   0.269     0.488
                 cohort*period2        -0.407   0.282    -1.45
Length of        55-120 months         -0.430   0.175    -2.46     0.651
last interval    43-54 months          -0.601   0.167    -3.59     0.548
                 31-42 months          -0.652   0.165    -3.94     0.521
                 19-30 months          -0.955   0.159    -5.99     0.385
                 10-18 months          -1.08    0.158    -6.84     0.340
Combined         85-120 months          0.230   0.236     0.973    1.26
length of        61-84 months          -0.040   0.224    -0.177    0.961
two intervals    37-60 months          -0.470   0.212    -2.21     0.625
preceding        10-36 months          -0.613   0.211    -2.91     0.542
the last         one previous           0.291   0.215     1.35     1.34
                 no previous            0.658   0.213     3.09     1.93
Preceding gap    minor                  0.554   0.067     8.29     1.74
abnormality      major                  1.28    0.219     5.88     3.60

Index categories: >120 months for last and preceding two intervals and no preceding gap for preceding gap abnormality.

By equation (3.2) the ratio of prevalence odds to incidence rate equals the average duration of disease. We computed model estimates of prevalence odds and incidence rates for each level of the "last" interval category. We then divided the prevalence odds by the corresponding incidence rate to obtain an estimate of the average duration of disease for each "last" interval category. The results are presented in Table 9.

Table 9. Model estimated incidence rates, prevalence odds and the ratio of prevalence odds to incidence rates by "last" interval length.

"Last" interval    Incidence rate           Prevalence rate     Prevalence/Incidence
length             (/1000 person-years)     (/1000 persons)     (years)
10-18 months            0.673                   0.860                 1.28
19-30 months            0.477                   0.974                 1.45
31-42 months            0.442                   1.32                  2.99
43-54 months            0.325                   1.39                  4.28
55-120 months           0.238                   1.65                  6.93
>120 months             0.181                   2.53                 14.0

The categories correspond roughly to intervals of length 1, 2, 3, 4, 5-10, and > 10 years. Clearly the estimated duration of disease is related to "last" interval length. One would expect the duration of disease to be longer on average if diagnosed after a long screen interval rather than a short screen interval. However, the average duration estimates are unrealistic in that they seem to correspond quite closely with the average interval length within each category, which would imply that on average disease occurred shortly after the start of the screen interval. Perhaps the duration is inflated by the occurrence of false negative tests.

The assumption of constant incidence overlooks one important factor, namely the stage of disease between normality and CIS, which is dysplasia. Assuming a negative test to be accurate, an individual cannot develop CIS without first going through a period of dysplasia.
There is a period of grace, namely the sojourn time for dysplasia, during which CIS cannot occur since dysplasia must first run its course. Hence the interval over which disease can develop is shorter than the nominal interval. This would have the effect of deflating the ratios of incidence rates to prevalence odds ratios, which may account for the observed results.

We also simulated the methodology of the case-control studies discussed earlier. Each of the cases was matched with seven controls on the basis of year of birth and to within five months of "date of diagnosis". That is, controls were required to have had a screen interval end within five months of the endpoint of the last screen interval before diagnosis of the matched case. The data were then subjected to a conditional likelihood logistic regression analysis using the PECAN package³¹. The same covariates were used, excluding cohort and period since these were controlled by design. The results are presented in Table 10.

The same pattern of results is observed as for the unconditional logistic regression model, although the two models are not identical. The case-control study matched by year of birth and date of diagnosis to within 5 months, while the cohort study merely controlled for cohort and period of diagnosis to within 1-2 decades. The risk estimates are somewhat less extreme for interval length and more extreme for the combined length of preceding screen intervals. The standard errors also tend to be slightly larger.

Table 10. Conditional likelihood logistic regression of a simulated matched case-control study.

Parameter                               Coef     s.e.    z score   risk
Length of        55-120 months         -0.744   0.228    -3.27     0.475
last interval    43-54 months          -0.909   0.219    -4.14     0.403
                 31-42 months          -0.954   0.220    -4.35     0.385
                 19-30 months          -1.31    0.213    -6.16     0.269
                 10-18 months          -1.42    0.212    -6.68     0.242
Combined         85-120 months          0.014   0.267     0.054    1.02
length of        61-84 months          -0.157   0.250    -0.628    0.855
two intervals    37-60 months          -0.703   0.237    -2.96     0.495
preceding        10-36 months          -0.858   0.235    -3.65     0.424
the last         one previous          -0.044   0.243    -0.180    0.957
                 no previous            0.292   0.244     1.19     1.34
Preceding gap    minor                  0.544   0.076     7.20     1.72
abnormality      major                  1.20    0.272     4.40     3.31

Index categories: >120 months for last and preceding two intervals and no preceding gap for preceding gap abnormality.

Although the two methods of modelling prevalence rates differ to some extent, they both suggest the same conclusion, namely that interval length is directly related to risk of disease, contrary to the results of the analysis of incidence rates.

We have seen that prevalence rates, as given by unconditional logistic regression with cohort data or conditional likelihood regression with case-control data, are inflated relative to incidence rates by a factor roughly equal to the average duration of disease. Thus, crude estimates of incidence rates could be obtained from prevalence rates by adjusting for the likely relation between the factor and disease duration. Another possibility would be to weight the sampling of controls, in case-control studies, to adjust for the effect of interval length as was done in Figure 1. This has the effect of changing the distribution of interval lengths from that of intervals starting or terminating at a point in time to that of intervals spanning a point, which, as we have seen, under the uniform and Poisson models of screening, corresponds to the distribution of interval lengths for cases under the null hypothesis.
This method may have the advantage, over simply correcting the obtained prevalence odds, of simultaneously correcting odds ratios obtained for factors which may be associated with interval length.

The suggested method was examined in another simulated, matched, case-control study using a weighted sample of controls. The criteria for control selection were the same as before, only this time the probability of sampling any given control from those eligible was weighted inversely by the length of the "last" interval (to the nearest year). The results are presented in Table 11.

Table 11. Conditional likelihood logistic regression of a simulated matched case-control study with sampling of controls weighted by length of "last" interval.

Parameter                                   Coef.     s.e.    z score   Relative risk
Length of last interval
  55-120 months                            -0.062    0.187    -0.335    0.940
  43-54 months                              1.15     0.186     6.21     3.17
  31-42 months                              0.834    0.183     4.56     2.30
  19-30 months                              0.977    0.177     5.51     2.66
  10-18 months                              1.70     0.179     9.50     5.48
Combined length of two intervals preceding the last
  85-120 months                             0.146    0.281     0.520    1.16
  61-84 months                             -0.216    0.265    -0.818    0.806
  37-60 months                             -0.671    0.251    -2.67     0.511
  10-36 months                             -0.836    0.250    -3.35     0.433
  one previous                             -0.031    0.256    -0.122    0.969
  no previous                               0.437    0.258     1.69     1.55
Preceding gap abnormality
  minor                                     0.515    0.076     6.74     1.67
  major                                     1.16     0.293     3.95     3.18

Index categories: >120 months for last and preceding two intervals, and no preceding gap for preceding gap abnormality.

Weighting the sampling of controls by length of "last" interval succeeded in reversing the direction of trend in the relation between length of "last" interval and relative risk. The resulting risk estimates are not too far off those produced by the Poisson regression model, although the estimate for the 43-54 month category appears to be a little high. Strangely, the standard errors are also closer to those of the Poisson model than the ones produced by the unweighted sample matched case-control analysis.

The ratios of unweighted sample (prevalence) to weighted sample ("incidence") odds ratios are 0.044, 0.102, 0.156, 0.127, and 0.505 for "last" interval categories corresponding to 1, 2, 3, 4, and 5-9 years respectively. The relative sizes roughly correspond to the relative lengths of the intervals except for the "4" category.

Chapter 6

Conclusion

The analysis of screen-detected disease involves a number of issues with respect to the design and analysis of cohort and case-control studies. Non-standard methods are required to estimate incidence rates for time-dependent factors such as interval length. Case-control studies of screen-detected disease produce estimates of prevalence odds ratios rather than incidence risk ratios. This can be misleading when the exposure variable is related to duration of disease. In the case of screening interval length, since CIS is a long-lasting disease, it is not surprising that higher prevalence rates are observed among individuals with longer screening intervals. But higher prevalence rates do not imply higher incidence rates, since after the completion of one screen interval another one begins with, perhaps, the same risk of disease.

To examine the effect of a method of sampling controls in a popular case-control paradigm, whereby controls are required to have had a screen near the time of diagnosis of a case, theoretical screening models were considered. All models imply that the distributions of interval lengths are different for cases and controls under the null hypothesis of no effect of screening. The distribution of interval lengths for sampled controls converges to the underlying distribution, whereas the distribution of interval lengths for cases is the same as the distribution of spanning intervals, favouring longer intervals.

Incidence rates evidently cannot be accurately inferred from prevalence rates unless the duration of disease is known. Brookmeyer, Day, and Moss [32] have proposed a method of estimating the duration of disease along with false negative rates.
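The length-bias argument made in this chapter can be illustrated with a small numerical sketch. The Python snippet below is an illustration only and is not part of the original analysis. It assumes an equilibrium renewal model in which a fixed time point falls in an interval with probability proportional to that interval's length. The underlying interval distribution is hypothetical, and 7 years is an assumed midpoint for the 5-9 year category. The snippet also rescales the five prevalence-to-incidence ratios quoted just before this chapter so that the 1-year category equals 1.

```python
# Illustrative sketch only. The interval distribution is hypothetical, and the
# length-biased form assumes an equilibrium renewal model of screening visits.

def spanning_interval_distribution(lengths, probs):
    """Return the length-biased distribution: P*(l) = l * P(l) / mean length."""
    mean_length = sum(l * p for l, p in zip(lengths, probs))
    return [l * p / mean_length for l, p in zip(lengths, probs)]

# Interval lengths in years; 7 is an assumed midpoint for the 5-9 year category.
lengths_years = [1, 2, 3, 4, 7]
# Hypothetical underlying distribution of interval lengths (what sampled
# controls would converge to under the null hypothesis).
underlying = [0.40, 0.30, 0.15, 0.10, 0.05]

# Distribution of spanning intervals (what cases would show under the null).
spanning = spanning_interval_distribution(lengths_years, underlying)

# Case-versus-control odds ratio for each category relative to the 1-year
# category, with no true effect of interval length on risk.
null_odds_ratios = [
    (s / u) / (spanning[0] / underlying[0])
    for s, u in zip(spanning, underlying)
]

# Ratios of unweighted to weighted odds ratios quoted in the text,
# rescaled so that the 1-year category equals 1.
reported_ratios = [0.044, 0.102, 0.156, 0.127, 0.505]
rescaled_ratios = [r / reported_ratios[0] for r in reported_ratios]

for length, null_or, rescaled in zip(lengths_years, null_odds_ratios, rescaled_ratios):
    print(f"{length} yr: null odds ratio {null_or:.2f}, rescaled reported ratio {rescaled:.2f}")
```

Under these assumptions the null odds ratios equal the interval lengths themselves, whatever underlying distribution is chosen, which is the pattern the rescaled reported ratios follow only roughly.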
Unfortunately, the Brookmeyer, Day, and Moss method presupposes that regression does not occur, and this would certainly not be the case for CIS.

With respect to the specific findings of this study, it would appear that frequent screening does not provide protection against the development of CIS. It is beneficial to have had negative screens in the past, but the incidence of diagnosed CIS is higher for shorter screening intervals than for longer ones.

One possible explanation for this paradoxical result is that, because the disease can regress naturally, some cases that arise and then regress during long screen intervals go undetected. The time that these individuals actually had the disease is erroneously counted as time-at-risk, deflating the incidence rate. Even more importantly, an incident case goes uncounted. Such errors would be less likely to occur under a frequent screening pattern. However, frequent screening should result in some avoidance of disease, in that precursor stages of disease, in this case dysplasia, are detected and treated. Thus frequent screeners should be disease free altogether, except for rapidly developing subtypes.

Another possible explanation for the observation of higher incidence rates in shorter intervals is that individuals who are at risk are more likely to go for frequent tests. We tried to control for this somewhat by controlling for the degree of abnormality occurring in the gap preceding the "last" interval. However, it is generally believed that high risk individuals are less likely to be screened frequently.

Although recency of previous tests provides some protection against disease, it does not appear to be enough to counter the effect of interval length. It would appear that the only logical conclusion would be that the best protection against the diagnosis of CIS is infrequent screening.

A shortcoming of the present study is the lack of control or information about the circumstances surrounding the decision to be tested.
The possibility of confounding variables is always a concern with observational studies. Even though the present study casts some doubt on the benefit of frequent screening for the prevention of diagnosis of CIS, as opposed to invasive cancer, a proper randomized control trial may still be considered unethical because of the possible implications for more advanced stages of disease.

Bibliography

[1] Boyes D.A., Morrison E.G., Knox E.G., Draper G.J., Miller A.B., A cohort study of cervical cancer screening in British Columbia. Clinical & Investigative Medicine, 1982; 5:1-29.
[2] Ellman R., Indications for colposcopy from a UK viewpoint, in Miller A.B., Chamberlain J., Day N.E., Hakama M., Prorok P.C. (Eds) Cancer Screening, 1991, Cambridge University Press.
[3] Day N.E., Screening for cancer of the cervix, Journal of Epidemiology and Community Health, 1989, 43:103-106.
[4] Anderson G.H., Boyes D.A., Benedet J.L., et al. Organisation and results of the cervical cytology screening programme in British Columbia, 1955-85, British Medical Journal, 1988, 296:975-978.
[5] Task Force Appointed by the Conference of Deputy Ministers of Health: Cervical cancer screening programs, Canadian Medical Association Journal, 1976, 114:1003-1033.
[6] Parkins in Hakama M., Miller A.B., Day N.E. (Eds) Screening for Cancer of the Uterine Cervix. 1986. IARC Scientific Publications, No. 76: Lyon.
[7] La Vecchia C., Franceschi S., Decarli A., et al. "PAP" smear and the risk of cervical neoplasia: quantitative estimate from a case-control study, The Lancet, 1984, Oct:779-782.
[8] Knox G. Case-control studies of screening procedures, Public Health, 1991, 105:55-61.
[9] Chronic Disease Reports: Deaths from cervical cancer - United States 1984-1986. MMWR, 1989, 38:650-659.
[10] Miller A.B., Knight J., Narod S. The natural history of cancer of the cervix, and the implication for screening policy, in
[11] Hutchinson M.L., Agarwal P., Denault T. et al. A new look at cervical cytology, Acta Cytologica, 1992, 36:499-504.
[12] Hakama M., Miller A.B., Day N.E. (Eds) Screening for Cancer of the Uterine Cervix. 1986. IARC Scientific Publications, No. 76: Lyon.
[13] Magnus and Langmark in Hakama M., Miller A.B., Day N.E. (Eds) Screening for Cancer of the Uterine Cervix. 1986. IARC Scientific Publications, No. 76: Lyon.
[14] Berrino in Hakama M., Miller A.B., Day N.E. (Eds) Screening for Cancer of the Uterine Cervix. 1986. IARC Scientific Publications, No. 76: Lyon.
[15] Clark E.A., Anderson T.W., Does screening by 'PAP' smears help prevent cervical cancer? The Lancet, 1979, July:1-4.
[16] Berrino F., Papanicolaou smears and risk of cervical neoplasia, The Lancet, 1984, Nov:1099-1100.
[17] MacGregor J.E., Moss S.M., Parkin D.M., Day N.E., A case-control study of cervical cancer screening in north east Scotland, British Medical Journal, 1985, 290:1543-1546.
[18] Rothman K.J., Modern Epidemiology, 1986, Little Brown and Company: Toronto.
[19] van Oortmarssen G.J., Habbema J.D.F., in Hakama M., Miller A.B., Day N.E. (Eds) Screening for Cancer of the Uterine Cervix. 1986. IARC Scientific Publications, No. 76: Lyon.
[20] Choi N.W., Nelson N.A. in Hakama M., Miller A.B., Day N.E. (Eds) Screening for Cancer of the Uterine Cervix. 1986. IARC Scientific Publications, No. 76: Lyon.
[21] Morrison B.J., Coldman A.J., Boyes D.A., Anderson G.H., Forty year follow-up of cervical screening: the cohort study of British Columbia, 1994. Unpublished manuscript.
[22] Habbema J.D.F. in Hakama M., Miller A.B., Day N.E. (Eds) Screening for Cancer of the Uterine Cervix. 1986. IARC Scientific Publications, No. 76: Lyon.
[23] Cox D.R., Hinkley D.V. Theoretical Statistics, Chapman and Hall, New York, 1974.
[24] Armitage P., Berry G. Statistical Methods in Medical Research, 1987, Blackwell Scientific Publications: London.
[25] Sheehe P.R. Dynamic risk analysis in retrospective matched pair studies of disease. Biometrics 1962; 18:323-341.
[26] Miettinen O.S. Estimability and estimation in case-referent studies. Am. J. Epidemiol. 1976; 103:226-235.
[27] Breslow N.E., Day N.E., Statistical Methods in Cancer Research, vol.1 1980, vol.2 1987, International Agency for Research on Cancer: Lyon.
[28] Elandt-Johnson R. Definition of rates: some remarks on their use and misuse. American Journal of Epidemiology, 1975; 102:267-271.
[29] Sasco A.J., Day N.E., Walter S.D., Case-control studies for the evaluation of screening, Journal of Chronic Disease, 1986, 39:399-405.
[30] McCullagh P., Nelder J.A., Generalized linear models, (2nd Ed.) Chapman and Hall: New York, 1989.
[31] Storer B.E., Wacholder S., Breslow N.E., Maximum likelihood fitting of general relative risk models to stratified data. Applied Statistics. 1983, 32:172-81.
[32] Brookmeyer R., Day N.E., Moss S. Case-control studies for estimation of the natural history of preclinical disease from screening data, Statistics in Medicine, 1986, 5:127-138.

Appendix 1

Number of cases and incidence rates per thousand person-years by:
- Cohort (1=born 1914-1918, 2=born 1929-1933)
- Period (1=pre-1963, 2=1963-1975, 3=1976-1992)
- Degree of abnormality in gap preceding last screen (1=no gap, 2=minor: <10 months or class 2 or 9, 3=major: class 3, 4 or histology)
- Length of last screen interval
- Combined length of two screens prior to last
- Cells with no time-at-risk indicated with "-"

Cohort 1Period 2 10-18Length of last screen19—30 31—42interval43—540.750.520.280.430. 
00(months)55—120 >1207 0.326 0.870 0.000 0.001 0.996 0.702 1.19o o.oo0 0.00Cohort 1 Length of last screeninterval (months)Period 1 10—18 19—30 31—4243—54 55—120 >120n rate n rate n rate n raten rate n rate4 0.741 0.560 0.000 0.000 0.000 0.000 —0 0.000 0.000 0.000 0.000 0.00GapiNo prey1 prey10—3637—6061—8485—120>120Gap 2No prey1 prey10—3637—6061—8485—120>120Gap 3No prey1 prey10—3637—6061—8485—120>1203 0.80 1 0.25 0 0.00 00.002 2.50 0 0.00 0 0.000 —0 0.00 0 0.00 0 0.000 —0 0.00 0 0.00 0 0.000 —153.33 0 0.00 0 — 0 —0 0.00 0 — 0 — 0 —0 — 0 — 0 — 0 —0 0.00 0 0.00 0 0.00 00.000 0.00 0 0.00 0 0.00 0 —0 0.00 0 0.00 0 — 0 —0 0.00 0 0.00 0 —0 —0 0.00 0 — 0 —0 —9 1.762 0.740 0.000 0.000 0.000 0.000 0.006 6.090 0.000 0.000 0.000 0.000 0.000 —0 0.000 0.000 —0 —0 —0 —0 —00— 0 0.00 0 — 0 — 0 —— 0 — 0 — 0 — 0—0 0.000 —0 —0 0.000 —0 —0 —0 0.000 —0 —0 —0 —0 —0 —0000000— 0 0.00— 0 —— 0 —— 0 —— 0 —— 0 —— 0 —0000000n rate n rate n rate n rate n raten rateGap 1No prey 11 1.42 12 0.98 11 0.96 141 prey 10 0.85 12 0.95 3 0.35 510—36 13 0.35 5 0.34 1 0.20 137—60 13 0.82 2 0.15 1 0.17 261—84 3 0.80 2 0.46 1 0.42 08185—120 2 1.14 1 0.56 2 1.89 2 2.44 0 0.00 0 0.00>120 1 1.65 1 1.79 1 3.16 0 0.00 0 0.00 0 —Gap 2No prey1 prey10—3637—6061—8485—120>120Gap 3No prey1 prey10—3637—6061—8485—120>120Cohort 15 2.212 0.596 0.585 1.133 2.800 0.000 0.002 27.001 12.241 5.771 12.300 0.000 0.000 0.002 1.001 0.404 0.942 0.711 1.071 2.680 0.000 0.001 26.490 0.001 42.860 0.000 0.000 0.0010000003 2.220 0.001 0.840 0.000 0.000 0.00o o.oo0 0.000 0.000 0.000 0.00o o.oo0 0.000 — 1.423 1.920 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 —0 —0 —0 0.002 1.881 2.010 0.000 0.000 0.000 0.000 0.000 —0 0.000 0.000 —0 —0 —0 0.000 —0 —0 —0 —0 —0 —Period 3 10-18n rateLength of last screen interval (months)19—30 31—42 43—54 55—120 >120n rate n rate n rate n rate n rateGap 1No prey1 prey10—3637—6061—8485—120>120Gap 2No prey1 prey10—3637—6061—8485—120>120Gap 3No prey1 prey10—3637—6061—8485—120>1200 0.000 0.005 0.205 0.444 
1.251 0.520 0.000 0.000 0.003 0.521 0.330 0.000 0.001 2.440 0.000 0.001 1.801 2.431 8.061 12.171 10.770 0.001 1.555 0.333 0.217 1.271 0.324 1.610 0.000 0.004 1.141 0.380 0.000 0.000 0.000 0.00o o.oo0 0.000 0.000 0.000 0.001 32.096 0.314 0.342 0.260 0.001 0.221 0.330 0.001 0.361 0.520 0.001 0.780 0.00o o.oo0 0.000 0.004 4.910 0.002 0.202 0.402 0.650 0.000 0.000 0.002 1.351 0.660 0.001 2.380 0.00o o.oo0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.133 0.242 0.281 0.210 0.000 0.001 2.830 0.001 0.600 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.005 2.390 0.001 0.122 0.153 0.382 0.332 0.410 0.000 0.000 0.002 1.040 0.001 1.620 0.000 0.000 0.00o o.oo0 0.000 0.000 0.000 0.0000000000. 000.000 • 000. 000.0082Cohort 2 Length of last screen interval (months)Period 1 10—18 19—30 31—42 43—54 55—120 >120n rate n rate n rate n rate n rate n rateGap 1No prey1 prey10—3637—6061—8485—120>120Gap 2No prey1 prey10—3637—6061—8485—120>120Gap 3No prey1 prey10—3637—6061—8485—120>1208 2.06 7 1.80 6 2.25 1 0.39 2 1.43 0 0.002 1.52 0 0.00 0 0.00 0 0.00 0 0.00 0 —0 0.00 0 0.00 0 0.00 0 0.00 0 0.00 0 —0 0.00 0 0.00 0 0.00 0 0.00 0 — 0 —0 0.00 0 0.00 0 0.00 0 — 0 — 0 —0 0.00 0 — 0 — 0 — 0 — 0 —0 — 0 — 0 — 0 — 0 — 0 —2 2.98 1 1.93 2 7.47 0 0.00 0 0.00 0 —0 0.00 0 0.00 0 0.00 0 0.00 0 0.00 0 —1 23.03 0 0.00 0 0.00 0 — 0 — 0 —0 0.00 0 0.00 0 0.00 0 — 0 — 0 —0 0.00 0 — 0 — 0 — 0 — 0 —0 0.00 0 — 0 — 0 — 0 — 0 —0 — 0 — 0 — 0 — 0 — 0 —0 0.00 0 0.00 0 0.00 0 — 0 — 00 0.00 0 — 0 — 0 — 0 — 00 — 0 — 0 — 0 — 0 — 00 — 0 — 0 — 0 — 0 — 00 — 0 — 0 — 0 — 0 — 00 — 0 — 0 — 0 — 0 — 00 — 0 — 0 — 0 — 0 — 0>120Cohort 2 Length of last screen interval (months)Period 2 10—18 19—30 31—42 43—54 55—120n rate n rate n rate n rate n rate n rateGap 1No prey 321 prey 3110—36 4937—60 3061—84 1285—120 6>120 2Gap 2No prey 321 prey 1810—36 1437—60 1061—84 385—120 4>120 0Gap 3No prey 0lprev 02.14 391.60 281.02 141.30 102.08 62.73 33.02 16.53 192.79 130.99 61.50 41.73 15.72 10.00 00.00 00.00 01.94 28 1.64 38 1.50 32 
1.24 6 0.771.43 19 1.50 12 0.94 8 1.12 0 0.000.71 4 0.61 0 0.00 0 0.00 0 0.000.56 1 0.13 2 0.37 0 0.00 0 0.001.05 2 0.69 2 0.88 0 0.00 0 0.001.27 1 0.80 1 1.14 0 0.00 0 0.001.71 0 0.00 0 0.00 0 0.00 0 —4.35 11 3.77 8 2.24 2 0.76 0 0,002.73 7 2.61 1 0.40 0 0.00 1 5.280.99 3 1.52 0 0.00 0 0.00 0 0.000.89 3 1.47 0 0.00 0 0.00 0 0.000.81 0 0.00 2 3.10 0 0.00 0 —2.18 0 0.00 0 0.00 0 0.00 0 0.000.00 0 0.00 0 0.00 0 0.00 0 —0.00 0 0.00 0 0.00 00.00 0 0.00 0 0.00 00.00 0— 0831 5.76 0 0.00 0 0.00 0 0.00 0o o.oo 0 0.00 0 0.00 0 0.00 00 0.00 0 0.00 0 0.00 0 —0 0.00 0 0.00 0 0.00 0 —0 0.00 0 0.00 0 — 0 —10—3637—6061—8485—120>120— 0 —— 0 —0 0.00 00 — 00 — 019—30 31—42 43—54 55—120 >120Cohort 2 Length of last screen interval (months)Period 3 10-18n rate n rate n rate n rate n rate n rateGap 1No prey 0 0.00 0 0.00 0 0.00 0 0.00 1 0.44 8 0.361 prey 1 1.83 3 2.81 1 0.76 3 1.06 2 0.33 7 0.5710—36 29 0.47 10 0.28 5 0.33 6 0.47 2 0.18 0 0.0037—60 10 0.35 14 0.40 9 0.41 5 0.22 1 0.06 0 0.0061—84 3 0.40 5 0.41 3 0.29 5 0.41 3 0.27 1 0.2085—120 3 0.83 3 0.47 3 0.54 9 1.12 1 0.13 0 0.00>120 2 0.78 1 0.25 1 0,26 2 0.35 0 0.00 1 0.61Gap 2No prey 0 0.00 1 23.76 0 0.00 0 0.00 1 1.13 0 0.001 prey 0 0.00 1 2.74 1 3.20 1 1.91 4 2.36 0 0.0010—36 13 0.88 2 0.23 1 0.32 0 0.00 0 0.00 2 0.7037—60 10 1.22 5 0.74 5 1.54 2 0.63 1 0.28 0 0.0061—84 4 2.10 2 1.00 2 1.62 0 0.00 0 0.00 0 0.0085—120 1 1.16 0 0.00 1 1.56 0 0.00 1 1.14 1 2.08>120 1 1.65 1 2.20 0 0.00 2 4.23 0 0.00 0 0.00Gap 3No prey 0 0.00 0 0.00 1 35.29 0 0.00 0 0.00 0 0.00lprev 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00 0 0.0010—36 3 1.96 0 0.00 1 2.85 0 0.00 0 0.00 0 0.0037—60 1 1.14 0 0.00 0 0.00 0 0.00 0 0.00 0 0.0061—84 1 4.74 0 0.00 0 0.00 0 0.00 0 0.00 0 0.0085—120 1 7.72 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00>120 0 0.00 0 0.00 0 0.00 0 0,00 0 0.00 0 0.0084Appendix 2Number of cases and prevalence rates per thousand individualsby:— Cohort (l=born 1914-1918, 2= born 1929-1933)— Period (l=pre—1963, 2= 1963—1975, 3=1976—1992)- Degree of abnormality in 
Gap preceding last interval(1=no gap, 2=minor—<1O months or class 2 or 9,3=major—class 3 or 4 or histology)— Length of last screen interval,— Combined length of two screens prior to last.— Cells with no time-at-risk indicated with“-“Cohort 1Period 2 10-18Length of last screen interval19—30 31—42 43—54(months)55—120 >120n rate n rate n rate n rate n rate n rateGap 1No prey 11 1.88 12 2.12 11 3.01 14 3.46 7 2.38 6 8.721 prey 10 1.13 12 2.05 3 1.11 5 2.33 6 6.29 2 14.2910—36 13 0.45 5 0.72 1 0.62 1 1.29 0 0.00 0 0.0037—60 13 1.09 2 0.32 1 0.53 2 1.92 0 0.00 0 0.00Cohort 1 Length of last screen interval (months)Period 1 10—18 19—30 31—42 43—54 55—120 >120n rate n rate n rate n rate n rate n rateGap 1No prey 9 2.30 4 1.57 3 2.51 1 1.14 0 0.00 0 0.00lprev 2 0.96 1 1.17 2 7.97 0 0.00 0 0.00 0 —10—36 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00 0 —37—60 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00 0 —61—84 0 0.00 0 0.00 1166.67 0 0.00 0 — 0 —85—120 0 0.00 0 0.00 0 0.00 0 — 0 — 0 —>120 0 0.00 0 — 0 — 0 — 0 — 0 —Gap 2Noprev 6 8.65 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00lprev 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00 0 —10—36 0 0.00 0 0.00 0 0.00 0 0.00 0 — 0 —37—60 0 0.00 0 0.00 0 0.00 0 0.00 0 — 0 —61—84 0 0.00 0 0.00 0 0.00 0 — 0 — 0 —85—120 0 0.00 0 — 0 0.00 0 — 0 — 0 —>120 0 — 0 — 0 — 0 — 0 — 0 —Gap 3Noprev 0 0.00 0 0.00 0 0.00 0 — 0 0.00 0 —lprev 0 0.00 0 — 0 — 0 — 0 — 0 —10—36 0 — 0 — 0 — 0 — 0 — 0 —37—60 0 — 00.000 — 0 — 0 — 0 —61—84 0 — 0 — 0 — 0 — 0 — 0 —85—120 0 — 0 — 0 — 0 — 0 — 0>120 0 — 0 — 0 — 0 — 0 — 08561—84 3 1.06 2 1.00 1 1.32 0 0.00 1 6.90 0 0.0085—120 2 1.54 1 1.21 2 5.95 2 10.81 0 0.00 0 0.00>120 1 2.15 1 3.79 1 9.90 0 0.00 0 0.00 0 —Gap 2No prey 5 3.21 2 2.20 3 7.06 3 6.55 0 0.00 1 14.71lprev 2 0.84 1 0.86 0 0.00 3 8.72 2 13.24 0 0.0010—36 6 0.80 4 2.01 1 2.62 0 0.00 1 14.09 0 0.0037—60 5 1.61 2 1.51 0 0.00 0 0.00 0 0.00 0 0.0061—84 3 3.89 1 2.38 0 0.00 0 0.00 0 0.00 0 0.0085—120 0 0.00 1 5.62 0 0.00 0 0.00 0 0.00 0 —>120 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00 0 —Gap 3No prey 2 40.82 0 
0.00 0 0.00 0 0.00 0 0.00 0 0.001 prey 1 21.28 1 62.50 0 0.00 0 0.00 0 — 0 —10—36 1 9.26 0 0.00 0 0.00 0 0.00 0 0.00 0 —37—60 1 17.54 1100.00 0 0.00 0 0.00 0 0.00 0 —61—84 0 0.00 0 0.00 0 0.00 0 — 0 — 0 —85—120 0 0.00 0 0.00 0 0.00 0 — 0 — 0 —>120 0 0.00 0 0.00 0 — 0 — 0 — 0 —Cohort 1 Length of last screen interval (months)Period 3 10—18 19—30 31—42 43—54 55—120 >120n rate n rate n rate n rate n rate n rateGap 1No prey 0 0.00 0 0.00 0 0.00 0 0.00 5 20.66 6 5.531 prey 0 0.00 1 3.42 4 15.94 0 0.00 0 0.00 4 5.4710—36 5 0.25 5 0.70 0 0.00 1 0.60 1 0.90 2 3.8237—60 5 0.57 3 0.45 2 0.63 3 1.10 2 1.13 0 0.0061—84 4 1.60 7 2.77 2 1.26 2 1.26 3 2.75 1 2.9785—120 1 0.69 1 0.69 2 2.02 1 0.94 2 2.46 1 4.33>120 0 0.00 4 3.42 0 0.00 0 0.00 2 3.02 0 0.00Gap 2No prey 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00 1 6.41lprev 0 0.00 0 0.00 0 0.00 1 13.51 0 0.00 1 8.6210—36 3 0.72 4 2.43 2 4.26 0 0.00 0 0.00 0 0.0037—60 1 0.45 1 0.81 1 2.09 1 2.72 2 7.69 1 11.2461—84 0 0.00 0 0.00 0 0.00 0 0,00 0 0.00 0 0.0085—120 0 0.00 0 0.00 1 7.75 0 0.00 1 11.77 0 0.00>120 1 3.53 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00Gap 3Noprev 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00 0 —1 prey 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00 0 0.0010—36 1 2.53 0 0.00 0 0.00 0 0.00 0 0.00 0 0.0037—60 1 3.80 0 0.00 0 0.00 0 0.00 0 0.00 0 0.0061—84 1 12.19 0 0.00 0 0.00 0 0.00 0 0.00 0 —85—120 1 15.87 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00>120 1 17.24 1 62.50 0 0.00 0 0.00 0 0.00 0 0.0086Cohort 2 Length of last screen interval (months)Period 1 10—18 19—30 31—42 43—54 55—120 >120n rate n rate n rate n rate n rate n rateGap 1No prey1 prey10—3637—6061—8485—120>120Gap 2No prey1 prey10—3637—6061—8485—120>120Gap 3No prey1 prey10—3637—6061—8485—120>120Cohort 28 2.66 7 3.75 6 7.07 1 1.71 2 9.90 0 0.002 1.93 0 0.00 0 0.00 0 0.00 0 0.00 0 —0 0.00 0 0.00 0 0.00 0 0.00 0 0.00 0 —0 0.00 0 0.00 0 0.00 0 0.00 0 — 0 —0 0.00 0 0.00 0 0.00 0 — 0 — 0 —0 0.00 0 — 0 — 0 — 0 — 0 —0 — 0 — 0 — 0 — 0 — 0 —2 3.98 1 4.10 2 24.10 0 0.00 0 0.00 00 0.00 0 0.00 0 0.00 0 0.00 0 0.00 
01 30.30 0 0.00 0 0.00 0 — 0 — 00 0.00 0 0.00 0 0.00 0 — 0 — 00 0.00 0 — 0 — 0 — 0 — 00 0.00 0 — 0 — 0 — 0 — 00 — 0 — 0 — 0 — 0 — 00 0.00 0 0.00 0 0.00 0 — 0 — 00 0.00 0 — 0 — 0 — 0 — 00 — 0 — 0 — 0 — 0 — 00 — 0 — 0 — 0 — 0 — 00 — 0 — 0 — 0 — 0 — 00 — 0 — 0 — 0 — 0 — 00 — 0 — 0 — 0 — 0 — 0Period 2 10-18n rate n(months)55—120 >120rate n rateLength of last screen interval19—30 31—42 43—54rate n rate n rate n4.23 28 5.21 38 6.96 323.12 19 4.84 12 4.32 81.52 4 1.96 0 0.00 01.22 1 0.42 2 1.66 02.30 2 2.21 2 3.99 02.83 1 2.53 1 5.24 03.73 0 0.00 0 0.00 0Gap 1No prey 321 prey 3110—36 4937—60 3061—84 1285—120 6>120 2Gap 2No prey 321 prey 1810—36 1437—60 1061—84 385—120 4>120 0Gap 3No prey 0lprev 09.198 . 060. 000. 000. 000. 000. 002.89 392.16 281.34 141.77 102.88 63.72 34.08 19.62 194.12 131.42 62.20 42.60 18.62 10.00 00.00 00.00 060000009.590.000. 000. 000. 000. 009.87 11 12.116.12 7 8.552.20 3 4.931.97 3 4.801.81 0 0.004.74 0 0.000.00 0 0.008 10.281 1.830 0.000 0.002 13.890 0.000 0.002 5.680 0.000 0.000 0.000 0.000 0.000 0.00o o.oo1 62.50o o.oo0 0.000 —0 0.000 —0.00 0 0.00 0 0.00 00.00 0 0.00 0 0.00 00.00 0— 08710—3637—6061—8485—120>120100007.41 0 0.00 0 0.00 0 0.00 0 — 0 —0.00 0 0.00 0 0.00 0 0.00 0 — 0 —0.00 0 0.00 0 0.00 0 —0 0.00 00.00 0 0.00 0 0.00 0 — 0— 0 —0.00 0 0.00 0 — 0 — 0 — 0 —Cohort 2 Length of last screen interval(months)Period 3 10—18 19—30 31—42 43—54 55—120 >120n rate n rate n rate n rate n rate n rateGap 1No prey 0 0.00 0 0.00 0 0.00 0 0.00 1 3.77 8 6.321 prey 1 2.76 3 6.55 1 2.46 3 5.09 2 2.62 7 9.3510—36 29 0.60 10 0.60 5 1.02 6 2.11 2 1.34 00.0037—60 10 0.46 14 0.85 9 1.30 5 0.99 1 0.42 0 0.0061—84 3 0.54 5 0.89 3 0.92 5 1.83 3 1.96 1 2.7085—120 3 1.11 3 1.03 3 1.70 9 5.09 1 0.93 00.00>120 2 1.03 1 0.54 1 0.83 2 1.59 0 0.00 1 7.46Gap 2Noprev 0 0.00 147.62 0 0.00 0 0.00 1 8.48 00.001 prey 0 0.00 1 6.49 1 10.64 1 9.35 4 18.690 0.0010—36 13 1.26 2 0.52 1 1.01 0 0.00 0 0.00 210.2037—60 10 1.73 5 1.63 5 4.91 2 2.83 1 2.11 00.0061—84 4 2.98 2 2.24 2 5.11 
0 0.00 0 0.00 00.0085—120 1 1.62 0 0.00 1 5.18 0 0.00 1 8.40 1 27.78>120 1 2.32 1 4.72 0 0.00 2 19.61 0 0.00 00.00Gap 3No prey 0 0.00 0 0.00 1200.00 0 0.00 0 0.00 00.00lprev 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00 00.0010—36 3 2.86 0 0.00 1 9.01 0 0.00 0 0.00 00.0037—60 1 1.62 0 0.00 0 0.00 0 0.000 0.00 0 0.0061—84 1 6.54 0 0.00 0 0.00 0 0.000 0.00 0 0.0085—120 1 12.35 0 0.00 0 0.00 0 0.00 0 0.00 00.00>120 0 0.00 0 0.00 0 0.00 0 0.00 0 0.00 00.0088
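The two appendices can be compared cell by cell, since they tabulate the same cases against different denominators. The Python sketch below is an illustration added here, not part of the original analysis. It takes one row that appears in both appendices (Cohort 2, Period 3, Gap 1, combined preceding intervals of 10-36 months) and divides each prevalence rate by the corresponding incidence rate. Because both rates share the same case count, the quotient equals person-years divided by the number of individuals counted, which approximates the mean time-at-risk per individual in years. The assumption that each counted individual contributes one "last" interval to a cell is ours. The >120 month column is omitted because it contains no cases.

```python
# Illustrative sketch only: values are copied from the Cohort 2, Period 3,
# Gap 1, "10-36" rows of Appendix 1 and Appendix 2.

last_interval_months = ["10-18", "19-30", "31-42", "43-54", "55-120"]

# Appendix 1: incidence rates per thousand person-years.
incidence_per_1000_py = [0.47, 0.28, 0.33, 0.47, 0.18]
# Appendix 2: prevalence rates per thousand individuals.
prevalence_per_1000 = [0.60, 0.60, 1.02, 2.11, 1.34]

def implied_years_at_risk(prevalence, incidence):
    """Prevalence/incidence = person-years per individual when case counts match."""
    return prevalence / incidence

implied_years = [
    implied_years_at_risk(p, i)
    for p, i in zip(prevalence_per_1000, incidence_per_1000_py)
]

for category, years in zip(last_interval_months, implied_years):
    print(f"last interval {category} months: about {years:.2f} years at risk per individual")
```

In this row the implied time-at-risk grows steadily with the length of the "last" interval, which illustrates why prevalence rates rise with interval length even where the incidence rates do not.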

