Open Collections

UBC Theses and Dissertations

The value of the tuberculin skin test size in predicting the development of tuberculosis in contacts… Morán Mendoza, A. Onofre 2004

THE VALUE OF THE TUBERCULIN SKIN TEST SIZE IN PREDICTING THE DEVELOPMENT OF TUBERCULOSIS IN CONTACTS OF ACTIVE CASES

by

A. ONOFRE MORAN MENDOZA

M.D., Universidad Autónoma de San Luis Potosí, Mexico, 1986
M.Sc., Universidad Nacional Autónoma de México, 1992

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

FACULTY OF GRADUATE STUDIES
(Health Care and Epidemiology)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
April, 2004
© A. Onofre Morán Mendoza, 2004

ABSTRACT

Objectives: A) To identify the risk factors for tuberculosis (TB) development among contacts of TB cases. B) To determine the tuberculin skin test (TST) size cut-offs that best predict the development of TB for various groups.

Methods: A retrospective cohort study of the contacts of active TB cases recorded in the Division of TB Control (DTB) of BC from 1990 to 2000 was carried out to identify the development of TB. The prognostic factors for TB development were obtained from the DTB and from several provincial databases. Contacts with HIV or previous TB were excluded. The prognostic factors evaluated were: TST size, age, gender, ethnicity, latent TB infection (LTBI) treatment, type of contact, previous BCG, immunosuppressive conditions and others.

Results: Among the 33,146 contacts identified in the DTB, 228 developed TB during the 12-year follow-up period (TB rate 688/100,000) and 82% of TB cases occurred within 2 yrs after exposure. The main prognostic factors for TB development were: malnutrition, no LTBI treatment, TST size >5 mm, being a household contact, and young age (all hazard ratios >5; p-values <0.0001).
Other significant factors were: malignancy, use of corticosteroids, injection drug use, previous BCG status, alcoholism, exposure to a smear positive TB case, lower socioeconomic status and being an Aboriginal person. The TST sizes that best predicted the development of TB varied among different groups: household contacts with TST sizes 0-4 mm (reference category) had a TB rate of 1,014/100,000; those with 5-9 mm a TB rate of 2,162/100,000 (adjusted hazard ratio: 10.0; 95% CI: 3.3 to 28.9). Children 0-10 yrs old with 0-4 mm had a TB rate of 806/100,000; those with 5-9 mm a TB rate of 5,556/100,000 (adjusted hazard ratio: 6.0; 95% CI: 0.8 to 47.4). Immunosuppressed contacts with TST sizes of 0-4 mm had a TB rate of 686/100,000; those with 5-9 mm a TB rate of 2,000/100,000 (adjusted hazard ratio: 2.6; 95% CI: 0.3 to 22.9). The risk of TB increased progressively with larger TST sizes.

Conclusions: Household contacts, close contacts 0-10 yrs old and immunosuppressed contacts may benefit from preventive treatment regardless of their TST size.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
LIST OF DEFINITIONS
DEDICATION
ACKNOWLEDGEMENTS
HYPOTHESES
OBJECTIVES
BACKGROUND
RATIONALE
METHODS
STATISTICAL ANALYSIS
ETHICS
DATABASES MANAGEMENT
RESULTS
DISCUSSION
LIMITATIONS
CONCLUSIONS
BIBLIOGRAPHY
APPENDIX I TST Size Guidelines
APPENDIX II Information Available in the DTB databases
APPENDIX III Quality Control and Special Issues in Data Management
APPENDIX IV Proportional Hazards Assessment

LIST OF TABLES

Table 1. Information contained in the DTB databases
Table 2. Number of subjects and variables in the DTB files 1990-2001
Table 3. Duplicated contacts in the original TB database
Table 4. Number of duplicates by frequency and time of occurrence
Table 5. Contacts according to the frequency and period of appearance
Table 6. Cases re-categorized
Table 7. Contacts according to the number of TB cases exposed
Table 8. Contacts according to the number of TB cases exposed (collapsed)
Table 9. Number of unique contacts in DTB databases searched in BCLHD by period
Table 10. Number of unique contacts from the DTB searched and found at BCLHD
Table 11. Number of unique contacts found in the BCLHD databases
Table 12. Description of prognostic factors and their availability in the databases
Table 13. Distribution of the TST size among contacts (1st TST)
Table 14. Occurrence of TB among contacts after exposure to the source case
Table 15. Number of TB cases by age (in quartiles)
Table 16. Cox regression. Only age in the model
Table 17. Cox regression. Only age (categorized) in the model
Table 18. Cox regression. Only gender in the model
Table 19. Cox regression. Only LTBI treatment in the model
Table 20. Cox regression. Only LTBI treatment in the model (0, <6 m and >6 m)
Table 21. Cox regression. Only number of TB cases exposed in the model
Table 22. Cox regression. Only ethnicity in the model
Table 23. Cox regression. Only contact type in the model
Table 24. Cox regression. Only smear results of TB source case in the model
Table 25. Cox regression. Only immunosuppression in the model
Table 26. Cox regression. Only immunosuppressive conditions in the model
Table 27. Cox regression. Only high-risk in the model
Table 28. Cox regression. Only TST size (first applied) in the model
Table 29. Cox regression. Only TST size (maximum of 1st or 2nd) in the model
Table 30. Cox regression. Only BCG status in the model
Table 31. Cox regression. Only SE status in the model
Table 32. Cox regression. Only SE status (3 categories) in the model
Table 33. Cox regression. Only previous TST response in the model
Table 34. Cox regression. Only abnormal chest X-ray in the model
Table 35. Cox regression (univariate analysis)
Table 36. Cox regression model with all variables included (1st 2nd TST)
Table 37. Cox regression model with all relevant variables included (1st 2nd TST)
Table 38. Cox regression model using forward stepwise (1st 2nd TST)
Table 39. Cox regression model using backward stepwise (1st 2nd TST)
Table 40. Final Cox regression model (1st 2nd TST)
Table 41. Cox regression model with all variables included (1st TST)
Table 42. Cox regression model using forward stepwise (1st TST)
Table 43. Cox regression model using backward stepwise (1st TST)
Table 44. Final Cox regression model (1st TST)
Table 45. Cox regression model for contacts with no LTBI Rx (1st TST)
Table 46. Number of TB cases and the TB rates for the different TST sizes (1st TST). Contacts with no LTBI treatment received
Table 47. TB risk according to the TST size from the 1st TST (univariate analysis). Contacts with no LTBI treatment
Table 48. TB risk according to the TST size from the 1st TST (multivariate analysis). Contacts with no LTBI treatment
Table 49. Number of TB cases and the TB rates for the different TST sizes (1st TST). Household contacts
Table 50. TB risk according to the TST size of the 1st TST (univariate analysis). Household contacts
Table 51. TB risk according to the TST size of the 1st TST (multivariate analysis). Household contacts
Table 52. Number of TB cases and the TB rates for the different TST sizes (1st TST). Non-household contacts
Table 53. TB risk according to the TST size of the 1st TST (univariate analysis). Non-household contacts
Table 54. TB risk according to the TST size of the 1st TST (multivariate analysis). Non-household contacts
Table 55. Number of TB cases and the TB rates for the different TST sizes (1st TST). Casual contacts
Table 56. TB risk according to the TST size of the 1st TST (univariate analysis). Casual contacts
Table 57. TB risk according to the TST size of the 1st TST (multivariate analysis). Casual contacts
Table 58. Number of TB cases and the TB rates for the different TST sizes (1st TST). Non-immunosuppressed contacts
Table 59. TB risk according to the TST size of the 1st TST (univariate analysis). Non-immunosuppressed contacts
Table 60. TB risk according to the TST size of the 1st TST (multivariate analysis). Non-immunosuppressed contacts
Table 61. Number of TB cases and the TB rates for the different TST sizes (1st TST). Immunosuppressed contacts
Table 62. TB risk according to the TST size of the 1st TST (univariate analysis). Immunosuppressed contacts
Table 63. TB risk according to the TST size of the 1st TST (multivariate analysis). Immunosuppressed contacts
Table 64. Number of TB cases and the TB rates for the different TST sizes (1st TST). No previous BCG
Table 65. TB risk according to the TST size of the 1st TST (univariate analysis). No previous BCG
Table 66. TB risk according to the TST size of the 1st TST (multivariate analysis). No previous BCG
Table 67. Number of TB cases and the TB rates for the different TST sizes (1st TST). Previous BCG
Table 68. TB risk according to the TST size of the 1st TST (univariate analysis). Previous BCG
Table 69. TB risk according to the TST size of the 1st TST (multivariate analysis). Previous BCG
Table 70. Number of TB cases and the TB rates for the different TST sizes (1st TST). Contacts 0-10 yrs old
Table 71. TB risk according to the TST size from the 1st TST (univariate analysis). Contacts 0-10 yrs old
Table 72. TB risk according to the TST size from the 1st TST (multivariate analysis). Contacts 0-10 yrs old
Table 73. Number of TB cases and the TB rates for the different TST sizes (1st TST). Contacts >10 yrs old
Table 74. TB risk according to the TST size from the 1st TST (univariate analysis). Contacts >10 yrs old
Table 75. TB risk according to the TST size from the 1st TST (multivariate analysis). Contacts >10 yrs old
Table 76. Number of TB cases and the TB rates for the different TST sizes (1st TST). Canadian-born contacts
Table 77. TB risk according to the TST size from the 1st TST (univariate analysis). Canadian-born contacts
Table 78. TB risk according to the TST size from the 1st TST (multivariate analysis). Canadian-born contacts
Table 79. Number of TB cases and the TB rates for the different TST sizes (1st TST). Aboriginal contacts
Table 80. TB risk according to the TST size from the 1st TST (univariate analysis). Aboriginal contacts
Table 81. TB risk according to the TST size from the 1st TST (multivariate analysis). Aboriginal contacts
Table 82. Number of TB cases and the TB rates for the different TST sizes (1st TST). Foreign-born contacts
Table 83. TB risk according to the TST size from the 1st TST (univariate analysis). Foreign-born contacts
Table 84. TB risk according to the TST size from the 1st TST (multivariate analysis). Foreign-born contacts
Table 85. Number of TB cases and the TB rates for the different TST sizes (1st TST). Injection drug users
Table 86. Reduced Cox regression model (1st or 2nd TST)
Table 87. Number of TB cases and the TB rates for the different TST sizes (1st or 2nd TST). Contacts with no LTBI treatment received
Table 88. TB risk according to the TST size from the 1st or 2nd TST (univariate analysis). Contacts with no LTBI treatment
Table 89. TB risk according to the TST size from the 1st or 2nd TST (multivariate analysis). Contacts with no LTBI treatment
Table 90. Number of TB cases and the TB rates for the different TST sizes (1st or 2nd TST). Household contacts
Table 91. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis). Household contacts
Table 92. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). Household contacts
Table 93. Number of TB cases and the TB rates for the different TST sizes (1st or 2nd TST). Non-household contacts
Table 94. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis). Non-household contacts
Table 95. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). Non-household contacts
Table 96. Number of TB cases and the TB rates for the different TST sizes (1st or 2nd TST). Casual contacts
Table 97. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis). Casual contacts
Table 98. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). Casual contacts
Table 99. Number of TB cases and the TB rates for the different TST sizes (1st or 2nd TST). Non-immunosuppressed contacts
Table 100. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis). Non-immunosuppressed contacts
Table 101. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). Non-immunosuppressed contacts
Table 102. Number of TB cases and the TB rates for the different TST sizes (1st 2nd TST). Immunosuppressed contacts
Table 103. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis). Immunosuppressed contacts
Table 104. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). Immunosuppressed contacts
Table 105. Number of TB cases and the TB rates for the different TST sizes 1st or 2nd TST. No previous BCG
Table 106. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis). No previous BCG
Table 107. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). No previous BCG
Table 108. Number of TB cases and the TB rates for the different TST sizes 1st or 2nd TST. Previous BCG
Table 109. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis). Previous BCG
Table 110. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). Previous BCG
Table 111. Number of TB cases and the TB rates for the different TST sizes 1st or 2nd TST. Contacts 0-10 yrs old
Table 112. TB risk according to the TST size from the 1st or 2nd TST (univariate analysis). Contacts 0-10 yrs old
Table 113. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). Contacts 0-10 yrs old
Table 114. Number of TB cases and the TB rates for the different TST sizes 1st or 2nd TST. Contacts >10 yrs old
Table 115. TB risk according to the TST size from the 1st or 2nd TST (univariate analysis). Contacts >10 yrs old
Table 116. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). Contacts >10 yrs old
Table 117. Number of TB cases and the TB rates for the different TST sizes 1st or 2nd TST. Canadian-born contacts
Table 118. TB risk according to the TST size from the 1st or 2nd TST (univariate analysis). Canadian-born contacts
Table 119. TB risk according to the TST size from the 1st or 2nd TST. Canadian-born contacts (multivariate analysis)
Table 120. Number of TB cases and the TB rates for the different TST sizes 1st or 2nd TST. Aboriginal contacts
Table 121. TB risk according to the TST size from the 1st or 2nd TST (univariate analysis). Aboriginal contacts
Table 122. TB risk according to the TST size from the 1st or 2nd TST. Aboriginal contacts (multivariate analysis)
Table 123. Number of TB cases and the TB rates for the different TST sizes 1st or 2nd TST. Foreign-born contacts
Table 124. TB risk according to the TST size from the 1st or 2nd TST (univariate analysis). Foreign-born contacts
Table 125. TB risk according to the TST size from the 1st or 2nd TST. Foreign-born contacts (multivariate analysis)
Table 126. Number of TB cases and the TB rates for the different TST sizes 1st or 2nd TST. Injection drug users
Table 127. Comparison of studies reporting TST size and incidence of TB
Table 128. Proposed cut-offs for a positive TST in contacts of TB cases

LIST OF FIGURES

Figure 1. Age distribution of contacts
Figure 2. Ethnicity of contacts
Figure 3. Type of contact
Figure 4. Gender of contacts
Figure 5. Source case AFB smear
Figure 6. Immunosuppressive conditions
Figure 7. Ecologic socioeconomic status
Figure 8. TST size (1st applied after contact)
Figure 9. TST size distribution (categorized)
Figure 10. TB cases classification
Figure 11. Pulmonary TB cases confirmed microbiologically
Figure 12. Pulmonary TB cases with positive smear
Figure 13. Pulmonary TB cases with positive culture
Figure 14. Survival curve. TB occurrence all contacts
Figure 15. Age distribution of TB cases
Figure 16. TB rate by age groups in contacts of TB cases (12-yr follow-up)
Figure 17. Survival curve. TB occurrence by age groups
Figure 18. Gender distribution of TB cases
Figure 19. Survival curve. TB occurrence by gender
Figure 20. LTBI treatment of TB cases
Figure 21. Survival curve. TB occurrence by LTBI treatment
Figure 22. LTBI treatment of TB cases (detailed)
Figure 23. Survival curve. TB occurrence by LTBI treatment (detailed)
Figure 24. TB cases. Number of TB cases exposed
Figure 25. Survival curve. TB occurrence by TB exposure
Figure 26. Ethnicity of TB cases
Figure 27. Survival curve. TB occurrence by ethnicity
Figure 28. TB cases. Type of contact
Figure 29. Survival curve. TB occurrence by closeness of contact
Figure 30. TB cases. Infectivity of source case
Figure 31. Survival curve. TB occurrence by smear results of TB source
Figure 32. TB cases. Immunosuppressed at the time of contact
Figure 33. TB cases. Immunosuppressive conditions at the time of contact
Figure 34. Survival curve. TB occurrence by immune status
Figure 35. Survival curve. TB occurrence by immunosuppressive conditions
Figure 36. TB cases. High-risk subjects
Figure 37. Survival curve. TB occurrence in high-risk subjects
Figure 38. TB cases. Distribution of TST size (1st applied)
Figure 39. TB cases. Distribution of TST size categorized (1st applied)
Figure 40. Survival curve. TB occurrence by TST size (1st TST)
Figure 41. TB cases. Distribution of TST size (maximum of 1st or 2nd TST)
Figure 42. TB cases. Distribution of TST size categorized (1st 2nd TST)
Figure 43. Survival curve. TB occurrence by TST size (1st 2nd TST)
Figure 44. TB cases. Distribution of BCG status
Figure 45. Survival curve. TB occurrence by BCG status
Figure 46. TB cases. Distribution of geographical socioeconomic status
Figure 47. Survival curve. TB occurrence by SE status (quintiles)
Figure 48. Survival curve. TB occurrence by SE status (reduced quintiles)
Figure 49. Survival curve. TB occurrence by previous TST response
Figure 50. Survival curve. TB occurrence by old TB in chest X-ray
Figure 51. TST size and risk of TB. Contacts with no LTBI treatment (1st TST)
Figure 52. TST size and risk of TB. Household contacts (1st TST)
Figure 53. TST size and risk of TB. Non-household contacts (1st TST)
Figure 54. TST size and risk of TB. Casual contacts (1st TST)
Figure 55. TST size and risk of TB. Non-immunosuppressed contacts (1st TST)
Figure 56. TST size and risk of TB. Immunosuppressed contacts (1st TST)
Figure 57. TST size and risk of TB. Contacts with no previous BCG (1st TST)
Figure 58. TST size and risk of TB. Contacts with previous BCG (1st TST)
Figure 59. TST size and risk of TB. Contacts 0-10 year-old (1st TST)
Figure 60. TST size and risk of TB. Contacts >10 years old (1st TST)
Figure 61. TST size and risk of TB. Canadian-born contacts (1st TST)
Figure 62. TST size and risk of TB. Aboriginal contacts (1st TST)
Figure 63. TST size and risk of TB. Foreign-born contacts (1st TST)
Figure 64. TST size and risk of TB. Contacts with no LTBI Rx (1st 2nd TST)
Figure 65. TST size and risk of TB. Household contacts (1st 2nd TST)
Figure 66. TST size and risk of TB. Non-household contacts (1st 2nd TST)
Figure 67. TST size and risk of TB. Casual contacts (1st 2nd TST)
Figure 68. TST size and risk of TB. Non-immunosuppressed contacts (1st 2nd TST)
Figure 69. TST size and risk of TB. Immunosuppressed contacts (1st 2nd TST)
Figure 70. TST size and risk of TB. Contacts with no previous BCG (1st 2nd TST)
Figure 71. TST size and risk of TB. Contacts with previous BCG (1st 2nd TST)
Figure 72. TST size and risk of TB. Contacts 0-10 years old (1st 2nd TST)
Figure 73. TST size and risk of TB. Contacts >10 years old (1st 2nd TST)
Figure 74. TST size and risk of TB. Canadian-born contacts (1st 2nd TST)
Figure 75. TST size and risk of TB. Aboriginal contacts (1st 2nd TST)
Figure 76. TST size and risk of TB. Foreign-born contacts (1st 2nd TST)

LIST OF ABBREVIATIONS

AFB: acid-fast bacilli.
AR: active registry.
BC: British Columbia.
BCCA: British Columbia Cancer Agency.
BCCDC: British Columbia Centre for Disease Control.
BCG: Bacille Calmette-Guérin.
BCLHD: British Columbia Linked Health Database.
CHSPR: Centre for Health Services and Policy Research.
CHTB: Country with high TB prevalence.
CRF: chronic renal failure.
DTB: Division of TB Control of British Columbia.
EMB: ethambutol.
HSP: hospital separations records of British Columbia.
INH: isoniazid.
LTBI: latent tuberculosis infection.
MOTT: mycobacteria other than MTB.
MSP: Medical Services Plan of British Columbia.
MTB: tubercle bacilli (M. tuberculosis, M. bovis or M. africanum).
PHM: Pharmacare databases of British Columbia.
PPD: purified protein derivative.
PPR: previous positive response to the TST.
PZA: pyrazinamide.
RM: rifampin.
SE status: socioeconomic status.
SM: streptomycin.
STD: sexually transmitted diseases.
TB: tuberculosis; unless specified, this term refers to the disease.
TST: tuberculin skin test.
VS: Vital Statistics of British Columbia.

LIST OF DEFINITIONS

Contact: Person exposed to an active TB case.
Active TB case: Patient with the diagnosis of active tuberculosis.
Source TB case: Active TB case that a contact was exposed to; it is also designated the infectious case.
Tuberculosis: Disease caused by M. tuberculosis or TB complex, diagnosed on microbiological, histopathological or clinical grounds.
Active Tuberculosis: Active disease caused by tubercle bacilli (M. tuberculosis, M. bovis or M. africanum), diagnosed on microbiological, histopathological or clinical grounds.
Tuberculosis infection: Infection caused by M. tuberculosis or TB complex, assessed through the tuberculin skin test.
Pulmonary TB: Active TB affecting the lung parenchyma; few cases with tuberculosis in the mediastinal lymph nodes were included in this category.
Pleural TB: Active TB diagnosed in the pleural space (in a pleural effusion).
Extra-pulmonary tuberculosis: Active TB located outside the thorax.
Latent tuberculosis infection (LTBI) treatment: Treatment aimed to prevent the development of tuberculosis in subjects deemed to be TB infected.
Preventive TB treatment: Treatment aimed to prevent the development of tuberculosis in contacts of active TB cases that are deemed to have become TB infected. In this study it is used interchangeably with LTBI treatment.
Tuberculin skin test: Test utilized to diagnose tuberculosis infection through the intracutaneous application of a specific quantity of purified protein derivative, with the subsequent assessment of the reaction to it, which is measured as the size of the skin induration.
Tuberculin Skin Test Size: The size of the induration in the skin after the application of the Tuberculin skin test.
Type of contact: It refers to the closeness of the contact with the active TB case; according to the definitions used in the Canadian Tuberculosis Standards, contacts can be: close household, close non-household and casual.
Close-household contacts: are those who live in the same household as the infectious case. Household contacts are considered by definition to share breathing space on a daily basis with the source case.
Close non-household contacts: are those who have regular, prolonged contact with the source case and share breathing space daily, but do not live in the same household. These include regular sexual partners and close friends.
Casual contacts: are others who spend time regularly but less frequently with the infectious case. These may include classmates, colleagues at work, or members of a club or team.

DEDICATION
To Gerry, my beloved wife, for her tireless support.

ACKNOWLEDGEMENTS
Dr. Moran has a Scholarship from the UASLP and the Programa de Mejoramiento del Profesorado, Secretaría de Educación Pública, Mexico, to accomplish a PhD in Health Care and Epidemiology at the Health Care and Epidemiology Department at UBC.
Dr. Moran is greatly indebted to Ms. Fay Hutton, data manager at the Division of TB Control in British Columbia, for providing the databases used in this study, as well as for her very efficient and timely help with all the corrections and quality control issues that emerged throughout the 2 years when the study was carried out.
The validation of much of the data, as well as the follow-up of the contacts, was possible thanks to the information from the British Columbia Linked Health database provided by the Centre for Health Services and Policy Research, and to the efficient support given by Ms. Denise Morettin.
Dr. Michael Rekart (Director) and Mr. Rob MacDougall (Surveillance Analyst), from the Division of Sexually Transmitted Diseases/AIDS Control at the British Columbia Centre for Disease Control, kindly provided the information regarding the HIV status of the study subjects.
Special gratitude to Dr. Marion, Dr. FitzGerald, Dr. Elwood and Dr. Patrick for their valuable help during the completion of this thesis.

HYPOTHESES
1) There are clinically important prognostic factors that may identify which contacts of active tuberculosis (TB) cases are at particular risk of developing TB.
2) Tuberculin skin test (TST) size is a useful predictor of active TB among contacts of active TB cases.

OBJECTIVES
A) To identify the risk factors that best predict the development of TB among contacts of active TB cases.
B) To identify the TST size cut-offs that best predict the development of TB in contacts for various high-risk groups.

BACKGROUND
Tuberculosis is the first disease that the World Health Organization (WHO) declared a global emergency and it is still an important global medical concern1. It has been estimated that one third of the world population (over 1.8 billion people) is infected with M. tuberculosis2,3. There were 3,671,973 new TB cases notified to the World Health Organization (WHO) in the year 2000 4, and 3,813,109 new cases in the year 2001 (WHO report 2003)5; according to the same report, the global incidence rate of TB is growing at approximately 0.4% per year5. There were 1.87 million people who died from TB in 1997, with a global case fatality rate of 23% 3. Tuberculosis is currently the 3rd leading cause of mortality in the world among adults 15-59 years old6.
The control and elimination of TB requires successful completion of treatment of TB cases, plus the identification and treatment of persons with latent tuberculosis infection (LTBI)7-11. Exposure to an individual with TB who coughs carries a substantial risk of acquiring the infection, particularly if the source case harbours a large number of bacilli in their sputum (such as in laryngeal/endobronchial TB, or those with cavitary disease), and there is a prolonged close contact with the source case, or the contact has altered immunity12-14. Between 15%-50% of household contacts of TB cases become infected15-17.
Those infected have an estimated 10% lifetime risk of developing TB if treatment for LTBI is not taken, with the highest risk in the first 2 years, when 50% of the cases occur17,18.
The TST is the only widely available diagnostic method of determining infection with the tubercle bacillus16,19-23. PPD-S is the antigen that serves as the international standard reference for the WHO, and 5 tuberculin units (TU) of PPD-S is the recommended amount to be administered in the TST16-19,22,24,25. Currently the diagnosis of TB infection is based on a positive reaction to TST, which depends on the size of the induration 48-72 hours after the PPD-S is injected18,25,26. However, there are factors that cause false-negative reactions (Table 1 in Appendix I). On the other hand, previous Bacillus Calmette-Guérin (BCG) vaccination, and exposure to mycobacteria other than M. tuberculosis (MOTT), are the main causes of false-positive TST reactions17,19,22. Unfortunately, the role of infection by MOTT is very difficult to assess, due to the cross reactivity to antigens of MOTT, BCG and M. tuberculosis21, the lack of standard antigens for the skin tests, and the geographical variation of the prevalence of MOTT28.
Previous BCG vaccination can cause TST indurations that range from 0 mm to 25 mm, depending on the age the individual was vaccinated and the time lapse between BCG vaccination and the TST29-32. Nonetheless, the TST reactions due to BCG vaccination are smaller than those due to natural infection33, and in populations with high prevalence of TB infection, or in close contacts of TB cases, the likelihood of true infection is greater than the likelihood of a false positive TST reaction; because of this, BCG vaccination status is ignored when interpreting the TST by some guidelines17,18,26, but not by all34.
At present, the recommended cut-offs to define a positive TST reaction, and therefore the need for LTBI treatment, are based on expert opinion35, taking into account the risk of being infected with M. tuberculosis (contacts of active TB cases and groups with high TB prevalence), and/or the risk for progressing to TB if infected (TB contacts and immunosuppressed patients) as shown in Tables 2 and 3 in Appendix I 2,17,18,26,34,36-38. However, there are variations among the guidelines from different countries18,26,34,36 and even within countries across time18,26,39-45. [Other factors that have also been associated with the risk of developing TB are age, gender and race13,46-48.]
Studies have reported different associations (positive linear, J-shaped and growth-shaped curves) between TST reaction size categories and the incidence of TB49-57, but they differ in the study populations, the PPD type and strength used, the TST size categories, the reading date, and in the criteria to diagnose TB. Furthermore, as none of these studies was designed to assess the TST as a prognostic test, no adjustment has been done for the variables that cause false-positive and false-negative TST reactions.
One study measured the incidence of TB in contacts of active cases during a 2-year follow-up period58. In this study the TST reaction size was assessed as a categorical variable, using categories 5 mm apart. It was found that the TST reaction size > 15 mm was the only significant predictor of TB disease in a Poisson model that also included: age < 35 years, being Asian born, household contact, previous BCG vaccination and isoniazid prophylaxis58. However, there were only 8 incident cases of TB in the 2 years and 10 TU PPD was used for the TST rather than the recommended 5 TU of PPD.
The only study designed to correlate TST size with TB incidence was done in BCG vaccinated school children in Singapore59.
Unfortunately, the investigators used 1 TU of PPD-RT 23 for the TST, they had as few as 3 TB cases in some of the TST size categories, there was no adjustment for confounding factors, no information regarding the follow-up or the criteria used to diagnose TB, and only Pearson χ² and Fisher exact tests were used.
On the other hand, even when there have been studies assessing the role of individual prognostic factors with the risk of developing tuberculosis, there have been few studies assessing prognostic factors of TB development in a multivariate analysis model60,61, and these have been performed in people in refugee immigration evaluation who were receiving LTBI treatment, not in contacts of TB cases. Hence, it is unknown when the exposure to TB occurred, and consequently the accurate estimation of risk of TB after exposure is not feasible.
There were no other articles found specifically addressing the main objectives of this thesis in the search performed in PubMed from 1966, nor in the references retrieved directly from the articles. This information is important due to the reasons presented in the rationale section below.

RATIONALE
An essential step in TB control and eventual elimination is the treatment of LTBI. The diagnosis of LTBI is based on a positive TST, which has been defined in TB guidelines from different countries mostly according to expert opinion. The recommended cut-offs for a positive TST have been selected using the approach for a diagnostic test, assuming a theoretical sensitivity and specificity that would be expected in different populations at risk (see Tables 2 and 3 in Appendix I). An important problem is that there is no gold standard for the diagnosis of LTBI; in addition, several factors, commonly found in contacts, cause false-positive and false-negative TST results.
To our knowledge, there has been no attempt to define the cut-offs for a positive TST using a prognostic approach, adjusting for the currently known prognostic factors, and especially for the groups in whom decisions regarding the treatment for LTBI are usually considered.
There have been studies reporting the association of isolated prognostic factors with the risk of developing TB in contacts of active cases, but to our knowledge no study has assessed the role of the currently reported prognostic factors for TB development in a multivariate analysis, i.e. when the role of the other prognostic factors is taken into consideration.
The proper identification of the main risk factors for developing TB, as well as the TST sizes that best predict the development of TB for different high-risk groups, may facilitate more focused efforts to deliver treatment of LTBI to the groups most at risk. Although all contacts with LTBI should receive treatment, many do not; with more precise risk assessment, clinicians and patients might be more inclined to accept this intervention.

METHODS
DESIGN. A retrospective cohort study of all the contacts of active TB cases recorded in the Division of TB Control (DTB) of British Columbia from 1990 to 2000 was carried out to identify the development of TB during a 12-year follow-up period (up to 2001). The TST results and the information regarding the prognostic factors of interest for the development of TB were obtained for all contacts from the DTB and from several provincial databases.
SOURCES OF DATA. The information for this research project was obtained from the databases provided from the following sources.
A) The Division of TB Control (DTB) database at the British Columbia Centre for Disease Control (BCCDC): TB control activities in British Columbia (BC) are a centralized activity and a registry of all active cases and their contacts is maintained in the DTB.
Although TB is a notifiable disease, the DTB is informed of cases from several sources, including the provincial laboratory located in the BCCDC and the Pharmacy Division of BCCDC, which is responsible for dispensing all anti-tuberculous medication throughout BC. These additional safeguards ensure that no case of active or suspect TB can be treated in BC without the DTB being informed. Currently, there are over 300 TB incident cases and approximately 3,000 contacts registered annually in BC. All the relevant annually recorded information of TB cases and their contacts at the DTB databases from 1990 until 2001 was used for this research project.
In addition, and with the aim of validating the information available in the DTB databases for the variables of interest for each of the contacts of TB cases (HIV status, co-morbidities, use of immunosuppressive drugs, vital status, etc.), the following databases were linked.
B) The British Columbia Linked Health database (BCLHD) is a data resource for applied health services and population health research. It is housed at and has been developed by the Centre for Health Services and Policy Research (CHSPR) at the University of British Columbia. The BCLHD is one of only a small number of data resources in the world where longitudinal research on an entire population can be carried out. The BCLHD initially brought together core data on health services utilization and vital events for the province's residents from 1985 onwards. Over the years, the BCLHD has expanded to include data on other factors that may be relevant to applied health services and population health research. The databases from the BCLHD used in this research project, as well as the information requested (for all the contacts recorded in the DTB between 1990 and 2000), were the following:
• Medical Services Plan (MSP): annual, fiscal-year files of services provided to MSP-covered individuals by practitioners, billed to MSP, and paid by MSP.
CHSPR provides researchers with MSP data organized by the date services were provided. Information is available from 1985/86 through 2000/01. There were 2 types of information requested from the MSP: A) The MSP services provided. B) The MSP enrollment dates and cancellation dates. The information requested from MSP was: date of the service provided and diagnosis of the diseases of interest for the study (ICD-9 codes), as well as province of patient and province of service.
• Hospital separations (HSP): contains information on separations (discharges, transfers, and deaths) of in-patients and day surgery patients from acute care hospitals in BC. The information is available from 1985/86 through 2000/01. The information requested from HSP was: the admission and separation dates, all the diagnoses (ICD-9 codes) available from that database (there are 16 potential diagnoses that can be entered), the diagnosis of the pre-admission co-morbidities codes, the residence indicator, and the province issuing the coverage. There was information available also as to whether the patient was an out-of-province or out-of-country transfer.
• British Columbia Cancer Agency (BCCA) incidence file: contains information on all cancers reported to the BCCA. The information is available from 1986 through 2000. The information requested from the BCCA was: the diagnosis date and the histological diagnosis.
• Pharmacare (PHM): calendar year (grouped by claim date) files of prescriptions paid by the BC PharmaCare plans (all those exceeding $800 CAN per year per family in most of the study years). Information is available from 1986 through 2001. The information requested from PHM was: the date of prescription and drug names (therapeutic codes) as well as the corresponding brand name, generic name and presentation.
• Vital Statistics (VS): deaths registered in the province of BC. Currently available from 1986 through 2000.
The information requested of VS was the date of death for all the contacts.
The enrollment and cancellation dates in the MSP were also requested for all of our study subjects to establish the latest date they were in the province and information was available for them (maximum follow-up), as well as to better determine the right censoring time, which also increases the power of the survival analysis (see the analysis section below).
In order to maintain confidentiality, a study ID code that was assigned to the contacts when their information was requested from CHSPR provided the means of identification.
CHSPR also provided a geographical socioeconomic status classification for each study subject. CHSPR sends researchers 'area' quintiles/deciles as opposed to national quintiles/deciles. The area quintiles/deciles CHSPR has for BC rank average "neighbourhood income per person equivalent" (IPPE) scores for each enumeration area (EA). IPPE is a household size-adjusted measure of household income. These EAs are then grouped by 'area' and classified into quintiles or deciles, as the case may be. The 'areas' within which EAs are grouped and classified are CMAs (census metropolitan areas) and CAs (census agglomerations). Roughly speaking, these include every major centre in the province whose core population is above 10,000. The rural remainder of the province (i.e. areas not included in a CMA or a CA) is kept as one additional 'area'. This type of classification is beneficial in that it corrects for the discrepancies in costs of living between different urban areas.
C) The Division of STD/AIDS Control of BC at the BCCDC records information regarding HIV status (confirmed by the provincial laboratory) of all the subjects tested in the province, as well as the date of the first positive HIV test from 1988 through 2002; however, the information contained in this database allows reliable personal identification only since 1995.
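The right-censoring rule described above (follow-up ends at the latest date for which information on a contact was available) can be illustrated with a short sketch. This is not the procedure used in the thesis: the function name `follow_up`, the use of days as the time unit, the fixed study end of 31 December 2001, and the rule that a TB diagnosis recorded after the censoring date is not counted are all assumptions made for illustration.

```python
from datetime import date

# Assumption: administrative end of follow-up (the thesis follows contacts up to 2001).
STUDY_END = date(2001, 12, 31)


def follow_up(contact_date, tb_date=None, death_date=None,
              msp_cancel_date=None, study_end=STUDY_END):
    """Return (days_at_risk, event) for one contact.

    event is 1 if TB was diagnosed on or before the censoring date, else 0.
    The censoring date is the earliest of death, MSP cancellation and study end.
    """
    candidates = [d for d in (death_date, msp_cancel_date, study_end) if d is not None]
    censor_date = min(candidates)
    if tb_date is not None and tb_date <= censor_date:
        return (tb_date - contact_date).days, 1
    return (censor_date - contact_date).days, 0


# Hypothetical contacts, for illustration only.
print(follow_up(date(1995, 1, 1), tb_date=date(1996, 1, 1)))
print(follow_up(date(1995, 1, 1), msp_cancel_date=date(1995, 1, 31)))
```

Pairs of this form (time at risk, event indicator) are the usual input to Kaplan-Meier and Cox regression routines.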
SELECTION CRITERIA:
INCLUSION CRITERION: All the contacts of active TB cases that had a TST (5 TU of PPD-S) performed with the Mantoux method, from 1990-2000.
EXCLUSION CRITERIA:
1- Contacts with a history of TB.
2- HIV infected contacts.
OUTCOME: Development of active TB (any type) between 1990-2001, according to any of the following definitions:
a) Smear positive and/or culture positive for tubercle bacilli (M. tuberculosis, M. bovis or M. africanum). Cases with only MOTT reported in culture results were eliminated from the analysis.
b) Histopathological diagnosis of TB.
c) Clinical diagnosis of active TB (sensitive criteria). In addition, specific criteria were utilized, namely clinical diagnosis of active TB and treatment with at least three first-line drugs: isoniazid, rifampin, streptomycin, pyrazinamide or ethambutol.
PROGNOSTIC FACTORS. The prognostic factors for developing TB assessed in this study were the following:
TST size: induration size measured in mm. It was analyzed as a categorical variable using the following categories: 0, 1-4, 5-9, 10-14 and ≥15 mm. The majority of contacts had one TST performed at the Division of TB Control at their first interview (if it occurred close to or after 8 weeks from the exposure to the TB case); however, if the 1st TST was applied soon after the exposure and was either negative or in the doubtful range, a 2nd TST was performed 6-8 weeks later. A separate analysis was performed for the size recorded for the 1st TST and for the maximum size of either the 1st or 2nd TST.
Age: calculated in years from the birth date.
Gender: Male: as recorded in the database. Female: as recorded in the database.
Latent TB infection treatment: Completed: completion of ≥6 months of treatment. Incomplete: contact did not complete 6 months of treatment. Not received: contact did not receive any treatment.
Type of contact: as appears in the database, which is assumed to follow the Canadian TB standards18 summarized below: Close household contacts: those who live in the same household as the TB source case. Close non-household contacts: those who have regular, prolonged contact with the source case, and share breathing space daily, but do not live in the same household. These include regular sexual partners and close friends. Casual contacts: others who spend time regularly but less frequently with the infectious case. These may include classmates, coworkers, or members of a club or a team.
Immunosuppression: Positive: the presence of diabetes mellitus, chronic renal failure, any malignancy, aplastic anemia, malnutrition (as reported in the databases, which depends on the clinical diagnosis of the physician or public health nurse), alcoholism, organ transplant (only major organs), or the use of corticosteroids or any immunosuppressive drugs. Negative: the absence of all of the above.
Previous BCG: Positive: contact stated having received BCG or a scar was present. Negative: contact stated not having received BCG and no scar was present. Uncertain: those who do not fit in the above categories or with missing information were considered uncertain.
Infectivity of the source TB case: according to the acid-fast bacilli (AFB) smear: smear negative or smear positive.
High-risk subjects: Positive: recent arrivals (<5 years) from high TB prevalence countries (TB prevalence >4%) according to the WHO list from 2002 4; injection drug users; residents and employees in high-risk settings (prisons, nursing homes, homeless shelters); and hospital personnel (physicians, nurses, respiratory therapists, laboratory technicians, x-ray technicians and medical students). Negative: the absence of all of the above.
Previous TST response: Positive: if a contact had a TST (recorded in the DTB database) before the date of contact with any response (induration >1 mm); or if a contact reported to have had a positive TST before the date of contact. Negative: none of the above.
Ethnicity: as recorded in the DTB database. The 3 following groups were assessed: a) Aboriginals; b) Canadian-born (non-Aboriginals); c) foreign-born.
Geographical (ecologic) socioeconomic status: it was assessed in quintiles, as reported by Statistics Canada62. This was obtained for our study subjects from BCLHD, by linking their postal code with the corresponding quintile. Briefly, the process used by Statistics Canada is the following: the neighbourhood income per person equivalent (IPPE) is a household size-adjusted measure of household income at the enumeration area (EA) level. Within each census metropolitan area, census agglomeration or provincial residual area, the EA average IPPE was used to rank all EAs, and then the population was divided into approximate fifths, thus creating community-specific income quintiles based on IPPE. This type of classification is beneficial in that it corrects for discrepancies in cost of living between different urban areas.
All these factors have been considered independently of clinical importance. The evaluation of their relative importance in the prognosis of TB development was determined in this study. Then, the assessment of the risk of developing TB for different TST sizes was performed adjusting for the prognostic factors that were clinically and statistically significant, as explained below.

STATISTICAL ANALYSIS
An initial descriptive analysis of all the variables of interest was carried out. To assess the risk of developing TB in a period up to 12 years, Cox regression models were applied, as explained below, to the time to event (TB development) of the contact cohorts, from 1990 to 2001.
In order to obtain a prognostic model that identifies all the important prognostic factors for TB development up to 12 years, a bivariate analysis of all the prognostic factors of interest and estimation of the risk of developing TB through the hazard ratio was carried out first. Afterwards, multivariate analyses were carried out and the following model building strategies were used in a sequential way:
1) Including all the prognostic factors simultaneously in the model.
2) Forward stepwise and backward stepwise selection methods, which were carried out separately (with likelihood ratio, p<0.05 for entry and p>0.10 for removal from the model).
3) Creation of the final model with a purposeful selection of variables: this final model included all the significant variables (p<0.05) from the 3 models described previously (all variables included, forward stepwise and backward stepwise), as well as the main variables of clinical interest, and confounders (variables that produce a change > 20% on other covariates when removed from the model).
In order to evaluate the adequacy of the models, the proportional hazards assumption was assessed by examining the Kaplan-Meier survival curves of the different levels of the categorical prognostic factors; in order for the proportional hazards assumption to hold, the curves of the categories within a variable should not cross each other. For continuous variables, such as age, the proportional hazards assumption was assessed by creating categories and comparing the Kaplan-Meier survival curves calculated for the corresponding categories. The proportional hazards assumption was also assessed graphically, by plotting the log(-log) of the survival function versus time (as well as versus log time) and assessing whether the lines are parallel between the levels of the categorical prognostic factors63-65.
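The graphical check of proportional hazards described above rests on two quantities: the Kaplan-Meier survival estimate and its log(-log) transform. The sketch below computes both in plain Python for one group; it is a minimal illustration, not the SPSS or S-Plus routines used in the thesis, and the function names and the toy data are hypothetical.

```python
import math


def kaplan_meier(times, events):
    """Kaplan-Meier estimate: list of (time, S(time)) at each distinct event time.

    times  -- follow-up time for each subject
    events -- 1 if the subject developed the outcome at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def log_minus_log(curve):
    """log(-log S(t)) for each point where 0 < S(t) < 1."""
    return [(t, math.log(-math.log(s))) for t, s in curve if 0 < s < 1]


# Toy data for one level of a categorical factor (hypothetical values).
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
print(log_minus_log(curve))
```

Computing `log_minus_log` separately for each level of a prognostic factor and plotting the results against time (or log time) gives the curves whose approximate parallelism is taken as support for the proportional hazards assumption.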
The test of the statistical significance of the interaction of each of the prognostic factors with the logarithm of time to event, as suggested by Hosmer and Lemeshow64, was assessed but not included, because most of the interaction terms, as expected, were statistically significant due to the large sample size in this study.
To assess the precision of the estimates of the model parameters, robust variance methods of calculating the standard errors and 95% confidence intervals were used66. Robust variance methods of assessing model precision do not require the distributional assumptions that are made to justify conventional variance estimation methods. For example, the data may, and commonly do, exhibit overdispersion; robust variance methods prevent inaccurate estimates that result from outliers or from non-normally distributed data.
In order to determine the best TST size cut-off for predicting TB development in the contacts that did not receive LTBI treatment, the TB rates were calculated for each of the TST size categories of interest in clinical practice (0-4 mm, 5-9 mm, 10-14 mm and ≥15 mm). Also, using the Cox model, the hazard ratios for developing TB were estimated for different TST size categories (5-9 mm, 10-14 mm and ≥15 mm; with 0-4 mm as the reference category), adjusted by the prognostic factors selected from the final model created previously (see above). This approach was applied for the following groups of contacts that did not receive LTBI treatment:
• Household, close non-household and casual contacts.
• Contacts with and without prior BCG vaccination.
• Immunosuppressed and non-immunosuppressed contacts.
• Children 0-10 yrs old; contacts > 10 yrs old (including adults).
• Canadian-born, foreign-born and Aboriginal contacts.
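The crude TB rates by TST size category mentioned above can be sketched as follows. The category boundaries come from the text; everything else is an assumption for illustration: the function names are hypothetical, the rate is taken as cases per 100,000 contacts (a cumulative proportion, not a person-time rate), and the example data are invented.

```python
def tst_category(size_mm):
    """Map an induration size in mm to the clinical categories used in the text."""
    if size_mm < 0:
        raise ValueError("induration size cannot be negative")
    if size_mm <= 4:
        return "0-4"
    if size_mm <= 9:
        return "5-9"
    if size_mm <= 14:
        return "10-14"
    return ">=15"


def tb_rates_per_100k(contacts):
    """Crude TB rate per 100,000 contacts for each TST size category.

    contacts -- iterable of (tst_size_mm, developed_tb) pairs
    """
    counts = {}
    for size_mm, developed_tb in contacts:
        category = tst_category(size_mm)
        n, cases = counts.get(category, (0, 0))
        counts[category] = (n + 1, cases + int(developed_tb))
    return {category: 100000 * cases / n for category, (n, cases) in counts.items()}


# Invented example: six contacts who did not receive LTBI treatment.
example = [(0, False), (3, False), (7, True), (7, False), (12, False), (15, True)]
print(tb_rates_per_100k(example))
```

These crude rates are unadjusted; the adjusted comparison in the thesis comes from the Cox model, with 0-4 mm as the reference category.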
The estimates of the hazard ratios for the TST size categories were performed using the 1st TST applied after contact with the TB case, and also the maximum size of the 1st or 2nd TST applied after exposure. The proportional hazards assumption was confirmed for all the Cox models through the assessment of the Kaplan-Meier survival curves and the log(-log) survival function versus time plots, as described previously.
The multivariate analyses for the models that include all the variables of interest were performed with both SPSS and S-Plus statistical software. There were some differences in the 2nd or 3rd decimals in the coefficients, but the results were practically the same for the hazard ratios. The tables provided for the results of the Cox models stem from SPSS, with the exception of the robust variance estimation, which stems from S-Plus.

ETHICS
The corresponding confidentiality oaths were signed at the DTB and BCLHD, and the following confidentiality protection mechanisms were set up:
Name codes and study ID were used during the data management, quality control of the data and analysis as the identification means, and the complete databases with the identifiable personal information were stored on a CD by 3 of the investigators, Onofre Moran (OM), Kevin Elwood (KE) and Mark FitzGerald (MF).
When the personal health number of a subject was not available, the BCLHD and VS databases required the subject's name and birth date for accurate retrieval of information, and these were provided. In no other instance was the personal information of the subjects available to anyone else.
All the analyses were performed by one researcher (OM) on his personal computer. This computer is not connected to any other one. A password was set up to prevent access to the information in case the computer was stolen. Also, a "firewall" was installed to prevent hackers from accessing the computer files.
No personal information that could allow subject identification will appear in any of the publications derived from this research, including this thesis.

DATABASES MANAGEMENT
Before presenting the analysis results, a description of the databases used, as well as the data management involved, is summarized.
DIVISION OF TB CONTROL (DTB) DATABASES AND FILES.
The technical procedures of the database management have been italicized for easy recognition. If desired, the reader can pass over them without missing essential information of the study.
The DTB provided 264 databases arranged in 3 main sets as explained below.
a) One for contacts of TB cases (OCR files).
b) One for active cases (OAR files).
c) The active registry files.
These 3 sets of databases were provided for each year from 1990 to 2001. The information contained in these databases included all the TB cases and contacts recorded up to, and including, 2001.
The OCR files and OAR files stem from the main TB registry (TMA) of the DTB, while the active registry files stem from the TB active registry (TMRA) of the DTB. The kind of information recorded in the TMA and TMRA, as originally appears in the DTB database, is replicated in Appendix II.
Main registry. Each of the other 2 databases (OCR and OAR) was obtained from the main registry. The OCR files contain the data pertaining to all the contacts of the pulmonary TB cases diagnosed in the DTB in the corresponding year. The OAR files contain the data of all the following patients:
- Active cases (all types of TB cases, including extra-pulmonary TB).
- Patients with Mycobacteria other than TB (MOTT).
- Patients who received treatment of LTBI.
Each of these databases contains 11 sub-databases, which will be denominated files:
The OCR files were referred to as: OCR01, OCR11, OCR12, OCR13, OCR14, OCR17, OCR19, OCR20, OCR21, OCR22 and OCR23.
The O A R files were referred as: OAR01, OAR11, OAR12, OAR13, OAR14, OAR17, OAR19, OAR20, OAR21, OAR22 and OAR23. A summary of the kind of information available in each of the files, as provided by the DTB, is presented in Table 1. The original data collection form utilized in the DTB and from which the information stems for each person, is replicated in Appendix II. Table 1. Information contained in the DTB databases File Information 01 Sociodemographics, medical history (including HIV, AIDS, steroids, and other immunosuppressive conditions). Previous TB, contact with TB and source name, type of contact. Date/cause of death. 11 Classes attended and compliance w/treatment. 12 Visits to the clinic, TB symptoms, duration of illness, other diseases, pregnancy status and current medications (other than TB). 13 Treatment: summary, dates (start and end), purpose, duration, adverse reactions. 14 Treatment: individual doses and frequencies. 17 Source name, contact name, date and type of contact. O A R 17 has the contacts provided by all the active cases, including MOTT and extra-pulmonary TB; OCR 17 has only the contacts of the pulmonary TB cases. 19 Tuberculin skin test and B C G related information. 20 Smear, Gaffky and culture results (including mycobacterium ID). Type of specimen. 21 Mycobacterium ID, and drug sensitivity 22 Lab results (including HIV). 23 Other health problems (Including HIV). TB classification, site, extent and activity. Mycobacterium ID. The active registry contains every registry entry for active cases or other clients on treatment. It includes active cases, inactive cases, TB suspects, patients who received treatment of LTBI and mycobacteria other than tuberculosis (MOTT) during 1990 through 2001, as known to the Division of TB Control up to April 15,2002. 
The active registry files (from 1990 to 2001) were transformed from their original text format into Excel, and from Excel into S-Plus and SPSS for later data management procedures.

PROCEDURES: the data management procedures performed on the OAR and OCR files (from the DTB) that were necessary for the planned analysis included:
• Transformation.
• Merging of files.
• Duplicate identification.

The summary of the technical procedures performed appears in italics, so that readers not interested in the technical procedures can identify and skip them. The main problems found and the solutions given, as well as some of the quality control procedures utilized, are summarized in Appendix III; the information contained there may be useful for other researchers using the same databases. In addition, transformation and duplicate identification were also required for all the other databases provided by BCLHD; these are briefly explained further on, with the data management procedures for the corresponding prognostic factors.

TRANSFORMATION: All the files were provided by the data manager at the DTB (Ms. Fay Hutton) in the original text format as OCR files, OAR files and an active registry file. These files were given in text format because the TB clinic (where all TB cases and their contacts are seen at the DTB) uses the database for clinical and administrative purposes. Hence, the first step was to convert all the files into Excel format in order to perform the required subsequent data management procedures. In total, 265 files were transformed from text format into Excel format: 132 OCR files (OCR01 to OCR23 from 1990 to 2001), 132 OAR files and 1 active registry file. In addition, all of these files were also transformed into SPSS and S-Plus format.
MERGING: Because the number of variables differs among the 132 contact files (OCR01 to OCR23, all years) and among the 132 active case files (OAR01 to OAR23, all years), depending on the type of information contained in each file, the first step was to merge each contact file across the years (e.g. file OCR01 was merged from 1990 to 2001); the same was done for the active case files. Thereafter (and after deleting duplicates as specified below), the variables of interest for the contacts and for the active cases were taken from the previously merged files (without duplicates) and merged to form one final file for each group. Hence, for contacts the final file contained information related to the prognostic factors of interest as well as to their selection criteria; for the active cases, the final file contained information related to their selection criteria.

In order to maintain confidentiality, the contacts and TB cases were identified by their name codes, which are alphanumeric codes assigned to each individual who attends the DTB. These name codes are unique for each individual, since they are computer generated using 4 letters from the first and last names, 2 digits from the birth date and 2 digits representing the number of persons in the database who share those characteristics.

The number of variables and of subjects (which includes duplicates) contained in each of the DTB files used in this research project, after merging, is presented in Table 2.

Table 2. Number of subjects and variables in the DTB files 1990-2001 (duplicates included)

File             Number of subjects   Number of variables
OCR01                   49,624               108
OCR12                   73,299                36
OCR13                   54,284                41
OCR14                    5,479                60
OCR17                   49,746                14
OCR19                  131,689                25
OCR20                   56,812                17
OCR21                      409                23
OCR22                   53,453                28
OCR23                   25,490                18
OAR01                   15,794               105
OAR13                   15,795                38
OAR14                   24,252                57
OAR17                   59,156                11
OAR20                   59,440                14
OAR21                    5,280                20
OAR22                   31,732                25
OAR23                   29,905                15
ACTIVE REGISTRY         16,325                71

Since the information for the year 2001 from the DTB was not available until after the linkage with the BCLHD had been completed, the information provided below concerns the DTB contacts up to the year 2000. However, the follow-up for all the contacts was done up to the year 2001 in the DTB and in the BCLHD. After merging all the contact databases across the years, 45,463 names of contacts were found in the DTB from 1990 up to the year 2000. Subsequently the number of unique and duplicate contacts was determined.

DUPLICATES: In order to find duplicates among the 45,463 contact names recorded in the DTB database from 1990 until 2000, logical functions in Excel were used. The number of duplicated contacts is shown in Table 3. The contacts were sorted by name code, birth date and year, in that order, and duplicates were identified using the "IF" and "AND" logical functions, as well as filters in Excel (year refers to the year the contact was recorded in the database at the DTB).

Table 3. Duplicated contacts in the original TB database

Description                  Number of subjects   Percent
Total number of contacts           42,596           93.7
Duplicated names                    2,867            6.3
Total in database                  45,463          100

Among the 2,867 duplicated name codes, 574 were within the same year and 2,293 in different years. However, the same contact may be duplicated more than once, in the same year and/or in different years. In fact, there were 482 duplicates that occurred more than once.
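The sort-and-compare procedure described above (sorting by name code, birth date and year, then testing each row against the previous one with "IF" and "AND") can be sketched in Python. This is only an illustration of the logic, not the Excel workbook actually used; the record layout, the function name and the sample name codes are hypothetical.

```python
from typing import NamedTuple


class ContactRecord(NamedTuple):
    name_code: str   # hypothetical sample codes, not real subjects
    birth_date: str  # ISO date string so that text sorting matches date order
    year: int        # year the contact was recorded at the DTB


def flag_duplicates(records):
    """Sort by name code, birth date and year, then flag every row that
    repeats the name code AND birth date of the row directly above it."""
    ordered = sorted(records, key=lambda r: (r.name_code, r.birth_date, r.year))
    flagged = []
    previous = None
    for current in ordered:
        is_duplicate = (
            previous is not None
            and current.name_code == previous.name_code
            and current.birth_date == previous.birth_date
        )
        # A duplicate is "same year" when the year also matches the row above.
        same_year = is_duplicate and current.year == previous.year
        flagged.append((current, is_duplicate, same_year))
        previous = current
    return flagged


sample = [
    ContactRecord("ABCD5001", "1950-01-01", 1991),
    ContactRecord("ABCD5001", "1950-01-01", 1993),
    ContactRecord("WXYZ6001", "1960-05-05", 1990),
    ContactRecord("ABCD5001", "1950-01-01", 1991),
]
result = flag_duplicates(sample)
n_duplicates = sum(1 for _, dup, _ in result if dup)
n_unique = len(result) - n_duplicates
```

With this sample, one subject appears three times (twice in the same year, once in a different year), so two rows are flagged as duplicates and two unique contacts remain.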
These 482 duplicates include subjects that appear 3 or more times in the database, some of them duplicated in the same year and also in different years. Thus, the number of possible combinations of these duplicates is large. For analysis purposes and in order to provide useful information, the first step (again using logical functions in Excel) was to create 5 main categories of duplicates: unique names; one duplicate in the same year; >1 duplicate in the same year; one duplicate in different years; and >1 duplicate in different years. The number and percent of duplicates in each category is presented in Table 4.

Table 4. Number of duplicates by frequency and time of occurrence

Code  Description                        Number   Percent
0     Unique names                       42,596     93.7
1     1 duplicate in the same year          558      1.2
2     >1 duplicate in the same year          16      0.04
3     1 duplicate in different years      2,040      4.5
4     >1 duplicate in different years       253      0.6
      Total number of names              45,463    100.0

It should be noted that unique names correspond to unique individuals, and that one individual could fall in more than one duplicate category. For purposes of analysis and in order to have only 1 code per contact, if a contact appeared as a duplicate and had more than 1 code, the maximum code number was assigned. The idea was to evaluate in the analysis stage whether being exposed more than once increases the risk of subsequently developing TB. Among the 42,596 unique contacts, 40,211 appeared only once in the DTB database and 2,385 appeared more than once. The numbers of contacts that appeared once and more than once in the database are presented in Table 5 by frequency and period of appearance. When using the aggregate function of SPSS, the results were the same regardless of whether the cohort year (first) or the duplicate code (maximum) was chosen, even if the birth date was included as a break variable.

Table 5. Contacts according to the frequency and period of appearance

Code  Description                                                  Number   Percent
0     Contacts with no duplicates                                  40,211     94.4
1     Contacts that appeared twice in the same year                   375      0.9
2     Contacts that appeared 3 or more times in the same year           6      0.01
3     Contacts that appeared twice in different years               1,787      4.2
4     Contacts that appeared 3 or more times in different years       217      0.5
      Total                                                        42,596    100.0

Appearing more than once in the database does not necessarily mean that a subject was in contact with several TB cases, nor does it allow the number of different TB sources to be determined.

NUMBER OF TB EXPOSURES: In order to determine the number of different TB cases a contact was exposed to, the TB source name(s) and the contact date were identified for all the contacts. Using S-Plus, the SPSS files containing the 42,596 unique contact name codes aggregated by cohort year (first) and by duplicate code (maximum) were imported and merged. Hence, the resulting database had the name code, birth date, first cohort year in which the contact appeared in the database and the duplicate code assigned above. From this file, the 2,385 contacts who appeared more than once were selected (file entitled "duplicated" contacts only) and merged with the file containing all 49,746 source names of the contacts, TB number, contact name code, contact date and cohort year (entitled "TB source's of duplicate contacts names and TB numbers"). The merging was done in such a way that the TB names and TB numbers of all the sources appeared in the file (i.e. matching by specified columns, namely the name codes, and including non-matching rows of dataset 2 in the new dataset, entitled "duplicated.contacts.with.their.TB.sources.data"). This dataset contained 49,746 contacts with their respective sources as well as all the sources of all the contacts of the DTB database. It was subsequently exported to Excel and sorted by name code, cohort year and contact date.
Then, only the contacts that appeared more than once (among the 49,746 contacts with their sources) were selected. To do this, logical functions were used again (to test whether they had a duplicate code >0) and the contacts were then sorted in descending order to delete the rest. This left 5,281 contacts with their respective TB sources. At this point, using logical functions, the contacts who appeared more than once and who had the same TB source were identified. The cases found with apparently the same source were reviewed individually and the proper re-categorizations made (see Table 6 below). The unique contacts with their cohort year (first) and duplicate code (max) were selected. The process was performed in Excel and also in SPSS using the aggregate function (using duplicate code or year as the break variable produced the same results).

By using the merging function in S-Plus, logical functions in Excel and the aggregate function in SPSS, 252 subjects were re-categorized as shown in Table 6. Most of them had been recorded in 2 different consecutive years in the DTB database because they had attended the TB clinic during those years; however, they had been exposed to only 1 source case and there was one and the same contact date. Others appeared to have been in contact with more than 2 TB cases, but on many occasions it was the same case. These reasons explain all the re-categorizations to a lower category. There were only 4 subjects who were moved to a higher category, from 3 to 4. This is because, even though the subjects had in fact been recorded only twice (in 2 different years), they had been exposed to 3 different TB cases in those years.

Table 6. Cases re-categorized (by reviewing them individually)

Category change   Number of subjects
3 to 0                   217
4 to 3                    16
3 to 1                     7
1 to 0                     4
3 to 4                     4
3 to 2                     2
4 to 2                     1
4 to 0                     1
Total                    252

There were 222 contacts who were re-categorized from categories 3, 4 and 1 to category 0 (they were actually exposed to the same TB source once), and who thus could no longer be counted among the duplicated contacts. Hence, there were finally 40,433 contacts exposed to only one TB case and 2,163 contacts who had been exposed to more than one TB case. Table 7 shows all the contacts classified according to the number of different TB cases they were exposed to and the period of exposure.

Table 7. Contacts according to the number of TB cases exposed

Description                                            Number   Percent
Contacts exposed to 1 TB case                          40,433     94.9
Contacts exposed to 2 TB cases in the same year           378      0.9
Contacts exposed to >2 TB cases in the same year            9      0.02
Contacts exposed to 2 TB cases in different years       1,574      3.7
Contacts exposed to >2 TB cases in different years        202      0.5
Total                                                  42,596    100.0

Ninety-five percent of all contacts had been exposed to only one TB case. Among the 2,163 contacts exposed to more than one case, 1,952 were exposed to 2 cases; they account for 90% of the contacts exposed to more than one case, and 4.6% of all contacts. Because of the small number of contacts exposed to >2 cases in the same year, this category was collapsed with the category of contacts exposed to 2 cases in the same year, as shown in Table 8.

Table 8. Contacts according to the number of TB cases exposed (collapsed)

Description                                                        Number   Percent
Contacts exposed to 1 case (code=0)                                40,433     94.9
Contacts exposed to 2 or more cases in the same year (code=1)         387      0.9
Contacts exposed to 2 cases in different years (code=2)             1,574      3.7
Contacts exposed to >2 cases in different years (code=3)              202      0.5
Total                                                              42,596    100.0

Among the 2,163 contacts exposed to more than one case, there were 1,776 who were exposed to 2 or more cases in different years, which represents 82% of the contacts exposed to more than 1 case and 4.2% of all contacts. Regarding the maximum number of exposures, we found the following: one subject was exposed to 9 different cases, one subject was exposed to 8 different cases and one was exposed to 7 cases. It was found that 5 of the cases these 3 contacts were exposed to were the same for all three. In addition there were 6 subjects exposed to 6 cases, 18 exposed to 5 cases, 42 exposed to 4 cases and the remaining 134 subjects were exposed to 3 cases at different times.

Contacts searchable at the BCLHD: Of the 42,596 unique contacts recorded in the DTB from 1990 up to 2000, 5,814 had a personal health number (PHN) and were sent to BCLHD for linkage to all the databases: MSP, HSP, BCCA, PHM and VS. The PHN allows a deterministic search of individuals at the BCLHD. The other 36,782 did not have a PHN recorded in the DTB database and were sent to the data analyst at the BC Ministry of Health Services (BCMH) to obtain their PHNs. This was done through a probabilistic match using the contact's name, birth date, gender and postal code. We also provided a study ID that was assigned to the contacts as the means of identification when the information requested from the BCLHD was provided. Those for whom a PHN was found in the MSP were sent directly by the BCMH to BCLHD for linkage.
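The two-stage search described above (a deterministic lookup when a PHN is available, otherwise a match on name, birth date, gender and postal code) can be sketched as follows. The source does not describe the matching algorithm used by the Ministry, so the simple count of agreeing fields and the threshold of 3 are assumptions made purely for illustration; real probabilistic linkage typically uses weighted agreement scores. All field names and sample values are hypothetical.

```python
MATCH_FIELDS = ("name", "birth_date", "gender", "postal_code")


def link_contact(contact, registry_rows, threshold=3):
    """Return (matched_row, method) for one contact.

    Stage 1: exact lookup on PHN (deterministic).
    Stage 2: crude stand-in for a probabilistic match, counting how many of
    the four identifying fields agree (the threshold is an assumed value).
    """
    phn = contact.get("phn")
    if phn:
        for row in registry_rows:
            if row.get("phn") == phn:
                return row, "deterministic"

    best_row, best_score = None, 0
    for row in registry_rows:
        score = sum(
            1 for field in MATCH_FIELDS
            if contact.get(field) and contact.get(field) == row.get(field)
        )
        if score > best_score:
            best_row, best_score = row, score

    if best_row is not None and best_score >= threshold:
        return best_row, "probabilistic"
    return None, "unmatched"


registry = [
    {"phn": "9000000001", "name": "DOE JANE", "birth_date": "1950-01-01",
     "gender": "F", "postal_code": "V0V0V0"},
    {"phn": "9000000002", "name": "ROE JOHN", "birth_date": "1960-05-05",
     "gender": "M", "postal_code": "V1V1V1"},
]

with_phn = {"phn": "9000000002", "name": "ROE J"}
without_phn = {"name": "DOE JANE", "birth_date": "1950-01-01", "gender": "F",
               "postal_code": "V9V9V9"}
weak = {"name": "SMITH ANN", "gender": "F"}

match_a = link_contact(with_phn, registry)
match_b = link_contact(without_phn, registry)
match_c = link_contact(weak, registry)
```

A match on fewer identifying fields is less certain than a PHN lookup, which is why a score-based search can occasionally return a person who was never in the original contact list.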
Of the 42,596 contacts, a total of 34,359 unique contacts had a valid PHN at the MSP and were searched in the BCLHD databases between 1990 and 2001. In order to determine this, the file entitled "cohort" sent by BCLHD (containing only the study ID and cohort year for all the subjects searched at BCLHD) was first assessed for duplicates using logical functions in Excel (the resulting file was entitled "cohort - duplicates search.xls"); no duplicates were found. The file "cohort" was subsequently merged in S-Plus with the file entitled "all name codes and study IDs recorded in DTB 1990.2000" (after ruling out the presence of duplicate study IDs using logical functions in an Excel format copy) in order to obtain the name codes of the contacts searched at BCLHD. This was done because all the contacts recorded in the DTB files are identified by their name code, while those at the BCLHD are identified only by their study ID. Thereafter, the new file containing name codes and study IDs (entitled "name code and study ID of contacts searched at BCLHD") was merged with the file entitled "final contacts" from the DTB (which has the name code and cohort year of the unique contacts recorded in the DTB between 1990 and 2000). The resulting file, which therefore identifies the unique contacts searched and not searched at the BCLHD, was entitled "unique contacts searched and not searched at BCLHD". The study IDs of the contacts searched are in the Excel file entitled "cohort", sent by BCLHD. This file had 34,365 subjects, but 6 of them did not have information in the DTB databases. This could be explained by the probabilistic, rather than deterministic, search performed to obtain the PHN from MSP for those contacts who did not have a PHN in the DTB databases.

The number of contacts recorded in the DTB that were searched in the BCLHD (those who had a valid PHN) is shown in Table 9.
The contacts searched in the BCLHD databases include only those recorded in the DTB from 1990 to 2000, because the contacts for the year 2001 were not yet available when the BCLHD search was requested. The information for the contacts recorded in the DTB in 2001 was obtained during the summer of 2003. The information from the BCLHD databases was not requested for this cohort of contacts, because of the time previously needed to obtain it (about 10 months). Hence, the information regarding immunosuppressive conditions and vital status was validated only for the contacts recorded in the DTB between 1990 and 2000.

Table 9. Number of unique contacts in DTB databases searched in BCLHD by period

Recorded period at the DTB   Unique contacts from DTB sent to BCLHD   Unique contacts searched at BCLHD*
1990-2000                                  42,596                                  34,359

*The search in the BCLHD databases included the information available up to the year 2001.

In total, 80.7% of the contacts recorded in the DTB between 1990 and the year 2000 had a PHN (found in the DTB or MSP) that permitted searching for them in the BCLHD databases.

CONTACTS FOUND AT THE BCLHD

The next step was to determine how many of the BCLHD searchable contacts were actually found in the BCLHD databases. In order to do this, each of the Excel databases provided by BCLHD (BCCA, VS, PHM, MSP and HSP) was transformed into SPSS. Then the duplicates were deleted in each database using the aggregate function of SPSS. For these databases the aggregate function may not reliably produce unique contacts; for instance, in the Pharmacare databases a contact may remain duplicated if the date of prescription is used as the aggregate variable, since the same subject could have several distinct dates; the same could occur if the prescription number was used.
To prevent error, all the SPSS aggregated databases were re-transformed into Excel and, after sorting each database by study ID, duplicates were searched for using the logical function "IF". The following (final) databases provided by BCLHD with unique contact study IDs were saved in SPSS: "BCCA-study ID aggregated by ICD9-code", "deaths aggregated by date", "MSP data aggregated by first date of service", "PHM-study ID aggregated by drug number", "HSP study ID aggregated by Admission date (first)" and "RPB-study ID aggregated by enrollment date". Thereafter, all these databases were transformed into S-Plus (using the data importing option of S-Plus), and the study IDs were appended (using the Append function) into one single column, which resulted in 46,880 study IDs. A variable named "found in BCLHD" was added to this file and was filled in with the number 1. Then this database was transformed into SPSS and Excel using the Export Data function of S-Plus (in a file entitled "BCLHD study IDs all") in order to delete the duplicate study IDs in SPSS (and search for them in Excel). The deletion of duplicates was done by applying the aggregate function of SPSS and using the "found in BCLHD" variable as the aggregate variable. The resulting SPSS file (entitled "unique BCLHD study IDs") had 34,173 unique contact study IDs. In order to confirm this number of unique study IDs, the search for duplicates was also done in Excel (in the file entitled "BCLHD study IDs all w-duplicate search"), obtaining the same number of unique study IDs. However, as demonstrated by subsequent merges, among these 34,173 subjects there were 4 who were not in the DTB databases (this could be explained by an error due to a probabilistic rather than a deterministic search) and 1 more who was repeated (although with a slightly different name code). Thus, the total number of unique contacts found in the BCLHD databases was 34,168.
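The append-then-deduplicate step can be sketched in Python as follows. This is an illustration of the logic only, not the S-Plus/SPSS procedure actually used; the function name and the toy study IDs are hypothetical, and the final lines simply restate the subtraction reported in the text.

```python
def append_and_deduplicate(databases):
    """Append the study IDs of every database into one column, then reduce
    the column to unique study IDs (the role played by the SPSS aggregate)."""
    appended = [study_id for ids in databases.values() for study_id in ids]
    unique_ids = set(appended)
    return appended, unique_ids


# Toy data: the same study ID can occur in several databases, and more than
# once within a database (e.g. several prescriptions for one subject).
toy = {
    "MSP": [1, 2, 3, 4],
    "HSP": [2, 4],
    "PHM": [3, 3, 3],
    "VS": [4],
}
appended, unique_ids = append_and_deduplicate(toy)

# Counts reported in the text: 34,173 unique study IDs, of which 4 were not
# in the DTB databases and 1 was a repeated subject.
unique_found = 34173 - 4 - 1
```

Reducing to unique study IDs directly avoids the pitfall noted above, where aggregating on a date or prescription number can leave several rows for the same subject.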
Among the 34,359 contacts searched in the BCLHD databases, 34,168 were found (99.4% of those searched, and 80.2% of all the contacts recorded in the DTB) and 191 were not found. Hence, among the 42,596 contacts who were recorded in the DTB databases (between 1990 and 2000) there were 8,428 (19.8%) who were not found in the BCLHD databases. The difference between the subjects recorded in the DTB and those found in the BCLHD databases is due to subjects who were not found to have a valid PHN in the MSP (in a probabilistic search), who were not enrolled in MSP, or who did not require MSP services in BC. The number of unique contacts recorded in the DTB databases that were searchable (with valid PHN) and found in the BCLHD databases, for the period of interest for the analysis, is summarized in Table 10.

Table 10. Number of unique contacts from the DTB searched and found at BCLHD

Recorded period at DTB   Unique contacts from DTB sent to BCLHD   Unique contacts searched at BCLHD*   Unique contacts found at BCLHD*
1990-2000                              42,596                                  34,359                              34,168

The databases provided by BCLHD contained the following information regarding the contacts:
• MSP-enrollment: contained 34,282 enrollment dates (and cancellation dates when appropriate) between 1965 and 2001. These dates allow us to infer until when a contact was still living in the province and also to determine the individual censoring time.
• MSP-services provided: had 93,810 records of services provided to the contacts (including date, diagnosis and province of residence) between 1990 and 2001. All of the contacts in this database were residents of BC.
• Hospital separations: had 5,368 records of admissions to a hospital (as well as the discharge date, diagnosis and province of residence) between 1990 and 2000. All of the contacts in this database were residents of BC except for 3 who resided outside the province. For 6 contacts this information was missing.
• Pharmacare databases: there were 8,411 records of prescriptions (drug numbers and prescription dates) of immunosuppressive drugs between 1990 and 2001.
• Vital statistics: reported that 1,004 of the contacts had died (with the dates of death) between 1990 and 2000.
• BCCA: contained 747 records of malignancies developed (with the corresponding diagnosis dates) among the contacts between 1990 and 2000.

The number of unique contacts in each of the databases provided by BCLHD was determined (using the techniques explained above). These numbers are presented in Table 11.

Table 11. Number of unique contacts found in the BCLHD databases

BCLHD database          Number of contacts
MSP-enrollment                34,173
MSP-services provided          7,202
Hospital separations           2,005
Pharmacare databases           1,789
Vital statistics               1,004
BCCA                             707

In addition, the Division of STD/AIDS Control of BC (STD) provided the HIV status (confirmed by the provincial laboratory) of all the subjects tested in the province, as well as the date of the first positive HIV test. The DTB contacts were searched in the STD database using probabilistic matching by name, birth date and gender. Among the 42,596 contacts (from the DTB), there were 14 contacts reported as possible HIV+. This number could be an underestimate, as this database has contained reliable personal identifiers only since 1995. The information provided for dates before 1995 is based on a probabilistic match using the information available in that database, which in many cases consisted of only one of the names of the subject.

VARIABLES OF INTEREST

Once the unique contacts found in the different databases had been identified, the next step was to identify and extract the information needed for the selection criteria variables, as well as for the prognostic variables of interest for our study. This process was started with the databases from the DTB.
The information regarding the variables of interest was spread across 10 of the 11 databases obtained from the DTB (OCR files), as well as across the different BCLHD databases: Medical Services Plan (MSP), Hospital Separations (HSP), British Columbia Cancer Agency (BCCA) and Pharmacare (PHM). Each variable of interest was obtained from the corresponding databases and added to the previous ones to create the databases named "final database X", where X represents the number of each new database created with the addition of each new variable. Unless otherwise specified, the data management was performed in SPSS (some procedures, such as merging, were routinely performed in S-Plus). The database management involved in the selection of each variable of interest (selection and prognostic variables) is described below. The variables of interest are described for the contacts recorded in the DTB between 1990 and 2000; however, the information regarding the diagnosis of TB (as well as most of the information obtained from the BCLHD and STD databases) extends up to the year 2001.

Age. Since the DTB databases do not contain the age of the contacts, the birth date was used to calculate the age. The only other date that was complete for every contact was the contact date from the file OCR17, and thus it was used to calculate the age of all contacts. The calculation of age was performed in SPSS, where the difference between two dates is expressed in seconds, using the formula [(contact date - birth date)/(60*60*24*365.25)]. The name code, age, birth date, contact date and cohort year were selected from the OCR17 file to create a new database. Among those contacts found in the BCLHD databases, there were 65 unique contacts with erroneous or missing birth dates.
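The age formula above divides a date difference expressed in seconds by the number of seconds in an average year (60*60*24*365.25). A minimal Python equivalent is sketched below; the function names are hypothetical, and the plausibility limits used to flag suspicious birth dates (0 to 110 years) are an assumption for illustration, not a rule taken from the thesis.

```python
from datetime import datetime

SECONDS_PER_YEAR = 60 * 60 * 24 * 365.25


def age_at_contact(birth_date, contact_date):
    """Age in years at the contact date: seconds elapsed / seconds per year."""
    return (contact_date - birth_date).total_seconds() / SECONDS_PER_YEAR


def is_plausible_age(age, lower=0.0, upper=110.0):
    """Flag ages that suggest an erroneous birth date (assumed limits)."""
    return lower <= age <= upper


age = age_at_contact(datetime(1960, 1, 1), datetime(1990, 1, 1))
bad_age = age_at_contact(datetime(1995, 6, 1), datetime(1990, 1, 1))
```

A negative or extreme age, as in the second call, is the kind of value that would prompt the individual re-checking of birth and contact dates.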
After re-checking birth dates and contact dates individually (manually, including the date the source case became positive when in doubt, and checking birth dates from different databases for unusual or extreme ages), there were no contacts with erroneous or missing birth dates among those found in the BCLHD databases. Afterwards, this file was imported into S-Plus and merged with the file containing the unique name codes and study IDs for all the contacts at the DTB (entitled "unique contacts searched and not searched at BCLHD"). The resulting file (to which all the prognostic factors would eventually be added) was entitled "final database". This database was merged with the one entitled "unique BCLHD study IDs" to include the variable that defines which contacts were found in the BCLHD databases; the resulting file was entitled "final database 1". Unfortunately, this database contained 2 variables entitled "cohort year" (one from the Excel file and another from the S-Plus file), with slightly different information. After a unique and corrected cohort year was created (with only the first cohort year for duplicate contacts), the database was entitled "final database 2".

Gender. The variable gender was available in the file OCR01. The gender and the corresponding name code were extracted from this file. To delete duplicates with the aggregate function of SPSS a numeric variable is also needed, and the variable cohort year (the year when the contact was recorded in the DTB) was used for this purpose. With these 3 variables a new file was created (entitled "contacts gender"). After using the aggregate function, with cohort year as the aggregate variable, gender information was available for 34,070 (99.7%) of the 34,168 unique contacts found in the BCLHD; this new file was entitled "unique contacts gender".
This file was then transformed into S-Plus format (using the import data function of S-Plus) and merged with the previously created file "final database 2". The resulting file was entitled "final database 3".

HIV. There are 3 databases from the DTB (OCR01, OCR22 and OCR23) that contain information about the HIV and AIDS status of the contacts. This information is provided by the contacts themselves to the DTB nurses (OCR22 also has HIV laboratory information); the nurses inquire about this and all the other information contained in the DTB files the first time the contacts attend the TB clinic. The HIV and AIDS information was extracted from the 3 databases in SPSS and the resulting files were then transformed into S-Plus. In S-Plus the databases OCR01 and OCR22 were appended directly, since the variables containing the HIV and AIDS information are named as such. The HIV information contained in the database OCR23 could be under any of 3 variables named other health problems 1, 2 and 3. In order to determine the HIV/AIDS status from this database, it was first transformed into Excel and, using the logical function "OR", a single variable was created for the HIV status and another for the AIDS status. This file was then transformed into SPSS and it was verified that all contacts with AIDS were also recorded as HIV+. The OCR23 file was then transformed again into S-Plus to append it to files OCR01 and OCR22. The database containing the HIV status, AIDS status, cohort year when the contact was recorded and name code was entitled "HIV from all DTB databases (OCR01, OCR22 and OCR23)". The database was transformed again into SPSS in order to delete the duplicates. In order to do this, the HIV status was recoded from a nominal to a numerical value. To prevent errors, a frequency table was printed out first and all the labels corresponding to HIV (HIV, Reacted and AIDS in their different formats) were recoded as HIV.
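The "OR" step for OCR23 and the recoding of label variants can be sketched as follows. The three field names mirror the "other health problems 1, 2 and 3" variables, but their exact spelling, the upper-casing used to normalise the labels and the function names are assumptions for illustration; the actual label formats in the DTB files are not listed in the text.

```python
HEALTH_FIELDS = (
    "other_health_problem_1",
    "other_health_problem_2",
    "other_health_problem_3",
)

# Labels the text says were all recoded as HIV; normalisation is an assumption.
HIV_LABELS = {"HIV", "REACTED", "AIDS"}
AIDS_LABELS = {"AIDS"}


def any_field_in(record, labels):
    """Logical OR across the three 'other health problems' variables."""
    return any(
        (record.get(field) or "").strip().upper() in labels
        for field in HEALTH_FIELDS
    )


def hiv_and_aids_status(record):
    """Return numeric codes (hiv, aids): 1 = positive, 0 = not recorded."""
    hiv = 1 if any_field_in(record, HIV_LABELS) else 0
    aids = 1 if any_field_in(record, AIDS_LABELS) else 0
    return hiv, aids


rec_hiv = {"other_health_problem_1": "diabetes", "other_health_problem_2": " hiv "}
rec_aids = {"other_health_problem_3": "AIDS"}
rec_none = {"other_health_problem_1": "asthma"}

status_hiv = hiv_and_aids_status(rec_hiv)
status_aids = hiv_and_aids_status(rec_aids)
status_none = hiv_and_aids_status(rec_none)
```

Because AIDS is included among the HIV labels, every record coded as AIDS is also coded as HIV positive, which matches the consistency check described above.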
In addition, the total numbers of HIV subjects in nominal and numerical format were also compared. The deletion of duplicates was done in 2 ways: by aggregating the name code by the cohort year (first) and HIV (max), where the coding for HIV was 0 for negative and 1 for positive, and by aggregating the name code by HIV (max). The results were identical: 107 contacts in the DTB files were HIV positive and 14 of them had AIDS. The database containing the unique name codes and HIV status of all the contacts from the DTB was entitled "unique HIV all DTB databases (namecode aggr by cohort year first and HIV code max)".

The HIV information contained in several of the BCLHD databases (MSP, HSP and PHM) was obtained next. The information for HIV, AIDS and all the immunosuppressive conditions of interest in the individual BCLHD databases was extracted as summarized for the variable "immunosuppression", explained further on. For HIV, the ICD-9 codes corresponding to HIV and AIDS were used to select the HIV status of the contacts in SPSS. Since the BCLHD databases contain only the study ID of the contacts as the means to identify them, the databases were transformed into S-Plus and merged with the DTB database "namecode study ID and cohort year all contacts" in order to obtain the name code of each contact, as well as the year when the contact was recorded in the DTB (cohort year). Then all the BCLHD databases (MSP, HSP, PHM and BCCA) were appended into a single database ("immunosupr. diseases dates and contact dates all BCLHD"), which contained all the information on immunosuppressive diseases (including HIV and AIDS) as well as the diagnosis dates and contact dates. From this database the HIV contacts were selected and a new database was created ("HIV all BCLHD databases").
Then this database was transformed again into SPSS to delete duplicates and obtain the total unique contacts with HIV (in the database "unique HIV all BCLHD databases (namecode and HIV aggr by cohort yr first)"). There were 176 contacts with HIV, 48 of them with AIDS. From the database provided by the Division of STD/AIDS Control (STD), there were 14 contacts reported as possible HIV: 4 of them were not recorded in the DTB databases (BESK4101, GERJ7101, MORB5705 and MORJ5403) and 3 were not recorded in the BCLHD databases (COY6201, MENJ6201 and MORJ5403), so only 1 of them was not recorded as HIV in either the DTB or the BCLHD databases (MORJ5403). This contact was recorded in the DTB database as HIV negative in 1998 (the same year that HIV was diagnosed in the Division of STD/AIDS Control). The information from STD is laboratory-based only, so the AIDS status of these 2 contacts is unknown. In order to obtain the HIV status for all the contacts, the databases containing the unique HIV contacts from the DTB databases ("unique HIV all DTB databases (namecode aggr by cohort year first and HIV code max)"), the BCLHD databases ("unique HIV all BCLHD databases (namecode and HIV aggr by cohort yr first)") and the STD database were transformed into S-Plus and appended into a single database. Once appended, the database was transformed back into SPSS (database entitled "HIV all databases (DTB, BCLHD and STD)") to delete duplicates. The deletion of duplicates was done in two ways: by aggregating the name code by the cohort year (first) and HIV (max), and by aggregating the name code by HIV (max). The results were identical. The resulting database, entitled "unique HIV all databases (DTB, BCLHD and STD), (namecode aggr by cohort yr first and HIV max)", was transformed into S-Plus and merged with "final database 3" to create "final database 4".
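The duplicate deletion "in two ways" described above can be illustrated with a minimal Python sketch. It is not the SPSS "aggregate" procedure itself: the function names and the record layout `(name_code, cohort_year, hiv)` are assumptions, and the earliest cohort year stands in for the SPSS "first" option.

```python
def dedupe_max(records):
    """One entry per name code, keeping the maximum HIV code (0 = negative, 1 = positive)."""
    out = {}
    for name_code, _cohort_year, hiv in records:
        out[name_code] = max(hiv, out.get(name_code, 0))
    return out


def dedupe_year_and_max(records):
    """Second route: keep the earliest cohort year and the maximum HIV code per name code."""
    out = {}
    for name_code, cohort_year, hiv in records:
        year, code = out.get(name_code, (cohort_year, 0))
        out[name_code] = (min(year, cohort_year), max(code, hiv))
    # Drop the year again so that both routes can be compared directly.
    return {name_code: code for name_code, (_year, code) in out.items()}
```

If both routes return the same mapping, the two aggregations agree on which name codes are HIV positive, which is the kind of cross-check reported in the text.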
To ensure that no duplicates were present, a double check was done in an Excel-transformed file, and the logical function "IF" was used to compare the contacts' name codes after they had been sorted in ascending order [results in file "unique HIV all databases (DTB, BCLHD and STD) duplicate check"]. In addition, to obtain the HIV status of contacts the first year they were recorded (cohort year) in the DTB (since there is no diagnosis date in this database), the following procedures were done: in the database "HIV from all DTB databases (OCR01, OCR22 and OCR23)" the name code and cohort year were aggregated in a first step by HIV code (max). In a second step, in the resulting database (entitled "HIV all DTB databases (namecode and cohort yr aggr by HIV code max)"), the name code was aggregated by HIV code (first) after sorting the database by the contact year in ascending order. The database created, which contains the unique HIV status of the contacts at the moment they were first recorded in the DTB, was entitled "unique HIV status first cohort year all DTB databases". The HIV contact from the STD database who appeared as HIV negative in the DTB databases (and who had the diagnosis the same year he/she was recorded) was corrected in the DTB at this point. There were 108 contacts who were HIV positive at the time they were recorded (for the first time) in the DTB databases. Similarly, in order to select only the contacts who were HIV positive anytime before the contact date and up to 6 months after (which will be referred to as "the same year of contact"), the time difference between the HIV diagnosis and the contact date was calculated for the BCLHD databases already combined ("HIV all BCLHD databases").
Those with a time difference <183 days were selected in the database entitled "HIV all BCLHD databases Dx same contact year", and duplicates were deleted by aggregating the name code and the HIV status by cohort year (minimum); the procedure was also done by aggregating name code and HIV status by first contact date. The results were the same: there were 63 contacts in the BCLHD databases diagnosed with HIV the same year of contact. The database was entitled "unique HIV all BCLHD databases Dx same contact year (namecode and HIV aggr by cohort yr min)". Next, the database with the HIV status when first recorded in the DTB ("unique HIV status first cohort year all DTB databases") and the corresponding one from the BCLHD databases ("unique HIV all BCLHD databases Dx same contact year (namecode and HIV aggr by cohort yr min)") were transformed into S-Plus in order to append them into a single database, which was transformed back into SPSS to delete duplicates (database entitled "HIV Dx. same year all databases (DTB, BCLHD and STD)"). The resulting database with no duplicates, obtained by aggregating the name code by HIV status (maximum), was entitled "unique HIV Dx. same year all databases (DTB, BCLHD and STD)". This database was converted into S-Plus and merged with "final database 6" (created to include the "immunosuppression" variables, as explained below) to generate "final database 7". In addition, the AIDS status for the BCLHD was obtained in several ways with the same results: in summary, the AIDS condition for the MSP and HSP databases was determined by selecting the corresponding ICD-9 codes, and then duplicates were deleted.
The resulting SPSS database ("unique AIDS from HSP and MSP"), with 48 contacts with AIDS, was appended in S-Plus to the corresponding one obtained from the DTB databases OCR01, OCR12 and OCR23 (entitled "unique AIDS all DTB databases (namecode aggr by cohort year first and AIDS code max)"), with 14 contacts with AIDS. Then they were transformed back into SPSS to delete duplicates and the resulting database was entitled "unique AIDS all databases DTB, MSP and HSP". This database was then transformed into S-Plus again and merged with "final database 7" to create "final database 7a". Since there is also HIV and AIDS information available in the active registry, subjects with HIV and AIDS were identified separately in the corresponding databases "unique HIV from active registry" and "unique AIDS from active registry". These databases were then merged with "final database 7a" to create "final database 7b" and "final database 7c". These databases identified 14 contacts with HIV and 1 contact with AIDS. All of them had already been identified in "final database 7a".

Previous BCG. The information regarding previous BCG was available in 2 variables (previous BCG and BCG scar) in file OCR19. The BCG history was provided directly by the contact to the nurse during the first interview. The nurse is supposed to verify whether the contact has a BCG scar. The first step was to obtain these variables with their respective name code in a new database. Since the information available for both variables was yes, no, uncertain, not asked or missing, the next step was to create only 3 categories of BCG status, as planned in the protocol: positive, negative or uncertain. This was achieved using the "transform (recode)" function of SPSS to collapse the uncertain, not asked and missing values into one category named "uncertain"; "yes" was transformed into "positive" and "no" into "negative".
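The three-category recoding just described can be sketched as follows. This is a minimal Python illustration rather than the SPSS "transform (recode)" step itself; the function name and the exact spelling of the raw answers are assumptions.

```python
# Only "yes" and "no" map to a definite status; every other answer
# (uncertain, not asked, missing) collapses into "uncertain".
RECODE = {"yes": "positive", "no": "negative"}


def recode_bcg(value):
    """Collapse a raw BCG answer into positive / negative / uncertain."""
    if value is None:  # missing value
        return "uncertain"
    return RECODE.get(value.strip().lower(), "uncertain")
```

The same function would be applied to both variables (previous BCG and BCG scar) before they are combined into a single BCG status.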
Duplicates were deleted using the "aggregate" function of SPSS, after transforming both variables (previous BCG and BCG scar) into numeric format and using them as the "aggregate variable"; the results were saved in the file entitled "unique contacts BCG data". Next, the BCG status for every contact was obtained from the information in the 2 variables (previous BCG and BCG scar). This was achieved by using the "select cases" function of SPSS and the logical functions in it (e.g. select if prev BCG = 1 OR BCG scar = 1, where 1 = positive); after this, the "filtered" cases assigned the number 1 (which represents the selected cases) were given a code representing the category of interest (negative = 0, positive = 1 and uncertain = 2) by transforming the "filter_$" variable into a different variable (using the function "recode"). The process was performed in 3 steps, with the procedures mentioned above (from selecting the cases up to recoding the corresponding filtered variable) repeated for each of the 3 categories of interest. Once the BCG status was determined for all the contacts and summarized in a single variable, the resulting database (entitled "unique BCG status") was transformed into S-Plus. This database was then merged with "final database 4" and the resulting database was named "final database 5".

Immunosuppression. This variable was complex because, in order to define whether a contact was immunosuppressed, the information on the diseases and medications of interest had to be obtained from several databases from both the DTB and the BCLHD. The BCLHD databases used to obtain information were the Medical Services Plan (MSP), Hospital Separations (HSP), British Columbia Cancer Agency (BCCA) and Pharmacare (PHM).
In addition, the diagnoses contained in these databases were in numeric or alphanumeric format, which caused difficulties, since within a given database some diagnoses and drugs might be described with alphanumeric codes while others had only numeric codes. Besides, the decimal point in the MSP and HSP databases could vary between the 3rd and 4th position of a number, and some diseases could be represented by codes that varied in the number of zeros assigned (for instance, diabetes mellitus was supposed to have the numeric code 250, but there were also 2500 and 25000 numeric codes representing the same disease). These numerous inconsistencies in the coding required a manual selection and codification of the diseases of interest. Thus, only a summary of the most important aspects of the data management is presented; however, quality control checks (double checking, use of 2 different approaches for the same purpose and matching of the results) were done to ensure the validity of the results.

DTB data management: the information regarding immunosuppressive diseases and drugs was contained in 3 different files: OCR01, OCR12 and OCR23. In OCR01 there was a specific column for each variable (renal failure, diabetes mellitus, HIV, AIDS, renal dialysis, renal transplant, malnutrition, G-I bypass, gastrectomy, immunosuppressive therapy, steroids and malignancy) and the information contained there was yes, no, uncertain or not asked. The first step was to convert this information into the corresponding variable name, if present (e.g. for the column of diabetes mellitus, "yes" was converted into DM and the rest of the answers into "none", since we were only interested in being specific about the diagnosis). In the OCR12 file there were 3 columns for "other diseases" and 3 columns for "present medications"; in the OCR23 file there were 3 columns for "other health problems", numbered 1 to 3.
The diseases and drugs of interest were selected and coded for each of the files. To ensure that no disease was missed, a frequency table of all the diseases contained in each file was created first, and then all the different names entered for a disease were used to identify the disease of interest, which was then coded to facilitate its handling. Next, the 3 files (OCR01, OCR12 and OCR23) were transformed into S-Plus, so that the name codes and the variables containing the immunosuppressive conditions (diseases and drugs) of interest could be appended into a single file entitled "diseases of interest all DTB databases (OCR01 OCR12 OCR23)". This file was transformed back into SPSS to delete duplicates and the resulting database was entitled "unique immunosup diseases all DTB (namecode aggr by imsup dis max)". This database was later merged with the corresponding one from the BCLHD, as explained below.

BCLHD data management: the first step was to match the study IDs of the contacts with the corresponding name codes for each of the BCLHD databases containing the immunosuppressive conditions (diagnoses and drugs) of interest: MSP, HSP, BCCA and PHM. Next, for each database, using the ICD-9 codes and therapeutic codes provided, the diagnoses of the diseases of interest, as well as the drugs of interest, were identified and coded, and their respective diagnosis dates and prescription dates were also obtained in the corresponding SPSS files entitled "MSP Dx and dates", "HSP Dx and dates", "BCCA ICD9 name code and cohort year" and "PHM drugs name code and cohort year". Next, these databases were transformed into S-Plus in order to merge them all. The resulting database was then merged with a file containing the contact dates (obtained from OCR17 and entitled "name code contact date and cohort year all contacts").
The resulting database, which now had the name codes, contact dates, cohort year, diagnosis and prescription drug codes as well as their respective dates, was entitled "immunosupr. diseases dates and contact dates all BCLHD" and transformed into SPSS format. Next, the immunosuppressive diseases and drugs that were present or received during the period close to the TB exposure had to be defined. In order to do this, the time difference (in days) between the diagnosis/prescription dates and contact dates was calculated. Contacts were considered to be immunosuppressed when they were in contact with the source case if the immunosuppressive condition was present (recorded) in any of the databases within the following time periods:

For renal failure, diabetes mellitus, HIV and renal transplant: <182 days from the contact date.
For malnutrition and malignancy: between -182 and 182 days from the contact date.
For corticosteroids and antineoplastic drugs (prescription dates): between -60 and 60 days from the contact date.

The different time periods were selected on the assumption that they would identify whether the contact was immunosuppressed at the time of TB exposure. Renal failure, DM, HIV and renal transplant are conditions that could have an immunosuppressive effect any time after diagnosis; however, when the disease is diagnosed after the exposure to a case, because there is usually a lapse between the time these diseases become present and the time the diagnosis is made, 6 months was considered a reasonable time window to identify whether an individual was already immunosuppressed at the time of contact. On the other hand, malnutrition, malignancy and immunosuppressive drugs have an immunosuppressive effect only while the disease is present or the drug is being taken.
The time window considered reasonable to identify whether a subject was immunosuppressed at the time of contact was having malnutrition or malignancy diagnosed within 6 months before or after becoming a contact. Similarly, subjects were considered immunosuppressed at the time of contact if immunosuppressive drugs were prescribed within 60 days before or after the contact date. Based on these times, the immunosuppressive conditions (diseases and drugs) considered to be present "the same year as TB contact" were selected and coded into a newly created variable, "imdis6". This variable contained the codes of the diseases or drugs that were present within the time frame of interest. In addition, another variable was created containing the corresponding names to allow the identification of the specific immunosuppressive disease or type of drug. The variables of interest thus included: diabetes mellitus, malnutrition, HIV, corticosteroids, chronic renal failure, antineoplastic drugs, major organ transplant, aplastic anemia and malignancy. They were coded in ascending order, so that the higher the code, the more serious the immunosuppressive condition was considered (which also allowed the easy identification and selection of the worst condition performed later). Other conditions recorded and coded were drug abuse, carcinoma in situ, alcoholism and Cushing's syndrome. The next step was to delete duplicates, which was done in such a way that if a subject had more than 1 immunosuppressive condition (disease or drug), only the most serious condition would be selected, by assigning higher codes to more serious diseases; thus, the diseases were ranked from most to least serious as follows: malignancy, aplastic anemia, organ transplant, antineoplastic drugs, chronic renal failure, use of corticosteroids, HIV, malnutrition and diabetes mellitus.
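The time-window rule for deciding whether a condition counted as present at the time of contact can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function and set names are hypothetical, the time difference is taken as diagnosis (or prescription) date minus contact date, and the two-sided windows are treated as inclusive, which the text does not specify.

```python
from datetime import date

# Grouping of conditions by the time window stated in the text.
ANY_TIME_BEFORE = {"renal failure", "diabetes mellitus", "HIV", "renal transplant"}
WITHIN_6_MONTHS = {"malnutrition", "malignancy"}
WITHIN_60_DAYS = {"corticosteroids", "antineoplastic drugs"}


def present_at_contact(condition, event_date, contact_date):
    """True if the condition counts as present at the time of TB contact."""
    # Negative values mean the diagnosis or prescription preceded the contact.
    diff = (event_date - contact_date).days
    if condition in ANY_TIME_BEFORE:
        return diff < 182          # any time before, or <182 days after
    if condition in WITHIN_6_MONTHS:
        return -182 <= diff <= 182  # inclusive bounds are an assumption
    if condition in WITHIN_60_DAYS:
        return -60 <= diff <= 60    # inclusive bounds are an assumption
    raise ValueError(f"no time window defined for {condition!r}")
```

For example, diabetes diagnosed five years before the contact date counts as present, whereas a malignancy diagnosed five years before does not.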
Thereafter, duplicates were deleted by using the "aggregate" function of SPSS, selecting the maximum value of the "aggregate variable", which was "imdis6". Only one immunosuppressive condition (the most serious) was selected per subject because of the interest in identifying which conditions were better predictors (risk factors) for developing TB. However, contacts who had more than one immunosuppressive disease were also identified, by first aggregating the name code and the variable "imdis6" by the contact date (first) only in those contacts with an immunosuppressive condition ("imdis6" > 0), which resulted in the database "unique contacts with several immunosupr. conditions"; from this database, the name code was aggregated by "imdis" max, resulting in the database named "unique contacts with immunosupr. conditions". The variable identifying the number of immunosuppressive conditions the same year of contact was named "numimcon", which was added later on to the final databases (added to "final database 14" to produce "final database 14a"). There were 1,835 contacts with 1 or more immunosuppressive conditions (IC): 9 contacts with 4 IC, 58 contacts with 3 IC, 314 contacts with 2 IC and 1,455 contacts with 1 IC. These included drug abuse, carcinoma in situ, alcoholism and Cushing's syndrome. Using the "immunosupr. diseases dates and contact dates all BCLHD" database (with the information for 130,232 name codes), the name code was aggregated by the most serious immunosuppressive condition present the same year of contact (by "imdis6", choosing the function "max" in SPSS); the procedure was also performed by aggregating name code by cohort year (minimum) and "imdis6" (max). The results were the same. The resulting database was entitled "unique immunosuppressive diseases same year of contact all BCLHD (name code aggr by imdis6m max)" and contained the information of interest for all the contacts.
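Selecting the most serious condition per contact, and counting how many distinct conditions each contact had, can be sketched as follows. This is a minimal Python illustration, not the SPSS procedure: the numeric codes are hypothetical (only the ordering is taken from the text), and the function name and record layout are assumptions.

```python
# Hypothetical codes (assumption): higher code = more serious condition,
# following the ranking given in the text.
SEVERITY = {
    "diabetes mellitus": 1,
    "malnutrition": 2,
    "HIV": 3,
    "corticosteroids": 4,
    "chronic renal failure": 5,
    "antineoplastic drugs": 6,
    "organ transplant": 7,
    "aplastic anemia": 8,
    "malignancy": 9,
}


def summarise_conditions(rows):
    """rows: (name_code, condition). Returns {name_code: (most_serious, n_conditions)}."""
    by_contact = {}
    for name_code, condition in rows:
        # A set drops repeated records of the same condition for a contact.
        by_contact.setdefault(name_code, set()).add(condition)
    return {
        name_code: (max(conds, key=SEVERITY.__getitem__), len(conds))
        for name_code, conds in by_contact.items()
    }
```

The first element of each tuple plays the role of the maximum of "imdis6" and the second plays the role of "numimcon".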
This database was converted into S-Plus and appended to the one from the DTB containing only the most serious immunosuppressive conditions (highest code) present when the contact was recorded for the first time (first cohort year) in the DTB, entitled "unique immunosup diseases all DTB with all HIV data first cohort year (namecode aggr by imdis first)". This database was in turn obtained from the database "diseases of interest all DTB databases (OCR01 OCR12 OCR23) with all HIV data" by aggregating in a first step the name code and cohort year by immunosuppressive disease (max), and then, after sorting in ascending order by cohort year, aggregating in a second step the name code by immunosuppressive disease (first). The resulting database, entitled "immunosuppressive diseases same year all databases (BCLHD and DTB)", was converted to SPSS again to delete duplicates. Also, at this point, the contact from the STD database who was recorded as HIV negative in the DTB databases was corrected to HIV positive. The database with the unique immunosuppressive diseases for both databases (DTB and BCLHD) occurring the first year contacts were recorded was entitled "unique immunosupr dis same yr all databases (DTB and BCLHD)". The proper deletion of duplicates was confirmed by transforming the database into Excel format and using the logical function "IF" after sorting the contacts' name codes in ascending order. This database was converted back into S-Plus and merged with "final database 5", which produced "final database 6". Then, in SPSS, the immunosuppressed status (as positive or negative) was determined from the previously created variable that had the information regarding the immunosuppressive diseases present the same year the contact was recorded in the DTB, or the same year of contact if in the BCLHD databases.

Type of contact. The information regarding contact type was provided in file OCR17 in a coded form with numbers or letters.
These codes represented the type of contact as required for the research project (close household contact, close non-household contact and casual contact), but also places of contact (school, work, hospitals, etc.). The first step after obtaining the relevant information from the OCR17 file (name code, type and date of contact, and cohort year) was to recode all the provided information into only the 3 codes of interest mentioned (or unknown). This was done in two ways: a) by selecting only the types of contact of interest mentioned above and disregarding the places of contact (the new variable was named "contype2"); and b) by selecting the type of contact and the place of contact, and arbitrarily deciding the most likely type of contact according to the place of contact as follows: for hospital and laboratory workers a "close non-household" type of contact was assigned, while for school and work contacts a "casual" type of contact was assigned (the new variable was named "contype"). The resulting database was named "type of contact". Then the duplicates were deleted by aggregating name code and cohort year by contact type, i.e. "contype" or "contype2", and the corresponding databases were entitled "type of contact (namecode and cohort year aggr by contype max)" and "type of contact2 (namecode and cohort year aggr by contype max)". Next, in order to obtain the (closest) type of contact that occurred when exposed for the first time to the case, after sorting in ascending order by cohort year, the name code was aggregated by contact type (max) in the databases "unique type of contact" and "unique type of contact2". The procedure was also performed using the contact date instead of the cohort year, with the same results. The 2 databases ("unique type of contact" and "unique type of contact2") were then converted into S-Plus and merged with "final database 7a" to create "final database 8".
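The recoding of places of contact into contact types (approach b) and the first aggregation step (closest contact type per name code and cohort year) can be sketched as follows. This is a minimal Python illustration only: the numeric closeness codes, the label spellings and the function name are assumptions, since the actual codes used in OCR17 are not given in the text.

```python
# Hypothetical codes (assumption): a higher code means a closer contact,
# so that taking the maximum selects the closest type.
CLOSENESS = {"unknown": 0, "casual": 1, "close non-household": 2, "close household": 3}

# Place of contact mapped to the most likely type, as stated for approach b).
PLACE_TO_TYPE = {
    "hospital": "close non-household",
    "laboratory": "close non-household",
    "school": "casual",
    "work": "casual",
}


def closest_type_per_year(rows):
    """rows: (name_code, cohort_year, label), where label is a contact type or a place.

    Returns {(name_code, cohort_year): closeness code of the closest contact}.
    """
    best = {}
    for name_code, year, label in rows:
        code = CLOSENESS[PLACE_TO_TYPE.get(label, label)]
        key = (name_code, year)
        best[key] = max(code, best.get(key, 0))
    return best
```

Dropping the `PLACE_TO_TYPE` lookup and treating places as "unknown" would correspond to approach a), in which places of contact were disregarded.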
Later on, in the "final contacts for analysis BCLHD all" and "final contacts for analysis BCLHD all", the household contacts from the variable "contype" that did not match those in the variable "contype2" (due to using the maximum contact type in the aggregating procedure) were corrected manually (9 subjects). The same was done for the close non-household contacts (29 subjects).

Ethnicity. Ethnicity was classified as follows: a) Aboriginal people; b) Canadian-born; c) foreign-born. The information regarding ethnicity was obtained from file OCR01, where country of birth and ethnicity are specified, and for Aboriginals the Indian community is specified as well. The information in the DTB database appears under 3 variables with the corresponding names. First, a frequency table of the 3 variables was created in order to determine the different possible names used for the information of interest. Because it was found that most (but not all) of the contacts belonging to an Indian community were specified as "first nations" under "ethnicity", and that there were many contacts recorded as "first nations" who did not have an Indian community specified, the selection of being an Aboriginal person was based on being classified as "first nations" (registered or non-registered) in "ethnicity". A contact identified as Aboriginal was coded as 1 in a new variable named "aborigin". Being Canadian-born or not was determined based on the country of birth. Then, if Canadian-born but not Aboriginal, a contact was considered Canadian-born and coded as 1 in a new variable named "canborn"; and if neither Aboriginal nor Canadian-born, a contact was considered foreign-born. After being identified, the Aboriginal contacts were selected ("filtered") in SPSS with code = 1. Then this "filtered" variable was recoded into a different variable entitled "ethnicit" (coded as 1).
Next, the contacts who were Canadian-born but not Aboriginal were selected in a similar fashion, recoded into the "ethnicit" variable and coded as 0. Then, the same procedure was repeated for those contacts not born in Canada (those coded under the variable "canborn"), who were coded as 2. This way, the new variable "ethnicit" contained all the contacts identified as required in the database entitled "ethnicity". Next, duplicates were deleted by aggregating name code by ethnicity (first) and the new database was named "unique contacts ethnicity", which was transformed into S-Plus and merged with "final database 8" to create "final database 9".

Latent TB infection (LTBI) treatment. The information regarding LTBI treatment was available from 2 files: OCR13 and OCR14. The file OCR13 has information on the purpose of treatment, the drugs received and the dates when the entire treatment started and ended. The file OCR14 contains information regarding the specific drugs and doses received by the contacts, as well as the dates when every treatment was started and ended. From file OCR13 the treatment duration in days was calculated for all the contacts by subtracting the treatment start date from the treatment end date, and the results were divided by (60 x 60 x 24) in SPSS, which expresses date differences in seconds (information in variable "treatdur"). There were 215 subjects who did not have either date; for them the treatment duration was obtained directly from an existing variable, "numbdays", which contains this information (treatment duration in days). The information was not taken directly from this variable from the beginning because, when checking the validity of the information contained in this variable, some erroneous calculations were found. For the few subjects with very long (>900 days) or negative values, the results were re-checked in file OCR14 and corrected accordingly.
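The calculation of treatment duration and the flagging of implausible values can be sketched as follows. This is a minimal Python illustration rather than the SPSS computation; the function names are hypothetical, and the division by 60 x 60 x 24 is reproduced explicitly to show the conversion from seconds to days.

```python
from datetime import datetime

SECONDS_PER_DAY = 60 * 60 * 24


def treatment_days(start, end):
    """Treatment duration in days (end of treatment minus start of treatment)."""
    return (end - start).total_seconds() / SECONDS_PER_DAY


def needs_recheck(days):
    """Flag negative or very long (>900 days) durations for manual re-checking."""
    return days < 0 or days > 900
```

Durations flagged by `needs_recheck` correspond to the records that were re-checked against file OCR14.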
A cut-off of 900 days was chosen because contacts with these values were the outliers among those who received >365 days. From file OCR13 the contacts who had received LTBI treatment were identified if the treatment purpose was either "chemoprophylaxis" or "chemo to prophy", regardless of the treatment received. Also included were contacts in whom the treatment purpose was "chemotherapy" but who received only INH (6 contacts), or INH and rifampin when the reason for this was stated as "inactive TB" (only 1 contact). All the other contacts in whom the treatment purpose was either "chemotherapy" or "prophy to chemo" were excluded from this group, since these contacts eventually received full treatment for TB. Similarly, contacts who were stated as "chemoprophylaxis" but received >3 drugs and in whom the treatment purpose was recorded as "suspect pulmonary TB" were deemed to have received full treatment for TB and were excluded (1 contact). Among the contacts considered to have received LTBI treatment by the above criteria, there were 271 who had a treatment duration of 0 days; they were re-classified as contacts who did not receive LTBI treatment. The variable containing the information on whether a contact received LTBI treatment (and if so, coded as 1) was named "treated". The database containing all this information was entitled "treatment OCR13". Because the interest was only in the contacts who had received LTBI treatment or who had received no treatment at all, all the other contacts who had received chemotherapy rather than LTBI treatment, i.e. those without treatment duration >0 (in "treatdur") and LTBI treatment received (coded as 1 in "treated"), were deleted (431 unique contacts), and the new database was entitled "LTBI treatment".
Since we were interested in those who completed LTBI treatment (>6 months) versus those who did not, based on LTBI treatment duration, contacts whose treatment was >182 days (based on the variable "treatdur") were selected and coded as 1 in the new variable named "ltbi6mon". Contacts who had received ≤182 days of treatment, or no treatment at all, were then selected and coded as 0 in the variable "ltbi6mon". In addition, LTBI treatment duration was categorized as 0, <3, <6, <9 and >9 months (0, 1-91, 92-182, 183-273 and ≥274 days). To do this, the corresponding LTBI treatment durations were categorized into 5 strata, coded from 0 to 4 according to the above LTBI treatment durations, in the variable "ltbi12m". In order to ensure that all contacts who received any type of treatment were indeed included in file OCR13, the subjects who received any treatment in files OCR13 and OCR14 were compared: unique contacts who had received treatment for any duration of time were selected from both databases by first calculating the treatment duration (treatment end date - treatment start date) and then aggregating them by treatment duration. The corresponding databases were named "unique contacts OCR13 (namecode aggr by treatment duration max)" and "unique contacts OCR14 (namecode aggr by treatment duration max)". Next, both files were converted into S-Plus and merged (having marked treated contacts in the OCR13 file as "yes"). It was found that there were more contacts who had received any type of treatment in the OCR13 file (3,800) than in the OCR14 file (3,599), and that all the contacts from the OCR14 file were already included in the OCR13 file.
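The two codings of LTBI treatment duration described above (completion of 6 months, and the 5 duration strata) can be sketched as follows. This is a minimal Python illustration with hypothetical function names; the day ranges are those given in the text, and a duration of exactly 182 days is assumed to fall in the "not completed" group.

```python
def ltbi_completed_6_months(days):
    """1 if LTBI treatment lasted more than 182 days, otherwise 0."""
    return 1 if days > 182 else 0


def ltbi_duration_stratum(days):
    """Stratum 0-4 for 0, 1-91, 92-182, 183-273 and 274 or more days of treatment."""
    if days <= 0:
        return 0
    if days <= 91:
        return 1
    if days <= 182:
        return 2
    if days <= 273:
        return 3
    return 4
```

With these definitions, strata 3 and 4 together contain exactly the contacts coded as 1 for completion of 6 months.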
Then, in order to delete duplicates, contacts from the database "LTBI treatment" were aggregated by the maximum value of: LTBI treatment duration (in database "unique LTBI treatment duration"), the code of having received LTBI treatment for 6 months, "ltbi6mon" (in database "unique LTBI treatment completed (6 months)"), and the code of having received LTBI treatment for 1, 3, 6, 9 and 12 months (in database "unique LTBI treatment (0 to over 9 months)"). Each of the 3 databases with unique contacts was transformed into S-Plus and merged with "final database 9" to create "final database 10" (with treatment duration), "final database 10a" (with 6 months of treatment completed) and "final database 10b" (with treatment categorized from 0 to >9 months).

Tuberculin skin test (TST) size. The information regarding TST induration size is contained in file OCR19; however, every time a contact was seen in the DTB, all the previous TST results were replicated and the newest result added. This explains the large number of records in this file (131,689). The first step was to determine which contacts had had a TST result before the contact date and the cohort year. To do this, the differences in time were calculated as follows: TST application date - contact date (in days); and TST application date - cohort year. For the latter calculation, a year format had to be given first to the TST application date (only for contacts with a negative time difference), since the date is originally recorded as day/month/year. Then those subjects with a negative time difference, indicating that the TST was performed before the contact date and the cohort year the source was recorded, were selected and recoded as 1 if yes, and 0 if no, in the variable "tstbefo". In addition, there are two variables that contain information provided directly by the contact regarding the previous TST date and result (positive or negative).
Several hundred dates in this variable were missing either the month, or the day, or both, and in order to handle them in the statistical packages as dates they were re-entered manually based on the original information from the Excel files, assigning the middle month of the corresponding year or the middle day of the corresponding month. Next, the difference (previous TST date - contact date) was calculated and those with a negative difference were recoded as 1 in a variable named "prtstbcd". Then those with a previous TST done before the contact date (coded as 1 in "prtstbcd") who also had a positive previous TST result (stated as "positive" in variable "prtstres") were selected and recoded as 1 in a variable named "prepotst". In this same variable, those contacts with a difference <-700 days (because in some contacts the contact date was recorded up to 2 calendar years after the cohort year) were checked manually to verify that the previous TST date was before the cohort year (a few were modified in the same variable). This information was verified afterwards by calculating the time difference (year of previous TST - cohort year). All the information was saved in a database called "TST with corrected dates all". Next, those contacts who had a TST before the first contact date or before the cohort year (when they were first recorded in the DTB) and whose result was >0 mm were also identified and coded as 1 in the variable "tstberec" (otherwise they were coded as 0). The contacts who had had a positive TST (stated by the contact or recorded in the DTB) before the contact date and the cohort year were identified by selecting those coded as 1 in "tstberec" or as 1 in "prepotst", and were coded as 1 in a new variable named "prpotst". These contacts were selected (and the rest deleted) to create a database entitled "previous positive TST".
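The logic behind the "prpotst" flag (a positive TST before the contact date, either recorded in the DTB or stated by the contact) can be sketched as follows. This is a simplified Python illustration: the function and parameter names are hypothetical, and only the contact date is used as the reference, whereas the procedure described above also checks against the cohort year.

```python
from datetime import date


def previous_positive_tst(tst_date, tst_size_mm, contact_date,
                          reported_date=None, reported_result=None):
    """Return 1 if there is evidence of a positive TST before the contact date, else 0."""
    # Evidence recorded in the DTB: a TST applied before contact with size > 0 mm.
    recorded = (
        tst_date is not None
        and tst_size_mm is not None
        and tst_date < contact_date
        and tst_size_mm > 0
    )
    # Evidence stated by the contact: a previous TST before contact reported as positive.
    reported = (
        reported_date is not None
        and reported_date < contact_date
        and reported_result == "positive"
    )
    return int(recorded or reported)
```

Contacts for whom the function returns 1 would belong to the "previous positive TST" group; the remaining contacts are those with no evidence of previous infection.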
Next, from the database "TST with corrected dates all", the contacts who had not had a TST before the contact date and the cohort year (when they were first recorded in the DTB), as well as those who had had a previous TST that was 0 mm in size, were selected by deleting those contacts coded as 1 in "prpotst". These contacts were thus considered not to have evidence of previous infection and were saved in a new database named "TST for contacts with no previous positive TST". In this database, contacts with a TST performed with other than 5 TU of PPD were deleted (246 contacts including some duplicates): there were 119 contacts who had had the TST performed with 250 TU; 30 contacts with 100 TU; and 2 contacts with 1 TU. Also, there were 36 contacts who had PPD Battey applied (from a strain of M. intracellulare), and 59 who had their TST performed with another (unspecified) type of antigen. The new database was entitled "TST for contacts 5TU". Next, the TST results (>0 mm in size) recorded in the DTB prior to the contact date and the cohort year were identified by calculating the time differences of both (as explained previously), and deleted; the new database was named "TST for contacts 5TU New". Next, all the contacts with missing TST sizes were identified and deleted. The contacts with a valid TST size were saved in the database "TST for contacts 5TU valid". To delete duplicates, several approaches were used in this database:

a) In order to obtain the maximum TST size recorded at the first contact date (year) or cohort year, the minimum of these dates was selected first in the variable "mincoyr" and then the name code and the variable "mincoyr" were aggregated in SPSS by TST size (max) in the database "TST for contacts 5TU valid (name code and mincoyr aggr by TST size max)".
Next, in this database, the dates were sorted in ascending order and the name code was aggregated in SPSS by the TST size (first), resulting in database "unique TST size for contacts 5TU valid (name code aggr by TST size first)".

b) In order to obtain the results of the 1st and 2nd TST applied after the first contact date (year) or cohort year they were recorded in the DTB, the following procedures were carried out: in order to obtain the first TST size from the first contact date (year) or cohort year, the database was sorted in ascending order by "mincoyr" and by the date the TST was given ("tstdtegi"). Then name code was aggregated by "mincoyr" (first) and "tstdtegi" (first) in the database "unique first TST dates (name code aggr by mincoy first and tstdtegi first)". This database contained the name codes and dates of the first TST, which would allow the identification of the corresponding contacts in the database "TST for contacts 5TU valid" in order to be able to delete them and identify the contacts who received the second TST. Next, the new database, with a variable added (named "firstst") to allow the later identification of the first TST, was transformed into S-Plus in order to merge it (by name code and the date the TST was given) with the database "TST for contacts 5TU valid". The new database, named "TST for contacts 5TU with first TST identified", was transformed back into SPSS. In order to obtain the second TST size from the first contact date (year) or cohort year, the contacts with the first TST (previously identified) were deleted from the database, and this new database was named "TST for contacts 5TU without first TST". Next, the procedures described above were performed again to retrieve the unique name codes and dates of the second TST, and the resulting database was entitled "unique second TST dates (name code aggr by mincoy first and tstdtegi first)".
Then, a variable was added (named "secndtst") to allow the later identification of the second TST given; this database was transformed into S-Plus and merged with an S-Plus transformed version of the database "TST for contacts 5TU with first TST identified". The newly created database, entitled "TST for contacts 5TU with first and second TST pre-identified", had all the information of interest regarding TST, including the order in which the TST was given (first or second) after being in contact with a case. In order to ensure that none of these contacts had had a positive TST (documented or stated by the contact) before the contact date or cohort year, this database was merged in S-Plus with the one containing the unique contacts with a previous positive TST (documented or stated by the contact), named "unique previous positive TST (namecode aggr by prpostst max)", which derived from the database "previous positive TST". For this, the database "unique previous positive TST (namecode aggr by prpostst max)" was transformed into S-Plus and merged with the database "TST for contacts 5TU with first and second TST pre-identified" to create the database "TST for contacts 5TU with first and second pre-identified Verified", from which the contacts with a previous positive TST (documented or stated) were deleted (>1300 contacts including duplicates); the remaining contacts were only those who had their TST after the first contact date or the first cohort year they were recorded in the DTB. This database was named "TST for contacts 5TU with first and second TST identified". Next, from this database, only the contacts with first and second TST were selected and the rest deleted; this new database was entitled "TST for contacts 5TU first and second TST Only".
In order to have the information on the TST sizes of interest for unique contacts, for the 3 types of TST (first, second, and both: first and second), the following procedures were performed in the database "TST for contacts 5TU first and second TST Only":

1. 1st TST: after sorting the database in ascending order by the date the TST was given and selecting the contacts with the first TST, duplicates were deleted by aggregating name code by TST size (first). As a quality control procedure, the deletion of duplicates was also carried out by aggregating name code by TST size (max) and the results were identical (same number of unique contacts). The new database was entitled "unique first TST size (name code aggr by first TST size max)" and the variable containing the maximum TST size for the first TST given to contacts was named "tstsiz1".

2. 2nd TST: essentially the same procedures described in step 1 were performed, which involved the selection of the contacts with a second TST and the deletion of duplicates with the two approaches described (which produced the same number of unique contacts). The resulting database was entitled "unique second TST size (name code aggr by first TST size max)" and the variable containing the maximum TST size for the second TST given to contacts was named "tstsiz2".

3. 1st and 2nd TST: the same procedures described in steps 1 and 2 were performed, including the deletion of duplicates using the two approaches (which also produced the same number of unique contacts). The new database containing the maximum TST size for either the first or second TST given to contacts (in the variable "tstsiz12") was entitled "unique first AND second TST size (name code aggr by first TST size max)".
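The logic of steps 1 to 3 can be sketched in a few lines of Python. This is only an illustration of the idea (collapse replicated records, order each contact's tests by date, then take the first, the second, and the larger of the two); the actual work was done with SPSS aggregation and S-Plus merges. The record layout, the function name and the dictionary keys are hypothetical, and records are assumed to be already restricted to 5 TU tests given after the contact date.

```python
from collections import defaultdict


def summarize_tst(records):
    """Summarize TST sizes per contact.

    records: iterable of (name_code, tst_date, size_mm) tuples, where
    tst_date is an ISO 'yyyy-mm-dd' string (hypothetical layout).
    Returns {name_code: {"first": ..., "second": ..., "first_or_second": ...}}.
    """
    # Replicated rows share a date, so keep the largest size seen per date.
    by_contact = defaultdict(dict)
    for name_code, tst_date, size_mm in records:
        sizes = by_contact[name_code]
        sizes[tst_date] = max(sizes.get(tst_date, size_mm), size_mm)

    summary = {}
    for name_code, sizes in by_contact.items():
        # ISO date strings sort chronologically.
        ordered = [sizes[d] for d in sorted(sizes)]
        first = ordered[0]
        second = ordered[1] if len(ordered) > 1 else None
        summary[name_code] = {
            "first": first,
            "second": second,
            # Larger of the first and second test, when both exist.
            "first_or_second": first if second is None else max(first, second),
        }
    return summary
```

Keeping the largest size per date before ordering mirrors the quality-control observation above that aggregating by "first" and by "max" gave the same unique contacts.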
In addition, in order to obtain the maximum TST reaction size, regardless of when and how many TSTs were administered (except TSTs applied before the contact date or cohort year), the database "unique TST size for contacts 5TU valid (name code aggr by TST size first)", containing all the unique contacts' TST sizes, was also rechecked for contacts who may have had their TST applied before the first contact date or the cohort year they were first recorded in the DTB. In order to do this, this database was transformed into S-Plus and merged with the database "unique previous positive TST (namecode aggr by prpostst max)" to produce the database "unique TST size for contacts 5TU valid with prev pos TST identified", which, after selecting and deleting the contacts with a previous positive TST (842 including duplicates), resulted in the database "unique TST size for all contacts (previous positive TST excluded)". This database had the TST size (maximum) of all contacts applied after first contact (or cohort year) in the variable "tstall". Each of the above databases containing unique contacts' TST sizes was transformed into S-Plus for merging with the main databases as follows: the "unique first AND second TST size (name code aggr by first TST size max)" database was merged with the "final database 10b" to create the "final database 11". The database "unique first TST size (name code aggr by first TST size max)" was merged with the "final database 11" to create "final database 11a". The database "unique second TST size (name code aggr by first TST size max)" was merged with the "final database 11a" to create "final database 11b".
Finally, the database "unique TST size for all contacts (previous positive TST excluded)", containing the unique contacts with all their TST sizes regardless of when and how many TSTs were administered (except TSTs applied before the contact date or cohort year), was merged with the "final database 11b" to create "final database 11c". Next, in order to have the maximum TST size, regardless of whether the TST was applied before or after the contact date and cohort year the contact was first recorded in the DTB, the two databases containing all the unique contacts' TSTs, "unique TST size for all contacts (previous positive TST excluded)" and "unique previous positive TST (namecode aggr by TST size max)", were transformed into S-Plus and merged by name code and TST size (to double check, they were also appended and the results were the same) to create the database "unique TST size for contacts (all and previous TST)", which was merged with "final database 11c" to create "final database 12".

Previous TST response. The information regarding previous TST results and dates is in file OCR19. As explained in the TST size section above, the contacts who had a TST >0 mm in size, or who stated that they had had a positive TST, before the contact date and the cohort year they were first recorded in the DTB were identified and saved in the database "previous positive TST". Thirteen more contacts were added to this database later; they did not have a date of previous TST (stated by the subject), but they had a date of TST application that made it possible to determine that the TST was actually given before the contact date or the cohort year, and they were thus included in the "previous positive TST" database. This database already had the valid dates from Excel that were entered manually for those contacts in whom no day or month was specified.
From 131,689 records containing TST information for all the contacts (from 1990 until 2001), there were 3,712 contacts who had a previous positive TST. In order to delete duplicates in the database "previous positive TST", the name code was aggregated by the TST size (max). The resulting file was named "unique previous positive TST size (namecode aggr by TST size max)". This file was converted into S-Plus and merged with the "final database 11c" to produce "final database 12a".

Previous TB. The information about previous TB is contained in the file OCR01, from which it was obtained along with the name code and cohort year. After creating a table of frequencies, the variable containing the information, "prevtb", was recoded in order to have only yes, no or missing (which included the original missing information as well as the answers uncertain, not asked and not applicable). This variable was then coded numerically and the database was named "previous TB status". Then only those contacts with a confirmed previous TB status (only the yes and no) were selected in a file named "previous TB status confirmed". Afterwards, to delete duplicates, the name code and previous TB status were aggregated by cohort year (min). The file was entitled "unique previous TB status confirmed (namecode and prevtb aggr by cohort yr min)". The appropriate deletion of duplicates was confirmed in Excel by using the logical functions as described previously. To ensure that the date stated as the previous TB date was not after the contact, the following procedures were performed. The contact date was obtained from the database "name code, contact date and cohort yr all contacts" and was exported from S-Plus to Excel in order to convert it into a year format (yyyy). Then it was copied back into the original S-Plus file, which was named "name code contact date and year all contacts".
This file was then merged with the database "unique previous TB status confirmed (namecode and prevtb aggr by cohort yr min)", in order to have the information regarding name code, previous TB status, cohort year and contact year. There were 287 contacts who stated that they had had TB previously. To ensure that the stated previous TB actually occurred before the contact date, the contacts with a negative difference between the 2 dates (contact year - year "previous TB" was stated) were identified (in the variable "prtbconf"). The variable "tbconf" was converted into a numeric format and named "prevtbr". In this way it was found that among the 287 unique contacts who stated they had had previous TB, there were 272 who actually stated that they had TB before the contact date. The database was tested for duplicates and none were found. Next, the database "unique previous TB status confirmed (namecode and prevtb aggr by cohort yr min)" was transformed into S-Plus and was merged with the database "final database 12a" to create "final database 13". In addition, contacts who had TB diagnosed before the contact date either bacteriologically (smear or culture positive) or clinically (active TB) were also identified and added to the main database later, in the final database 17f (see ahead). The variable defining whether a contact had had TB by any of the criteria mentioned before was named "prevtbf" in "final database 17f".

Infectivity of source cases. The infectivity of the source case was determined by whether the source case was AFB smear positive or negative. First, the source cases for each of the contacts were identified and their TB number extracted from files OCR01 and OCR17 into new databases named respectively "TB sources cases OCR01" and "TB sources cases OCR17". These 2 databases were transformed into S-Plus in order to have all the information in a single database.
This was achieved by appending both files, resulting in the database entitled "TB source cases all". In order to delete duplicates, this database was transformed into SPSS and, after sorting the database in ascending order by the cohort year, the name code was aggregated by the source TB number (first). The resulting database was entitled "unique TB source cases". In order to ensure that this database contained only the identities of the first TB sources that contacts were exposed to (for those exposed to more than one), the following verification was performed. From the database "TB source cases all", the name code and source TB number were aggregated by the cohort year (min). The resulting database was named "TB source cases (contact namecode and TB# aggr by cohort yr min)". From this database, the name code was then aggregated by the source TB number (first), after sorting the database in ascending order by the cohort year. This database with unique source TB numbers was entitled "unique TB source cases". The results obtained were identical to those in the database "unique TB source cases". Next, the AFB smear results for all the cases recorded in the DTB (1990-2001) were obtained from the file OAR20. To ensure that positive AFB smears were captured when more than one AFB smear was reported, the results were coded as 0 if negative and 1 if positive. Since a TB case may have had several AFB smears, even in different years, the following procedures were done to delete duplicates while ensuring that a positive smear reported the first time the TB source was recorded in the DTB was retained: the TB source number and the cohort year were aggregated by the AFB smear (max). The resulting database was named "TB cases smear results (TB# and cohort year aggr by afbres max)". In this database, after sorting in ascending order by the cohort year, the TB number was then aggregated by the AFB smear results (first).
The resulting database was entitled "unique TB cases smear results". In order to obtain the TB smear results only for the contacts' TB sources, the database "unique TB cases smear results" was converted into S-Plus and merged with the database "unique TB source cases", which has the source TB number and the contact's name code. The resulting database was named "unique TB sources smear results". This database was re-checked for duplicates (by aggregating the name code by the AFB smear "max") and none were found. This database was then merged with the "final database 13" to create "final database 14". At this point, the number of immunosuppressive conditions from the previously created database "unique contacts with immunosupr. conditions" was added to "final database 14" to create "final database 14a".

Geographical (ecologic) socioeconomic status. It was obtained from CHSPR in quintiles, as reported by Statistics Canada, for each of the contacts found in BCLHD and for each of the years that they appeared in the CHSPR database (from 1990-2001). In order to have a summary measure for every contact, the mean, the median and the mode for each subject were calculated for the quintiles across the years. The database with the study ID as well as the mean and median quintiles for each was transformed into SPSS and entitled "SE quintiles". This database was transformed into S-Plus and merged with the database "name code contact date and year all contacts corrected" in order to obtain the contact's name code. The resulting database, which was entitled "SE quintiles 2", was converted back into SPSS to delete duplicates. When the mode was missing (which occurred when there was only 1 quintile recorded), which was the case for 372 contacts, the existing value (mean) was copied into the missing modes.
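The per-contact summary of yearly quintiles can be sketched as follows in Python. This is a minimal illustration, not the code used for the study: the function name is hypothetical, missing years are assumed to be represented as None, and the rule for ties between modes (take the lowest quintile) is an assumption that the text does not specify. The fallback of copying the mean into the mode when only one quintile was recorded follows the description above.

```python
from statistics import mean, median, multimode


def summarize_quintiles(yearly_quintiles):
    """Summarize one contact's socioeconomic quintiles across years.

    yearly_quintiles: list of quintile values (1 to 5), with None for
    years without a value (hypothetical representation of missing data).
    Returns a dict with mean, median and mode, or None if nothing is recorded.
    """
    values = [q for q in yearly_quintiles if q is not None]
    if not values:
        return None

    avg = mean(values)
    if len(values) == 1:
        # Only one quintile recorded: copy the existing value (mean).
        mode = avg
    else:
        # Assumption: when several values tie, keep the lowest quintile.
        mode = min(multimode(values))

    return {"mean": avg, "median": median(values), "mode": mode}
```

For a contact with quintiles 1, 2, 2 and 5 over four years, this gives a mean of 2.5 and a median and mode of 2.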
To delete duplicates, the name code was aggregated by the mean, the median and the mode in 3 separate steps, resulting in 3 databases entitled: "unique SE quintiles (mean)", "unique SE quintiles (median)" and "unique SE quintiles (mode)". Each of these databases was then transformed into S-Plus and merged with the database "final database 14a", resulting in the following databases (which contain the mean, median and mode SE quintiles respectively): "final database 15", "final database 15a" and "final database 15b". At the end of the database management, while doing the quality control, it was found that there were some subjects with quintiles of 0 in some years. Ms. Denise Morettin, from CHSPR, was contacted and confirmed that these corresponded to missing values. The mean, median and mode were recalculated after replacing the zeros by missing values and saved in the database entitled "SE status summarized (quintiles)". Then, duplicates were deleted by aggregating the name code by the mode, mean and median individually. Next, the 3 databases were merged into a single one containing the unique mean, median and mode, entitled "unique SE status summarized (mean, median and mode)". This database was then merged with the following databases created at the end of the database management: "final database 23a", "final contacts for analysis" and all the subsequent ones.

High-risk subjects. The information required to define the high-risk persons was contained in several of the DTB and BCLHD databases.

a) Injection drug users were obtained from the DTB files OCR01 and OCR23 and from the previously created database "immunosupr. diseases dates and contact dates all BCLHD" that contained all the subjects who had a diagnosis of drug use in either the MSP or HSP databases.
Information regarding drug abuse is available in file OCR01 from the DTB, in a variable named "drug abuse", and also in file OCR23, under the 3 variables named "other health problems" 1, 2 and 3. From file OCR01, the subjects with drug abuse were identified and coded as 1 if "yes" and 0 if "no" in a new database entitled "drug users OCR01" under a variable named "druguser". Similarly, in the "health problems" variables from file OCR23, subjects were identified (if stated "drug abuse" or equivalent terms according to the frequency table created for this purpose) and coded as 1 in a new variable named "druguser" in a new database entitled "drug users OCR23". Also, from the database "immunosupr. diseases dates and contact dates all BCLHD", the subjects with drug abuse were selected, and the rest deleted, to create the database "drug users BCLHD"; the variable identifying them was named "druguser". Then, the 3 databases ("drug users OCR01", "drug users OCR23" and "drug users BCLHD") were transformed into S-Plus in order to append them all and create a single database, which was entitled "drug users all databases". This database was transformed back into SPSS in order to delete duplicates, by aggregating the name code by the variable "druguser" (max). The new database was entitled "unique drug users all databases". This new database contained 210 unique subjects identified as drug users from the DTB and BCLHD databases.
The information needed for the variables below (b, c and d) was extracted from the DTB file OCR01 into a database entitled "high-risk persons", which included the name codes of the subjects, as well as their country of birth, date of arrival, current occupation and the type of population at risk to which they belonged.

b) Recent arrivals (<5 years) from high prevalence countries were selected according to the WHO list of high burden countries in 2000 (WHO 2002 report4), from the DTB file OCR01; in order to ensure the proper classification of all the countries as countries with low or high prevalence of TB, a frequency table was created first for the variable containing the countries of birth, "brthcoun". The 22 countries listed as high burden countries in the 2002 WHO report (which accounted for 75.4% of the almost 3.7 million globally notified cases4) were identified and coded as 1 (high-risk), while the rest of the countries were coded as 0 (low risk) in a new variable named "hitbctry". From the same file, the dates of arrival (available for non-Canadians) were obtained. The dates were converted from string or character format (e.g. 1985/00/00) into a numeric format (e.g. 1985). These dates were in character format because it was the only format that would allow importation of the information into SPSS, since most of the dates omitted the month or day of arrival (e.g. they appear as 1985/00/00). Next, the arrival date was subtracted from the cohort year (the year they appear as contacts in the DTB) to identify the subjects who arrived within 5 years before being a contact. Those who had been in Canada <5 years and were from a country with a high TB prevalence (coded as 1 in the variable "hitbctry") were identified and coded as 1 in a new variable named "tbctry5y".

c) Residents and employees in high-risk settings (prisons, nursing homes, homeless shelters).
The occupation was available in 3 variables obtained from the database "high-risk persons": "current occupation", "risk occupation" and "population at risk type". First, to ensure a proper classification of the residents and employees in high-risk settings, a frequency table of the 3 variables was generated. The employees and residents of high-risk settings were only available in the variables "current occupation" and "population at risk type". All the high-risk residents and employees of interest were identified from the "current occupation" variable and coded as 1 using the recoding function of SPSS; the rest were coded as 0, and the missing values were replicated; the new variable was named "hirskocu". This variable was duplicated (using copy and paste) and named "hirskset" in order to have all the information of interest in a single variable. The high-risk residents and employees of interest from the variable "population at risk type" were identified, coded as 1, and added to the variable "hirskset" in the database "high-risk persons".

d) Hospital personnel (physicians, nurses, respiratory therapists, laboratory technicians, x-ray technicians and medical students). The information regarding the high-risk hospital personnel (HP) was available in the 3 variables obtained from the database "high-risk persons": "current occupation", "risk occupation" and "population at risk type". To ensure a proper classification of the HP, the HP of interest were identified in the frequency table previously generated for the 3 variables. From the "current occupation" variable, the HP of interest were identified and coded as 1 using the recoding function of SPSS; the rest were coded as 0, and the missing values were replicated; the new variable was named "hirskhp".
At this point, 2 out of the 4 subjects recorded as TB lab staff in the variable "risk occupation" who had not been identified in the variable "current occupation" were coded as 1 in the variable "hirskhp". This variable was duplicated (using copy and paste) and named "hirskhos". Next, the HP from the variable "population at risk type" were identified, coded as 1, and added to the variable "hirskhos" in the database "high-risk persons".

Next, the variable "druguser" from the database "unique drug users all databases" was added to the database "high-risk persons" (by converting them into S-Plus and merging them), creating the database "high-risk persons all", which was transformed back into SPSS. In the database "high-risk persons all", all subjects coded as 1 in any of the 4 variables "druguser", "tbctry5y", "hirskset" or "hirskhos" were coded as 1 in a new variable named "highrisk". All subjects coded as 0 in all of the 4 variables above were coded as 0 in the variable "highrisk", and the rest (who had a missing value in any of the 4 variables above) were coded with missing values. In order to delete duplicates, the name code was aggregated by the variable "highrisk" (max), and the resulting database, containing only the unique high-risk subjects, was entitled "unique high-risk persons all". This database was transformed into S-Plus in order to merge it with the "final database 15b" to create the "final database 16". In addition, each of the 4 high-risk groups was identified individually and added later (see ahead, at the end of the database management section) to the databases "final contacts for analysis BCLHD all" and "final contacts for analysis BCLHD" using the same variable names. From the database "high-risk persons all", the unique subjects of the 3 variables "tbctry5y", "hirskset" and "hirskhos" were extracted into a database each, by aggregating the name code by each of the variables (by the maximum code).
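The coding rules for recent arrivals and for the composite high-risk indicator can be written compactly. The following Python sketch is illustrative only (the recoding was done in SPSS); the function names are hypothetical, missing values are assumed to be represented as None, and the handling of a missing arrival date for a subject from a high burden country (left as missing) is an assumption.

```python
def arrival_year(text):
    """Extract the year from a character date such as '1985/00/00'."""
    return int(text.split("/")[0])


def recent_high_burden_arrival(arrival_text, cohort_year, high_burden):
    """Return 1 for subjects from a high burden country who arrived less
    than 5 years before the cohort year, 0 otherwise, None if undetermined."""
    if high_burden is None:
        return None
    if high_burden == 0:
        return 0
    if arrival_text is None:
        return None  # assumption: unknown arrival date stays missing
    return int(cohort_year - arrival_year(arrival_text) < 5)


def combine_high_risk(flags):
    """Combine the four indicators: 1 if any is 1, 0 if all are 0,
    and missing (None) otherwise."""
    if any(flag == 1 for flag in flags):
        return 1
    if all(flag == 0 for flag in flags):
        return 0
    return None


def unique_high_risk(rows):
    """Keep one value per name code by taking the maximum non-missing flag.

    rows: iterable of (name_code, flag) pairs (hypothetical layout).
    """
    best = {}
    for name_code, flag in rows:
        previous = best.get(name_code)
        if flag is None:
            best.setdefault(name_code, None)
        else:
            best[name_code] = flag if previous is None else max(previous, flag)
    return best
```

Note that a single indicator coded as 1 is enough to classify a subject as high-risk even when other indicators are missing, whereas a code of 0 requires all four indicators to be known and equal to 0.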
The information regarding injection drug use for unique subjects was already available in the variable "druguser", in the database "unique drug users all databases". The databases containing the information regarding being a recent arrival from a country with high TB prevalence ("tbctry5y"), a resident or employee of high-risk settings ("hirskset"), or hospital personnel ("hirskhos") were respectively entitled: "unique recent arrival from high TB country", "unique high-risk settings" and "unique hospital personnel".

Contacts with chest x-ray abnormalities compatible with previous tuberculous disease. The information regarding chest x-ray abnormalities compatible with previous tuberculous disease is contained in file OCR23 and, after creating a frequency table, it was found that it appears as "healed primary complex" in 2 variables: "disease classification" and "non-TB abnormalities". The date that information was recorded in the DTB is also available in the same file in the variable "activity change date". The name code of the contacts and the variables of interest above were selected from the OCR23 file. Next, the contact date was added with the purpose of knowing whether the abnormalities were present at (or around) the contact date, and the new database was entitled "contacts with old TB on chest x-ray". Contacts with the diagnosis of "healed primary complex" (HPC) on the chest x-ray (in either of the 2 variables above) were identified and coded as 1 in a new variable named "prboldtb". Next, in order to know when the HPC was diagnosed in relation to the contact date, the "activity change date" was subtracted from the "contact date". Contacts who had the HPC diagnosed before, or up to 1 year after, the contact date, and who were coded as 1 in the variable "prboldtb", were considered to have a chest x-ray compatible with previous tuberculous disease. They were identified and coded as 1 in a variable named "oldtb".
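The time window applied to the chest x-ray finding can be expressed as a single comparison. The Python sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, one year is approximated as 365 days, and the boundary (exactly one year after the contact date) is treated as inside the window, which the text does not specify.

```python
from datetime import date, timedelta

ONE_YEAR = timedelta(days=365)  # assumption: one year approximated as 365 days


def old_tb_on_xray(has_hpc, activity_change_date, contact_date):
    """Return 1 if a healed primary complex was recorded before, or up to
    one year after, the contact date; otherwise return 0."""
    if not has_hpc:
        return 0
    # Any date earlier than contact_date + 1 year falls inside the window.
    return int(activity_change_date <= contact_date + ONE_YEAR)
```

A finding recorded in March 1994 or in June 1995 would count for a contact date of 1 January 1995, whereas one recorded in June 1996 would not.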
In order to delete duplicates, the name code was aggregated by the variable "oldtb" (max) and the new database was entitled "unique contacts with old TB on chest x-ray". In addition, from file OCR13, contacts who had received conventional LTBI treatment with 1 or 2 drugs (all treated with INH and/or RM, except for 1 subject who received INH, RM and PZA) for x-ray changes compatible with previous tuberculous disease (according to the variables "presumed TB inac", "inactive TB" and "+tbn x-ray chang") before or up to a year after the contact date were identified and coded as 1 in a new variable named "oldtbrxf". Then, duplicates were deleted by aggregating the name code by the variable "oldtbrxf" (max), and the new database was entitled "unique contacts treated for old TB". Next, the databases "unique contacts with old TB on chest x-ray" and "unique contacts treated for old TB" were transformed into S-Plus and appended to form a single database entitled "contacts with old TB", in which subjects with previous tuberculous disease were identified in a variable named "oldtb". This database was transformed back into SPSS for duplicate deletion, which was done by aggregating the name code by the variable "oldtb". The resulting database was entitled "unique contacts with old TB". This database was transformed into S-Plus in order to merge it with the database "final database 16" to create "final database 17".

Contacts with previous tuberculous disease by lab (positive smear or positive culture for MTB) or clinically. There are some subjects with a positive culture for MTB in whom the specimen is referred to as "historic" in files OCR21 and OAR21 (in the variable "specimen#"). No historic MTB cases were reported in the OCR20 or OAR20 files. The historic MTB cases from OCR21 were identified in the database entitled "unique contacts with historic positive MTB culture (OCR21)". There were 107 historic MTB cases in this database.
The dates the specimen was received could be obtained for most (99) of these subjects from the database "TB OCR20", and the new database was entitled "unique subjects with historic positive MTB with dates (OCR21)" (except for one case, all were dated before 1990). The same procedures were done for the file OAR21, and the database with the unique historic MTB cases with dates was named "unique subjects with historic positive MTB with dates (OAR21)". Then, the databases "unique subjects with historic positive MTB with dates (OCR21)" and "unique subjects with historic positive MTB with dates (OAR21)" were merged (and then checked for duplicates; none were found); the resulting database, which had all the historic MTB cases as well as the dates the specimen was received (for most of them), was entitled "unique subjects with historic positive MTB culture OCR21 and OAR21". There were 167 subjects with a historic positive MTB culture. The database "unique subjects with historic positive MTB culture OCR21 and OAR21" was merged with the "final database 17" to create "final database 17a". This database contained the contacts who had a "historic" positive culture for MTB, all of them identified by the OCR database. There were 106 contacts who had a historic positive MTB culture before the contact date; they were identified in the variable "histmtb" (all were recorded in the OCR21 file; the OAR21 file did not add any more cases). Contacts with no date for the historic positive MTB culture were considered to have had it before 1990 (since this was the case for all but one of the historic cultures for which a date was available, and also because the term historic was used routinely before 1990). There were also some contacts with "historic" positive smears in the variable "specimen#", in the file OCR20.
They were identified (from a database entitled "contacts with positive smear (OCR20)") and coded as 1 in a variable named "histposm"; the rest of the contacts were deleted and the resulting database was entitled "contacts with historic positive smear (OCR20)". The name code and "specimen#" were aggregated by the date the specimen was received (min) in order to delete duplicates and preserve dates. The new database was entitled "unique contacts with historic positive smear (OCR20)". There were 39 subjects in this database, all with their corresponding dates (date the specimen was received in the lab). Similarly, the subjects with historic positive smear from the OAR20 file were identified and saved with their corresponding dates in the database "unique subjects with historic positive smear (OAR20)". There were 22 subjects in this database. Both databases were merged in S-Plus and the resulting database was entitled "unique subjects with historic positive smear (OCR20 and OAR20)", which had 60 subjects. This database was checked for duplicates and none were found. Next, to identify those subjects with a historic positive smear that could have been due to MOTT, the database "unique MOTT all OCR and OAR databases" (created for this purpose, containing all the unique MOTT cases and dates from all OCR and OAR databases) was merged into the database "unique subjects with historic positive smear (OCR20 and OAR20)". The resulting database was entitled "unique subjects with historic positive smear (OCR20 and OAR20), MOTT identified". Since there were no historic smear positive subjects with concurrent MOTT (there were 2 subjects in whom MOTT was diagnosed many years after the historic positive smear), the database "unique subjects with historic positive smear (OCR20 and OAR20)" was merged with the database "final database 17a" to create "final database 17b".
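The duplicate-deletion step described above (aggregating the name code by the minimum date the specimen was received) can be illustrated with a minimal Python sketch. This is not the SPSS procedure used in the thesis; the function name, the name codes and the dates are hypothetical and serve only to show the logic of keeping one record per subject with the earliest date.

```python
from datetime import date


def earliest_per_subject(records):
    """Keep one entry per name code: the earliest specimen date seen.

    `records` is an iterable of (name_code, date_received) pairs.
    Hypothetical helper; mirrors an aggregate-by-minimum step.
    """
    earliest = {}
    for name_code, received in records:
        if name_code not in earliest or received < earliest[name_code]:
            earliest[name_code] = received
    return earliest


# Made-up example rows: subject C17 appears twice.
smears = [
    ("C17", date(1987, 5, 2)),
    ("C17", date(1985, 3, 9)),
    ("D42", date(1989, 11, 30)),
]
unique_smears = earliest_per_subject(smears)
print(unique_smears)
```

Keeping the minimum date, rather than an arbitrary record, preserves the first date on which a positive result was documented, which is the date that matters when deciding whether the result preceded the contact date.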
There were 39 contacts with historic positive smear before the contact date (all identified by the OCR20 database). These were identified in the variable "hismrall". In order to determine the contacts with previous TB diagnosed clinically, all cases recorded as active TB cases from the OCR files were taken from the database "unique active TB cases OCR23"; those active TB cases in whom the diagnosis of active TB (according to the variable "activity change date") was made before the contact date were considered as previous TB cases. In order to do this, the database "unique active TB cases OCR23" was merged into the "final database 17b" to create the "final database 17c", and the active TB cases with previous TB, identified as described, were coded in the variable "oldactb". In order to ensure that none of these previous TB cases were due to MOTT, they were checked using the information from the database "unique MOTT all OCR databases". There were no isolated MOTT cases among them. There were 246 contacts diagnosed with active TB before the contact date among the active cases from the OCR23 database; however, when the "reviewed" contact date, available in the variable "contdter" (created to correct for contact dates supposedly occurring after the cohort year the contact was recorded in the DTB; see Appendix III for more details) was used in the calculations, it was found that 4 contacts were misclassified; therefore, there were finally 242 contacts with previous TB, i.e. with TB diagnosed before the contact date and the year the contact was recorded in the DTB (cohort year). These 242 previous TB cases were identified in the variable "oldactbc" in the "final database 17c". Similarly, the active cases from the OAR files and cases with TB diagnosed clinically from the active registry files were obtained with their dates in the respective databases "unique active TB cases DTB with dates" and "unique clinical TB cases in active registry".
These 2 databases were merged to create the database entitled "unique active TB cases DTB and active registry with dates". Next, in order to determine the contacts with previous TB diagnosed clinically from the OAR files and the active registry, the database "unique active TB cases DTB and active registry with dates" was merged into the "final database 17c" to create the "final database 17d". The variables containing the information regarding the active cases from the OAR files and the clinically diagnosed cases from the active registry were named "actboar" and "tbdiagar" respectively. The contacts identified with previous TB from the OAR databases (in variable "oldtboar") and active registry (in variable "oldtbar") were no different from those previously identified by the OCR database ("unique active TB cases OCR23"). Because the OAR and active registry databases did not add any information, the corresponding variables were deleted, but a variable identifying all the active cases from all the OAR databases and all the cases diagnosed clinically in the active registry was created and named "acttboar"; the new database was entitled "final database 17e". In order to also identify all the contacts recorded as smear positive or culture positive for MTB in whom the specimen was received in the laboratory before the contact date (and thus considered to be previous TB), the database "unique positive smears and cultures all OCR databases" was merged with the database "final database 17e" to produce the "final database 17f". No more previous TB cases were found among the positive smear and culture cases from the OAR databases (see the "final database 19" in the Diagnosis of TB section). Contacts with a positive smear or positive MTB culture were identified in the variable "smculpos" and the corresponding dates were obtained from the dates when the specimen for culture or smear was received.
For contacts in whom these dates were not available (26 of them) the dates were obtained from the date of diagnosis of the positive MTB (in the variable "activity change date", obtained originally from file OCR23). Contacts identified as previous TB based on positive smear or MTB culture were identified in the variable "oldpostb". Smear and culture positive cases were re-checked (isolated MOTT excluded) later when the "final database 18" was created (3 smear positive cases were due to MOTT only and were re-classified as not smear positive). There were 69 contacts with positive smear or culture before the contact date; however, there were 67 contacts with a previous positive smear or MTB culture when the reviewed contact date was used (from variable "contdter"). Then all the contacts with previous TB were identified from all the variables containing the corresponding information, i.e. the information on TB diagnosed before the contact date: previous TB stated by the contact or based on TB diagnosis or smear or culture results. The variable identifying the contacts with previous TB was named "prevtbf" in the "final database 17f". Afterwards, one subject who died with TB before 1990 was added to those in the variable "prevtbf".

DIAGNOSIS OF TUBERCULOSIS.

The information regarding the possible diagnosis of tuberculosis (TB) for all the subjects attended in the DTB is contained in several of the OAR files, OCR files and in the active registry file (AR). It is important to note that the identification and selection of TB cases was independent (done with different databases and at different times) and blinded (without knowledge of whether any of the possible TB cases had previously been identified as contacts). Subjects were considered to have TB if any of the following criteria were present:

1. They had a positive smear and/or positive culture for tubercle bacilli (M. tuberculosis, M. bovis or M. africanum), which will be denominated MTB.
Subjects who had a positive smear but in whom the culture was positive for MOTT were excluded.
2. They had a diagnosis of active tuberculosis (sensitive criteria). In addition, specific criteria were applied as follows: a) subjects were considered to have TB if they had a diagnosis of active TB and they were treated with at least three 1st line drugs: isoniazid (INH), rifampin or rifabutin (RM), pyrazinamide (PZA), ethambutol (EMB) and streptomycin (SM); however, those who in addition received clofazimine and/or clarithromycin were not included, to avoid including patients with MOTT.
3. They had a histopathological diagnosis of TB or the diagnosis of TB was made at death.

The contacts who developed TB were identified by matching the database containing the TB cases with the database containing all the contacts recorded in the DTB during 1990 through 2001 ("final database 17"). The date of TB diagnosis was also obtained for all the contacts who developed TB. The identification of the TB cases from the pertinent OAR files, OCR files and the AR file is explained below. The identification of TB cases was done first with the OAR files and with the AR file, and then it was also done separately for the OCR files. The identification of cases applying specific criteria was done first, and the sensitive criteria were applied afterwards, as explained below.

OAR files:

The relevant information to determine whether a subject developed TB was obtained from the DTB files OAR13, OAR20, OAR21, OAR23 and OAR01. File OAR13 contains the information regarding the purpose of the treatment (chemotherapy, chemoprophylaxis, chemo to prophylaxis, prophylaxis to chemo and uncertain), the drugs received, as well as the dates when the treatment started and ended. File OAR20 has the information regarding lab results: smear, Gaffky and culture, including mycobacteria identification (ID) and dates when specimens were received.
File OAR21 contains the laboratory information regarding mycobacteria ID and sensitivity. File OAR23 has the TB classification, activity of TB and date. File OAR01 contains the information on subjects diagnosed with TB at death or at necropsy, as well as the date of diagnosis. Subjects were considered to have TB if they fulfilled the criteria mentioned above, using the following procedures.

Smear positive and/or culture positive for MTB:

From file OAR20, only the data of interest (name codes, smear and culture results with their corresponding dates) were selected and kept in a new database entitled "TB OAR20". After creating a frequency table to determine how the results had been entered, the smear results were identified and recoded as negative, positive (coded as 1), and missing/unknown in a new variable named "smeares". Next, the culture results were also identified and recoded as negative (coded as 0), positive (coded as 1), atypical (coded as 2) and missing/unknown (coded as 3) in a new variable named "cultres". Then, based on the mycobacteria ID information, the different types of mycobacteria were identified from a frequency table and recoded as follows: MTB was coded as 1 and all others were coded as 0 in a new variable named "micobid". Also, mycobacteria other than TB (including non-atypical: non-identified and Bovis, BCG related) were identified in a variable named "mott2". Next, subjects with smear positive but MOTT negative were identified and coded as 1 in a variable named "smearpos" in the same database ("TB OAR20"). Subjects with smear and culture positive were identified ("smeares"=1 and "cultres"=1) and recoded as 1 in a new variable named "smrculpo". Subjects with smear positive and culture negative or unknown were also identified ("smeares"=1 and "micobid"=3) and recoded as 1 in a new variable named "smrpcltn".
Finally, subjects who were smear positive or culture positive for MTB were identified (micobid=1 OR smrculpo=1 OR smrpcltn=1) and coded as 1 in a new variable named "tbpolab". In addition, subjects who were smear positive or culture positive for MTB were also identified by selecting (micobid=1 OR smearpos=1) and coded as 1 in a new variable named "tbposlab". The variable "tbposlab" identified 4 subjects more than "tbpolab"; these subjects were smear positive with a missing culture result but were positive for TB complex by RNA probe. Next, only cases coded as 1 in "tbposlab", along with their corresponding information regarding smears and cultures, were selected (and the rest deleted) in a new database named "OAR20 smear or culture positive". Next, in order to identify the cases who were smear positive (excluding MOTT) and/or culture positive for MTB, as well as the dates the diagnosis was made by the laboratory (according to the date the specimen was received by the laboratory), the following procedures were performed using the database "OAR smear or culture positive". The smear positive cases (MOTT excluded) were identified (subjects coded as 1 in the variable "smearpos") from the database "OAR smear or culture positive", and the rest were deleted to create the database entitled "subjects with positive smear (OAR20)". Cases with historic positive smears were deleted from this database and the resulting database was entitled "subjects with positive smear historic excluded (OAR20)". Then, the name code and the variable "smearpos" were aggregated by the date the specimen was received (min) in order to delete duplicates, and the resulting database was entitled "unique subjects with positive smear (OAR20)". The absence of duplicates was double checked by aggregating the name code by the variable "smearpos" (max) and the results were unchanged. There were 1,695 unique cases with positive smear (plus another 22 who were historic positive smear) in the OAR20 file.
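The laboratory flags described above can be summarised in a small Python sketch. It covers only the "smearpos" and "tbposlab" logic (smear positive with MOTT excluded, and MTB identified OR smear positive); the function and its boolean inputs are hypothetical simplifications of the recoded SPSS variables, not the author's actual syntax.

```python
def lab_flags(smear_positive, mtb_identified, mott_identified):
    """Derive two flags from recoded laboratory results.

    smearpos: smear positive and no MOTT identified.
    tbposlab: MTB identified OR smearpos (mirrors micobid=1 OR smearpos=1).
    Inputs are plain booleans; this is an illustrative assumption.
    """
    smearpos = int(smear_positive and not mott_identified)
    tbposlab = int(mtb_identified or smearpos == 1)
    return {"smearpos": smearpos, "tbposlab": tbposlab}


# Smear positive, culture result missing: still counted by tbposlab.
print(lab_flags(smear_positive=True, mtb_identified=False, mott_identified=False))
# Smear positive but culture grew MOTT only: excluded.
print(lab_flags(smear_positive=True, mtb_identified=False, mott_identified=True))
```

Under this rule a smear positive subject with a missing culture is retained, which is consistent with the 4 additional subjects picked up by "tbposlab" but not by "tbpolab".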
The MTB positive culture cases with their dates of diagnosis were identified from the databases "OAR smear or culture positive", "TBOAR21" and "TBOAR23". From the database "OAR20 smear or culture positive", MTB culture positive cases were identified (subjects coded as 1 in variable "micobid") and the rest deleted to create the database entitled "subjects with positive MTB culture (OAR20)". There were no cases with historic positive MTB cultures in this database. Then, the name code and the variable "micobid" were aggregated by the date the specimen was received (min) in order to delete duplicates, and the resulting database was entitled "unique subjects with positive MTB culture (OAR20)". The absence of duplicates was double checked by aggregating the name code by the variable "micobid" (max) and the results were unchanged. There were 3,016 unique cases with positive MTB culture and their corresponding dates. From the database "TBOAR21", the MTB culture positive cases were identified (subjects coded as 1 in variable "micobid") and the rest deleted to create the database entitled "subjects with positive MTB culture (OAR21)". There were 57 historic MTB cases in this database. Next, the name code and the variable "micobid" were aggregated by the date the specimen was received (min) in order to delete duplicates, and the resulting database was entitled "unique subjects with positive MTB culture (OAR21)". The absence of duplicates was double checked by aggregating the name code by the variable "micobid" (max) and the results were unchanged. There were 3,035 unique cases with positive MTB culture and their corresponding dates. From the database "TBOAR23", the MTB culture positive cases were identified following the same steps described before to create the databases "subjects with positive MTB culture (OAR23)" and "unique subjects with positive MTB culture (OAR23)".
There is no information regarding historic lab results in the original ("OAR23") database. There were 2,910 unique cases with positive MTB culture. The dates contained in this database correspond to the dates a clinical diagnosis was established (in the variable "activity change date"); no laboratory dates are available in this database. Because only 2 of the databases contain the laboratory dates (the dates the specimens were received) for the specimens reported as positive MTB cultures ("unique subjects with positive MTB culture (OAR20)" and "unique subjects with positive MTB culture (OAR21)"), they were transformed into S-Plus and appended first to form a single database entitled "subjects with positive MTB culture (OAR20 and OAR21)". Then, to delete duplicates, the database was transformed back into SPSS and the name code and variable "micobid" were aggregated by the date the specimen was received (min). The resulting database was entitled "unique subjects with positive MTB culture (OAR20 and OAR21)". This database contained 3,069 TB cases with MTB positive culture results (3,029 of them, or 98.7%, also had the lab result date). In order to identify all the MTB positive cases, this database was transformed into S-Plus to merge it with the database "unique subjects with positive MTB culture (OAR23)". The resulting database was entitled "subjects with positive MTB culture (OAR20, OAR21 and OAR23)". This database was transformed back to SPSS to check for duplicates and none were found. The merged database contained 47 more cases with culture positive for MTB (34 of them had a diagnosis date from the variable "activity change date"). Hence, the total number of TB cases with a positive culture for MTB in the OAR files was 3,116 (of whom 3,063, or 98.3%, had a lab or diagnosis date).
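The totals reported above follow directly from the counts given in the text, as the short Python check below shows. Only figures stated in this section are used; the variable names are illustrative.

```python
# Counts taken from the text above.
oar20_21_cases = 3069        # culture positive cases, OAR20 and OAR21 combined
oar20_21_with_date = 3029    # of these, cases with a lab result date
oar23_extra_cases = 47       # additional culture positive cases from OAR23
oar23_extra_with_date = 34   # of these, cases with a diagnosis date

total_cases = oar20_21_cases + oar23_extra_cases
total_with_date = oar20_21_with_date + oar23_extra_with_date

# Percentages rounded to one decimal place, as reported in the text.
pct_dated_first = round(100 * oar20_21_with_date / oar20_21_cases, 1)
pct_dated_total = round(100 * total_with_date / total_cases, 1)

print(total_cases, total_with_date, pct_dated_first, pct_dated_total)
```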
In order to have all the smear positive and MTB positive cases from the OAR files in a single database, the databases "unique subjects with positive smear (OAR20)" and "subjects with positive MTB culture (OAR20, OAR21 and OAR23)" were transformed into S-Plus and merged to create the database entitled "positive smears and cultures all OAR databases". This database was converted back into SPSS and checked for duplicates, but none were found. However, to facilitate the procedures, this database was also entitled "unique positive smears and cultures all OAR databases". There were 3,319 unique cases that were smear positive and/or culture positive. Because the database "unique positive smears and cultures all OAR databases" also contained the information regarding the diagnosis dates (based mostly on the dates when specimens were received, except for the 34 dates mentioned above stemming from OAR23), this database was later merged with the database containing all the TB cases from the other OAR files and the AR files, as explained below (the terms OAR files and DTB files are used for the same type of files).

Diagnosis of active TB using specific criteria (treated with at least three 1st line drugs):

The database "TB OAR23" was used to identify subjects with a diagnosis (or suspected diagnosis) of active TB. First, a frequency table of the variables "disease classification" and "activity of TB" was created. Then subjects were selected if recorded as: primary TB, progressive primary TB, TB respiratory, TB non-respiratory and TB other respiratory, as well as their corresponding suspects. If selected, they were recoded as 1 in a new variable named "tbclinic". In addition, TB was determined to be active if in the corresponding variable they appeared as active cases. Active cases were then selected and recoded as 1 in a new variable "activetb".
Next, those who were selected in both of the above variables were identified and recoded as 1 in a new variable named "tbactcli" and were thus considered to have active TB; the subjects identified in the variable "activetb" were the same as those identified in the variable "tbclinic" and thus in the variable "tbactcli". Next, subjects coded as 1 in this variable were selected (and the rest deleted) to create a new database named "active TB cases DTB", which contains the subjects diagnosed with active TB. Next, after sorting by cohort year (the year subjects were recorded in the DTB) in ascending order, duplicates were deleted by aggregating the name code by the cohort year (first), i.e. the first year they were recorded in the DTB. The deletion was performed in this way because the cohort year was available for all the subjects, while the date of active TB was missing in 17% of cases. The new database was entitled "unique active TB cases DTB". In order to later identify these subjects as active TB cases, a new variable named "activetb" was created and all subjects were coded as 1 (1=yes). There were 3,864 subjects in this database diagnosed with active TB between 1990 and 2001. From file OAR13, the variables of interest (name code, purpose of the treatment, drugs received, as well as dates when the treatment started and ended) were selected and kept in the new database named "TB OAR13". The database "TB OAR13" was used to identify subjects who received conventional TB treatment. First, a frequency table was created to identify the possible ways the information regarding the drug summary had been entered. All the different combinations of drugs that included at least 3 of any of the primary ones (INH, RM, PZA, EMB and SM) were identified and recoded as 1 in a new variable named "tbtreat". Treatments that included at least 3 drugs of interest but also included clofazimine and/or clarithromycin were excluded.
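The treatment rule described above (at least 3 of the 5 primary drugs, excluding any regimen that also contains clofazimine or clarithromycin) can be expressed as a minimal Python sketch. The drug codes for the primary drugs are the abbreviations used in the text; representing a regimen as a list of strings is an assumption, since the actual layout of the drug summary variable is not described here.

```python
FIRST_LINE = {"INH", "RM", "PZA", "EMB", "SM"}
# Drugs whose presence suggests treatment for MOTT rather than TB.
MOTT_DRUGS = {"CLOFAZIMINE", "CLARITHROMYCIN"}


def is_conventional_tb_treatment(drugs):
    """Return True if the regimen would be coded as 1 in "tbtreat".

    `drugs` is a list of drug names or abbreviations (hypothetical format).
    """
    regimen = {d.strip().upper() for d in drugs}
    if regimen & MOTT_DRUGS:
        return False  # excluded to avoid counting patients with MOTT
    return len(regimen & FIRST_LINE) >= 3


print(is_conventional_tb_treatment(["INH", "RM", "PZA"]))
print(is_conventional_tb_treatment(["INH", "RM"]))
print(is_conventional_tb_treatment(["INH", "RM", "EMB", "clarithromycin"]))
```

Using a set makes the check insensitive to the order in which the drugs were entered, which matters because the text notes that the same combination could be recorded in several different ways.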
Next, only subjects coded as 1 in "tbtreat" were selected and the rest deleted to create the new database entitled "TB treated subjects DTB", which contains only the subjects who had received conventional treatment for TB. To delete duplicates in this database, after sorting it in ascending order by the cohort year, the name code was aggregated by the cohort year (first), which was available for all (the date treatment started or ended was not used as the aggregating variable because it was missing for >8% of cases), and the new database was entitled "unique TB treated subjects DTB". There were 3,987 subjects in this database who had received conventional TB treatment between 1990 and 2001. In order to later identify these subjects as TB treated cases, a new variable named "tbtreatm" was created and all subjects were coded as 1. In order to determine which subjects were diagnosed with active TB and received TB treatment, the database "unique TB treated subjects DTB" was merged into the database "unique active TB cases DTB" (after transforming them into S-Plus). The resulting database was named "unique active cases with TB treatment DTB" and transformed back into SPSS. The subjects who had the diagnosis of active TB and had also received TB treatment were selected if coded as yes for both variables "activetb" and "tbtreatm", and after deleting the rest of the subjects the new database was entitled "final unique active cases with TB treatment DTB". There were 3,212 subjects who had the diagnosis of active TB and had received TB treatment. In order to determine all the TB cases from the OAR databases, the databases "unique positive smears and cultures all OAR databases" and "final unique active cases with TB treatment DTB" were transformed and merged in S-Plus. The resulting database containing all TB cases from the OAR databases was entitled "TB cases DTB".
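The merge-and-select step described above (merging the active TB cases with the treated subjects and keeping those coded as yes in both "activetb" and "tbtreatm") can be sketched in Python as follows. This assumes the merge kept every subject from either database, which the text does not state explicitly; the name codes are made up.

```python
def merge_flags(active_cases, treated_subjects):
    """Merge two collections of name codes into one table of 0/1 flags."""
    merged = {}
    for code in active_cases:
        merged.setdefault(code, {"activetb": 0, "tbtreatm": 0})["activetb"] = 1
    for code in treated_subjects:
        merged.setdefault(code, {"activetb": 0, "tbtreatm": 0})["tbtreatm"] = 1
    return merged


def active_and_treated(merged):
    """Select subjects flagged in both variables; the rest are dropped."""
    return sorted(
        code
        for code, flags in merged.items()
        if flags["activetb"] == 1 and flags["tbtreatm"] == 1
    )


# Hypothetical name codes for illustration only.
merged = merge_flags(active_cases=["A01", "B02", "C03"],
                     treated_subjects=["B02", "C03", "D04"])
selected = active_and_treated(merged)
print(selected)
```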
Duplicates were checked by aggregating the name code and the variable "tbcase" (created to identify all cases) by the cohort year (first), after sorting the database in ascending order by cohort year; the new database was entitled "unique TB cases DTB". There were 3,930 TB cases in this database.

Histopathological diagnosis of TB or TB diagnosed at death.

The information regarding these aspects is recorded in file OAR01, since when TB is diagnosed at death or at necropsy, it is recorded in the DTB. There are 3 variables that record the information of interest: "cause of death 1", "cause of death 2" and "review of TB death". These variables, as well as the name code and date of death, were kept in a database entitled "deaths and dates (all) OAR01". Next, a frequency table of the 3 variables above was constructed to determine all the diagnoses recorded. The most important variable is "review of TB death", since it identifies all subjects in whom TB was the principal cause of death, TB was a contributing factor to death or TB was not related to the cause of death. Subjects with any of these 3 recordings were identified and coded as 1 in a new variable named "tbdeath". If any type of TB diagnosis appeared in cause of death 1 or 2, the subjects were selected and re-coded as 1 in new variables corresponding to the ones mentioned, which were named "tbdeath1" and "tbdeath2". Then, subjects coded as 1 in any of the 3 variables were selected and coded as 1 in a variable named "tbreldth", while the rest were deleted, to create the database "TB deaths OAR01". Then, to delete duplicates, the name codes and the variable "tbreldth" were aggregated by the date of death (min). The new database, which was named "unique TB deaths OAR01 with cohort yr", contains all the subjects in whom TB was related to their death and/or diagnosed at necropsy, as well as the date of death.
There were 401 subjects who had died with TB or because of TB between 1990 and 2001. Next, in order to obtain all the TB cases diagnosed in the OAR files, the databases "unique TB cases DTB" and "unique TB deaths OAR01 with cohort yr" were transformed into S-Plus and merged. The resulting database was named "TB cases all DTB" and was converted back into SPSS for duplicate checking and deletion. To do this, the name code was aggregated by the variable "tbcase" (max), and no duplicates were found (in database "unique TB cases all DTB"). In total there were 3,975 TB cases in the OAR files between 1990 and 2001. As explained below, the dates of diagnosis were included in this database to create the database "unique TB cases all DTB with dates". A duplicate check was done by aggregating the name code by the variable "tbcase" and none were found.

Active registry.

This database contains information regarding the diagnosis of TB (type of TB), date of diagnosis, treatment received and purpose of treatment during 1990 through 2001. There is also information regarding the date and reasons the treatment ended, including death, leaving the province, lost to follow-up and non-TB diagnosis. There is no information (in the AR files provided) regarding the date of death, cause of death or smear or culture results, since this information is in any case extracted from the OAR files and transferred to the AR. Hence, the TB cases in the AR were identified based only on the diagnosis of TB and the treatment received. The identification of patients with tuberculosis in the AR was performed as follows:

Diagnosis of TB using specific criteria (treated with at least three 1st line drugs):

The diagnosis of TB was available in 4 variables in the AR file: "diagnosis pulmonary", "diagnosis other pulmonary", "diagnosis non-pulmonary 1" and "diagnosis non-pulmonary 2".
After running a frequency table for these variables, all the subjects with a diagnosis of any type of TB were coded as 1 and the rest coded as 0 (except for the missing values), for each variable, in the corresponding new variables named "tbpulmon", "tbnonpul", "tbnopuH" and "dxothrpu". To identify all the subjects who were diagnosed with TB in a single variable, subjects who were coded as 1 in any of the 4 variables above were identified and coded as 1 in a new variable named "tbdiagn" in a database entitled "subjects with TB in active registry". Subjects who received conventional treatment for TB were identified from the variable "drug therapy summary". As was done for the OAR files, a frequency table was created to identify the possible ways the information regarding the drug summary had been entered. All the different combinations of drugs that included at least 3 of any of the primary ones (INH, RM, PZA, EMB and SM) were identified and recoded as 1 in a new variable named "tbtreat". Treatments that included at least 3 drugs of interest but also included clofazimine and/or clarithromycin were excluded. Those in whom the reason for treatment (under the variable "chemoprophylaxis reason") was (or included) MOTT were identified and coded as 1 in the variable "mott". All the subjects who had been coded as 1 in the variable "tbtreat" but were also coded as 1 in the variable "mott" were recoded as 0 in the variable "tbtreat". Next, those who were diagnosed with active TB (coded as 1 in variable "tbdiagn") and had also received conventional treatment for TB (coded as 1 in "tbtreat") were identified and coded as 1 in a new variable named "tbcase". Among these subjects, those in whom the reason to end treatment (in the variable "reason to end treatment") was specified as "non-TB diagnosis" were identified and recoded as 0 (only 1 subject).
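The sequence of recodes described above for the active registry can be condensed into a single function, shown here as a hedged Python sketch. The argument names reuse the variable names from the text ("tbdiagn", "tbtreat", "mott"), but the function itself and the string used for the reason to end treatment are illustrative assumptions rather than the original SPSS syntax.

```python
def ar_tbcase(tbdiagn, tbtreat, mott, reason_to_end=None):
    """Return 1 if a subject counts as a TB case under the specific criteria.

    Steps, in the order described in the text:
      1. A MOTT treatment reason resets tbtreat to 0.
      2. tbcase = 1 only if diagnosed with TB AND conventionally treated.
      3. A "non-TB diagnosis" reason to end treatment resets tbcase to 0.
    """
    if mott == 1:
        tbtreat = 0
    tbcase = int(tbdiagn == 1 and tbtreat == 1)
    if tbcase == 1 and reason_to_end == "non-TB diagnosis":
        tbcase = 0
    return tbcase


print(ar_tbcase(tbdiagn=1, tbtreat=1, mott=0))
print(ar_tbcase(tbdiagn=1, tbtreat=1, mott=1))
print(ar_tbcase(tbdiagn=1, tbtreat=1, mott=0, reason_to_end="non-TB diagnosis"))
```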
Hence, subjects coded as 1 in the variable "tbcase" were considered to have TB under the clinical criteria mentioned above. These subjects were selected and the rest deleted to create the database "TB cases active registry". Duplicates were deleted by aggregating the name code by the date of diagnosis (first), after sorting the database in ascending order by the variable "dateofdx". The resulting database was entitled "unique TB cases active registry". In total there were 3,104 cases in the active registry between 1990 and 2001. Next, all the TB cases from the OAR databases ("unique TB cases all DTB with dates") and from the active registry database ("unique TB cases active registry") were transformed into S-Plus and merged into a single database entitled "TB cases all OAR and AR databases". All TB cases were identified as 1 in the variable named "tbcase". After rearranging and preserving the dates of interest, the new database was entitled "TB cases all OAR and AR databases2" (previously "TB cases all databases"). The number of TB cases identified in all the DTB databases (OAR and AR) was 3,998 (the active registry database added 23 cases). However, this database had some MOTT cases that were identified and eliminated, as explained below. Next, in order to identify the smear positive and MTB culture positive cases (as well as their dates), the database "TB cases all OAR and AR databases2" (previously "TB cases all databases") was merged with the database "unique positive smears and cultures all OAR databases". This also allowed establishing the dates the cases were diagnosed according to the laboratory results. The resulting database was entitled "TB cases all OAR and AR databases (with positive smear and cultures)".
Next, in order to ensure that all the MOTT cases were identified in the database, the MOTT cases from all the OAR and AR databases were identified as follows: in the databases "TB OAR20", "TB OAR21" and "TB OAR23", where mycobacteria ID was available, a variable named "mottrev" was created to identify all the MOTT in each database. After deleting the non-MOTT cases, the individual databases ("MOTT OAR20", "MOTT OAR21" and "MOTT OAR23") were transformed into S-Plus and appended into a single database entitled "MOTT OAR20, OAR21 and OAR23", which was transformed back for duplicate deletion by aggregating the name code by the date of diagnosis (the date the specimen was received for OAR20 and OAR21 and the "activity change date" for OAR23). The resulting database was entitled "unique MOTT OAR20, OAR21 and OAR23". MOTT information was also obtained from the database TBOAR13, where it is specified under the reason for treatment in the variable "prophylaxis reason". A database entitled "unique MOTT OAR13" was created after deleting the non-MOTT cases and deleting the MOTT duplicates. This database was then transformed into S-Plus and merged with the database "unique MOTT OAR20, OAR21 and OAR23" (which added 37 MOTT cases). The resulting database was entitled "unique MOTT all OAR databases". This database was transformed back into SPSS and checked for duplicates and none were found. The MOTT information was then obtained from the active registry (from the variable "prophylaxis reason") into the database entitled "MOTT active registry". Duplicates were deleted by aggregating the name code by the variable "rxstartdate" (which was the only date available for most) and the resulting database was entitled "unique MOTT active registry". This database was merged with the database "unique MOTT all OAR databases" (which added 2 MOTT cases), resulting in the database entitled "unique MOTT all databases (OARs and AR)".
There were 1,249 unique MOTT cases in all the databases. Then, in order to identify the MOTT cases, the database "TB cases all OAR and AR databases with positive smear and cultures" was merged in S-Plus with the database "unique MOTT all databases (OARs and AR)". The resulting database was entitled "TB cases all OAR and AR databases with positive smears cultures and MOTT". Then, the isolated MOTT cases (those that did not have a positive MTB culture) were identified and deleted from this database. There were 130 isolated MOTT cases. The resulting database was entitled "unique TB cases OAR and AR databases". The total number of TB cases in all the OAR and AR databases was 3,868; among these, there were 183 cases with both TB and MOTT.

Date of TB diagnosis (date of first diagnosis). The date of first TB diagnosis was considered to be the "date of diagnosis" available in the active registry (in the variable "dateofdx"), where it was obtained for 3,104 of the TB cases. For the remaining cases, the date of diagnosis was considered to be one of the following: a) the date treatment was started (from a database created for this purpose entitled "date treatment started from TB treated subjects DTB", which contains the dates available in file OCR13); b) the date the diagnosis of active TB was made (from a database created for this purpose entitled "date active TB was diagnosed from active cases DTB", which contains the dates available in file OAR23); c) the date the TB diagnosis was made at death or necropsy (from a database created for this purpose entitled "unique TB deaths dates OAR01", from the file OAR01). The first date of diagnosis made it possible to identify all the TB cases who had TB before being recorded as contacts of a case (for cases between 1990 and 2001). The new database with dates was entitled "unique TB cases all DTB with dates2".
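The choice of the date of first diagnosis described above can be sketched as follows. The text lists three alternative sources for cases without an active registry date but does not state a priority among them, so taking the earliest available one is an assumption made here for illustration; the function and parameter names are hypothetical.

```python
from datetime import date
from typing import Optional

def first_diagnosis_date(registry_date: Optional[date],
                         treatment_start: Optional[date],
                         active_dx: Optional[date],
                         death_dx: Optional[date]) -> Optional[date]:
    """Return the active registry date if present, else a fallback date."""
    if registry_date is not None:
        return registry_date
    # Assumption: among the fallback sources, the earliest date is used.
    fallbacks = [d for d in (treatment_start, active_dx, death_dx)
                 if d is not None]
    return min(fallbacks) if fallbacks else None

# Hypothetical dates, for illustration only.
with_registry = first_diagnosis_date(date(1996, 5, 1), date(1996, 4, 1), None, None)
without_registry = first_diagnosis_date(None, date(1996, 4, 1), date(1996, 3, 15), None)
no_dates = first_diagnosis_date(None, None, None, None)
```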
In addition, the cases identified in the active registry (in the database "unique TB cases active registry") had their dates of diagnosis already identified in the variable "dateofdx". Hence, once the 2 databases were merged to produce the database "TB cases all databases", the corresponding dates were already included. The dates for the positive smear cases and the positive MTB culture cases were then added, as explained previously, when the database "TB cases all databases" was merged with the database "unique positive smears and cultures all OAR databases", resulting in the database "TB cases all databases (with positive smear and cultures)".

Diagnosis of TB from OCR files. The identification of TB cases for the OCR files was done exactly as for the OAR files; therefore, only a summary of the procedures is presented. The only difference is that for the OCR files the dates of diagnosis were included from the beginning, so there was no need for the steps used in the OAR files to include the dates of diagnosis for each of the files.

Smear positive and/or culture positive for MTB: Contacts with a positive smear and/or positive culture for MTB, as well as the corresponding dates of diagnosis, were identified as follows. From the database "OCR20 smear or culture positive", those with a positive smear (MOTT excluded) were identified and saved in a database entitled "unique contacts with positive smear (OCR20)", which had the historic cases already eliminated. There were 102 positive smear cases in this database. Cases with positive culture for MTB were identified in the database "unique contacts with positive MTB culture (OCR20)". There were 190 cases with positive culture for MTB. There were no historic MTB cases in the database OAR20. From the database "TB OCR21", contacts with positive culture for MTB were identified and saved in a database entitled "contacts with positive MTB culture (OCR21)".
After excluding historic cases and deleting duplicates, the database entitled "unique contacts with positive MTB culture (OCR21)" was created. This database had 206 cases with positive culture for MTB. From the database "TB OCR23", the MTB cases were identified and saved in the database "contacts with positive MTB culture (OCR23)". There are no historic cases in the OCR23 files. After duplicates were deleted, there were 200 contacts with positive culture for MTB in the database "unique contacts with positive MTB culture (OCR23)". In all these cases, the deletion of duplicates was done using the minimum (earliest) date the specimen was received in the laboratory (for the OCR23 database, the date of diagnosis was used, which is specified in the variable "activity change date").

The 2 databases with laboratory dates were appended to create the database "contacts with positive MTB culture (OCR20 and OCR21)". The resulting database, after deleting duplicates by the first/minimum date the specimen was received, was entitled "unique contacts with positive MTB culture (OCR20 and OCR21)". There were 208 contacts with positive MTB cultures and their dates. This database was merged with the database "unique contacts with positive MTB culture (OCR23)", resulting in the database "unique contacts with positive MTB culture (OCR20, OCR21 and OCR23)". There were 235 contacts with positive culture for MTB in all the OCR databases.

Next, all contacts with a positive smear (excluding MOTT) or a positive MTB culture were gathered in a single database by merging in S-Plus the databases "unique contacts with positive MTB culture (OCR20, OCR21 and OCR23)" and "unique contacts with positive smear (OCR20)". The resulting database was entitled "unique positive smears and cultures all OCR databases". This database was checked for duplicates and none were found. Positive smear or MTB culture cases were identified in the variable "smculpos".
There were 248 contacts who had a positive smear (MOTT excluded) or a positive culture for MTB. Next, contacts with MOTT were identified following the same procedures described for the OAR files. The database "unique MOTT OCR20, OCR21 and OCR23", with 73 MOTT cases, was merged with the database "unique MOTT OCR13" to create the database "unique MOTT all OCR databases" (with no duplicates). There were 74 contacts with MOTT in all the OCR databases.

Diagnosis of active TB using specific criteria (treated with at least three 1st line drugs): Following the same procedures described for the OAR files, contacts who had a clinical diagnosis of active TB were identified from the database "TB OCR23" and saved in the database "active TB cases OCR23"; the variable identifying them was named "tbactcli". Then, duplicates were deleted by aggregating the name code and the variable "tbactcli" by the date of diagnosis (min), which is available in the variable "activity change date". The resulting database was entitled "unique active TB cases OCR23" and contains the unique active cases according to the earliest date of diagnosis. There were 542 contacts in this database diagnosed with active TB between 1990 and 2001. There were 9 cases with no date of diagnosis. Since no treatment date was found for them in other OCR or OAR files, the middle of the corresponding cohort year was assigned (i.e. 15/06 of the corresponding cohort year).

The contacts who received treatment for TB with at least 3 of the primary conventional drugs for TB treatment were identified from the file "TB OCR13" in the same way described for the file "TB OAR13". Subjects who had received TB treatment (as described) were identified in the variable "tbtreat" and saved in a database entitled "TB treated subjects OCR13". Duplicates were deleted by aggregating the name code and the variable "tbtreat" by the date treatment started (min), which was available for all subjects.
The resulting database was entitled "unique TB treated subjects OCR13". There were 305 unique contacts who had received treatment with at least 3 conventional TB treatment drugs. Next, in order to determine which subjects were diagnosed with active TB and received TB treatment (with at least three 1st line drugs), the database "unique TB treated subjects OCR13" was merged into the database "unique active TB cases OCR23", and the final database containing the unique active TB cases who received treatment was entitled "final unique active cases with TB treatment OCR files". There were 278 active cases that received treatment with at least three 1st line drugs. In addition, it was identified (in the database entitled "unique active cases number of drugs received (OCR13)") that there were 170 "active cases" who had not received any treatment; almost all of these cases had had TB diagnosed in another country (or province) in the past, i.e. before attending the TB clinic (and therefore before 1990). There were also 6 active cases who received one drug and 88 who received 2 drugs.

Histopathological diagnosis of TB or TB diagnosed at death. Those who died with TB or because of TB were identified from file OCR01. There are 3 variables that record the information of interest: "cause of death 1", "cause of death 2" and "review of TB death". A frequency table of the 3 variables was constructed to determine all the diagnoses recorded. The same procedures described previously for the OAR01 file were used. Contacts who died with TB or because of TB were identified from the 3 variables above and coded as 1 in a new variable named "tbdeath". The database containing only these subjects was entitled "TB deaths OCR01". Duplicates were deleted by aggregating the name code by the date of death, and the resulting database was entitled "unique TB deaths OCR01". There were 20 contacts reported in file OCR01 who died with TB or because of TB.
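The specific clinical criterion described above (a diagnosis of active TB plus treatment with at least three 1st line drugs) and the reported breakdown by number of drugs can be sketched as follows. The counts are the ones given in the text; the function and dictionary names are hypothetical and serve only as an illustration.

```python
def meets_specific_clinical_criteria(active_tb: bool, first_line_drugs: int) -> bool:
    """Active TB counts as a specific-criteria case only with >= 3 drugs."""
    return active_tb and first_line_drugs >= 3

# Counts of active cases by number of drugs received, as reported in the text.
active_cases_by_drugs = {
    "no drugs": 170,
    "one drug": 6,
    "two drugs": 88,
    "three or more drugs": 278,
}
total_active_cases = sum(active_cases_by_drugs.values())
```

The four groups add up to the 542 unique active cases identified in file OCR23, which serves as a consistency check on the reported figures.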
In order to determine all the TB cases from the OCR databases, the 3 databases identifying TB cases ("unique positive smears and cultures all OCR databases", "final unique active cases with TB treatment OCR files" and "unique TB deaths OCR01") were merged; as mentioned previously, these databases already had the corresponding dates of diagnosis. The database resulting from merging the positive smear and culture cases with all the active TB cases ("unique positive smears and cultures all OCR databases" and "final unique active cases with TB treatment OCR files") was entitled "TB cases OCR files (lab and clinical)" and had 341 cases. The database resulting from merging this database with the cases who died with TB ("unique TB deaths OCR01") was entitled "TB cases all OCR databases". This database, with all the unique TB cases from all the OCR files, had 348 cases; however, it may still have included MOTT cases.

In order to identify the MOTT cases, the database "unique MOTT all OCR databases" was merged into the database "TB cases all OCR databases". The resulting database was entitled "TB cases all OCR databases with MOTT identified", and the variable identifying the MOTT cases was named "mott". There were 11 cases with MOTT, 5 of which were isolated MOTT cases and 6 of which had both MOTT and TB. After deleting the isolated MOTT cases (the ones that did not also have TB), the database was entitled "unique TB cases all OCR databases". This database contains the unique TB cases in all the OCR databases as well as their diagnosis dates. There were 343 unique TB cases in all the OCR databases.

TB classification. Active cases were classified in 3 categories according to the site affected by TB: pulmonary, pleural and extra-pulmonary.
After identifying the active cases from the database "TB OCR23", the classification was performed based on the information available in the variables "disease classification" and "site TB affected", available in the same database. Active TB subjects recorded as TB respiratory, primary TB or progressive primary TB (in the "disease classification" variable) were classified as pulmonary TB (coded as 1); those recorded as TB other respiratory (which corresponds to pleurisy) were classified as pleural TB (coded as 2); finally, subjects recorded as TB non-respiratory (which corresponds to extra-pulmonary TB) were classified as extra-pulmonary TB. The classification was saved in a variable named "tbclass". Since there were some subjects recorded as TB respiratory (in the "disease classification" variable) but specified as pleurisy (in the "site TB affected" variable), and others with similar errors, these misclassified subjects were corrected by checking them individually (16 of the 707 subjects were re-classified). The corrected classification was saved in a variable named "tbclasc". All the information was saved in a database entitled "active TB classification (OCR23)".

Next, in order to delete duplicates while conserving the dates and identifying every case with pulmonary TB, the name code and the dates of diagnosis (activity change date) were aggregated by the variable "tbclasc" (min) in SPSS. The resulting database was entitled "active TB classification OCR23 aggregated". This database still contained some duplicates (because of different dates). These were checked individually and, since all duplicates occurred before 1990, the deletion of duplicates was performed by aggregating the name code by the variable "tbclasc" (min). The resulting database, containing the TB classification for the unique active TB cases in the TB OCR23 file, was entitled "unique active TB classification (OCR23)" and had 542 cases.
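The classification and the aggregation by minimum class code described above can be sketched as follows. Codes 1 and 2 are the ones given in the text; code 3 for extra-pulmonary TB is an assumption, as are the function names and the example name codes. Because pulmonary TB has the lowest code, taking the minimum per name code keeps the pulmonary classification whenever a subject has more than one record.

```python
PULMONARY, PLEURAL = 1, 2
EXTRA_PULMONARY = 3  # assumed code; the text does not state it

CLASS_BY_DISEASE = {
    "TB respiratory": PULMONARY,
    "primary TB": PULMONARY,
    "progressive primary TB": PULMONARY,
    "TB other respiratory": PLEURAL,        # corresponds to pleurisy
    "TB non-respiratory": EXTRA_PULMONARY,
}

def aggregate_min_class(rows):
    """Return one class per name code: the minimum code across its records."""
    result = {}
    for code, disease in rows:
        cls = CLASS_BY_DISEASE.get(disease)
        if cls is None:
            continue  # unclassified record, left for individual review
        result[code] = min(cls, result.get(code, cls))
    return result

# Hypothetical name codes, for illustration only.
rows = [
    ("AAAA0001", "TB non-respiratory"),
    ("AAAA0001", "TB respiratory"),
    ("BBBB0002", "TB other respiratory"),
]
tb_class = aggregate_min_class(rows)
```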
In order to add the TB classification to the unique cases (diagnosed using specific criteria) from all the OCR databases, the database "unique active TB classification (OCR23)" was merged into the database "unique TB cases all OCR databases". The resulting database was entitled "unique TB cases all OCR databases with TB classification". Among the 343 TB cases in the OCR files, there were 35 with no TB classification, for whom the classification was searched individually in other databases where this information could be obtained (OCR01, OCR20, OAR20, OAR23, and active registry). After searching in these databases, there were still 16 cases for whom no TB classification was found. In two of these contacts the contact date recorded was after 2001 (even though the databases cover only up to 2001). Thus, among the 343 cases found in the OCR files (using specific criteria for the diagnosis of clinical TB), there were 278 with TB classification. Most of the TB cases without classification were TB cases diagnosed at death.

In addition, in order to properly identify all the information of interest (TB case, dates of diagnosis and TB classification) in a single database, the database "unique active TB cases OCR23" was merged into the database "unique TB cases all OCR databases with TB classification". The resulting database was entitled "unique TB cases all OCR databases with TB classification reviewed". The variables identifying the active TB cases and the date of diagnosis were named "tbactr" and "activcdr" respectively.

TB cases using specific criteria. The database "unique TB cases all OCR databases with TB classification reviewed" contained all the TB cases in the OCR files: cases diagnosed by smear and/or culture results, cases diagnosed clinically (active TB treated with 3 drugs), and those diagnosed histopathologically (at death). It also contained the corresponding dates of diagnosis and the TB classification information.
The database "unique TB cases all OCR databases with TB classification reviewed" was merged into the "final database 17f" to produce the "final database 18". There were 343 cases in the OCR files, including previous TB cases (TB diagnosed before the contact date). Next, in order to include the TB cases diagnosed from the OAR and active registry databases, the database "unique TB cases OAR and AR databases" was merged into the "final database 18" to produce the "final database 19". There were 346 TB cases in this database. The "unique TB cases OAR and AR databases" added only 3 cases, all diagnosed clinically (smear and culture negative). Two of them were new cases (BLUN4601 and FRAE6504), while the other case had already been categorized as a previous TB case. A new variable containing all the TB cases previously identified from the OCR databases and the 3 cases from the OAR and active registry databases was created and named "tbcasef". The dates of diagnosis from the corresponding databases were also included in a single variable named "datedxf".

The positive smears and cultures previously added from "final database 18" (obtained from the OCR databases) were rechecked. There were 2 smear positive cases in the OAR and AR databases (NIPL36010 and NGUD8302) not previously recorded as positive in the OCR databases (although both were MTB culture positive in the OCR databases); neither occurred before the contact date or the cohort year. All the cases with positive culture for MTB in the OAR and AR databases were also MTB positive in the OCR databases (in "final database 18"). Thus these additional variables were erased from "final database 19" to create the "final database 20".

TB cases using sensitive criteria. In order to include all the possible TB cases, all contacts with active tuberculosis (excluding MOTT), regardless of the number of drugs received, were added to the cases already identified.
To achieve this, the active TB cases previously identified in the OCR, OAR and AR files were selected as explained below. From the database containing all the active cases in the OCR files, entitled "active TB cases OCR23", duplicates were deleted by aggregating the name code and the variable "activetb" by the date of diagnosis (in the variable "activity change date"), but this time using the maximum date. The new database (with the same 542 unique active cases among the contacts previously identified), entitled "unique active TB cases OCR23 (maximum date)", thus contained all the active cases previously found in the OCR files, but now with their corresponding latest dates of diagnosis. The variable containing the active TB cases was named "activtbm". This variable, together with the corresponding date variable, was added by merging the database "unique active TB cases OCR23 (maximum date)" with the "final database 20". The resulting database was entitled "final database 21".

Cases with isolated MOTT were identified using the variable "mottocr" (included in "final database 20" from the database "unique MOTT all OCR databases") and recoded as non-TB cases; then, a new variable containing the active TB cases from OCR23, with isolated MOTT excluded, was created and named "activtbr". The active TB cases were then added to the cases previously diagnosed according to the specific criteria mentioned before (smear or culture positive, active TB and treatment with at least three 1st line drugs, or histopathological TB diagnosis), which had already been identified in the variable "tbcasef" in the "final database 20". This was achieved by first recoding the TB cases from the variable "tbcasef" into a new variable named "tbcasesn", and then recoding the active TB cases from the variable "activtbr" into the variable "tbcasesn". There were 577 TB cases when sensitive criteria were used, i.e.
when all the active cases from the OCR databases were included as TB cases. These were in the variable "tbcasesn", in "final database 21". These cases include previous TB (TB diagnosed before the contact date). Next, in order to also include the active cases diagnosed from the OAR and AR databases, the TB cases identified in the variable "tbcasesn" and those in the variable "acttboar" were compared. The variable "acttboar", which contains all active cases (from the OAR databases) and cases diagnosed clinically (from the active registry), had been created previously (in "final database 17e") during the process of identifying all the previous TB cases. There was only 1 active case (not MOTT) in the variable "acttboar" that was not included in the variable "tbcasesn". This case was added to the ones in the variable "tbcasesn"; however, it had already been identified as having previous TB. Therefore, there were 578 TB cases identified using sensitive criteria (among the contacts) in all the databases.

Then, in order to have the TB classification for all the TB cases, the database "unique active TB classification (OCR23)" was merged into the "final database 21". The resulting database was entitled "final database 22". The variable containing the TB classification was named "tbclassi" and provided the TB classification for all but 20 cases. There were 5 more cases for which the TB classification could be established after a manual search in all the OCR, OAR and AR databases that contain that information.

As an additional feature, the number of drugs received by all the cases identified with the sensitive criteria was investigated. The information regarding the number of drugs for the 542 active cases identified in the OCR23 file was obtained from the drug summary available in file OCR13.
The unique subjects with the number of drugs received were saved in the database entitled "unique active cases number of drugs received (OCR13)" and identified in the variable "numdrugs" (by aggregating name code and active TB by the maximum number of drugs received). The database "unique active cases number of drugs received (OCR13)" was merged into the "final database 22", and the resulting database was entitled "final database 22a". The variable containing the number of drugs a case received was also named "numdrugs". There were 36 cases for which the number of drugs received was not available in the database "unique active cases number of drugs received (OCR13)"; these were searched individually in other files where the information is available (OAR13 and AR files). Cases with no treatment specified were considered to have received 0 drugs. Contacts in whom the treatment was not specified and was just recorded as "other" (in the original files OCR13 and OAR13) were searched in files OCR14 and OAR14 to determine the number of drugs they had actually received. There were 5 cases who had been recorded as "other" in the drug summary variable: two subjects had received 5 drugs (KIMH5801 and BOWR6601), one subject had received 4 drugs, one had received 3 drugs and one had received 2 drugs.

Dates to calculate the time to event. For the TB cases, the dates of diagnosis were obtained from several sources according to the way the diagnosis was made: a) for the cases diagnosed through a positive smear and/or positive MTB culture, the date of diagnosis was the date the specimen was received by the laboratory or, if not available, the date of diagnosis from file OCR23 (available in the variable "activity change date"). The dates of a positive smear and/or culture were identified in the variable "datelab".
b) For cases diagnosed clinically (with active TB), the date of diagnosis was the date the diagnosis was made, available in the variable "dateactb" for the OCR databases, and in the variable "activcdt" for the OAR and AR databases (only 1 case came from the last 2 databases, and this case had had previous TB). c) For the cases diagnosed at death, the date of diagnosis was the date of death, available in the variable "dtedeath". These dates were all saved in a new variable named "datevent" in the database "final database 22a". The dates were available for all the cases identified with the sensitive criteria (which include all the ones identified with the specific criteria).

For the contacts who did not develop TB, the censoring dates were determined from one of the following, in this order: the date the contact left the province, the date the contact died, or the latest date the contact was enrolled in the Medical Services Plan of British Columbia (MSP). The dates contacts left the province were obtained from the OCR, OAR and AR files, since the variable "Reason treatment ended" identifies those subjects who did not complete their treatment because they left the province, and the date is obtained from the associated variable "date treatment ended", available in the same files. Thus, the created databases "unique subjects who left province (OCR13)", "unique subjects who left province (OAR13)" and "unique subjects who left province (active registry)", containing the name codes and the dates subjects ended treatment because they left the province, were appended in S-Plus into a single database entitled "subjects who left the province all databases". This database was converted back into SPSS, and duplicates were deleted by aggregating the name code and the variable identifying the subjects who left the province ("leftprov") by the date the treatment ended.
The database containing the unique subjects who left the province and the dates their treatment ended was entitled "unique subjects who left the province all databases". There were 542 unique subjects (contacts and TB cases) whose treatment ended because they left the province. Then, the database "unique subjects who left the province all databases" was merged into the "final database 22a" to create the "final database 22b". The dates the treatment ended for those who left the province were then added to the previous ones in the variable "datevent" (for the contacts without TB).

The date contacts died was determined from the date of death obtained for all of the contacts from the Vital Statistics database requested from CHSPR (in the BCLHD), and available in the database "death dates". This database provided 1,004 death dates. These dates were added to the missing dates for the contacts without TB by merging the database "death dates" into the "final database 22b". The resulting database was entitled "final database 22c". The dates subjects died were then added to the previously found dates in the variable "datevent" (for the contacts without TB).

The latest date contacts were registered in the MSP was also obtained from the BCLHD, in the database that contained the enrollment and cancellation dates for all the subjects in the province from 1990 until 2002 (named "rpbfinal"). Because in this database there are contacts who were still enrolled in the MSP as of 2003, it was decided (in agreement with Dr. S. Marion) to impute the date of 12/Dec/2001 to all the contacts enrolled in MSP after this date. This was done to avoid biased estimates (of TB incidence, etc.) that could have resulted from including censoring dates beyond the date up to which information is available from the rest of the databases (which is December 31st, 2001).
The new database was entitled "MSP enrollment and cancellation dates (cancellation dates imputed)". Next, duplicates were deleted by aggregating the study ID by the cancellation date (maximum/latest). The new database, containing the cancellation dates (and study ID) of 34,168 of the contacts, was entitled "unique cancellation dates MSP". This database was merged into the "final database 22c" to create the "final database 23". All the dates of diagnosis for the TB cases and the censoring dates (for the contacts without TB) were identified in the variable date of event ("datevent" in the database). These dates were used in the calculation of the time to event for the survival analysis. The details of the time to event calculations, and the quality control applied to ensure that the correct latest date was used in the calculations, are explained below.

At this point a quality control check on dates was performed. Contacts who had TB diagnosed after 2001 were recoded as not having TB up to 2001 in the variables identifying the TB cases ("tbcasesn" for cases using sensitive criteria and "tbcasef" for cases using specific criteria). The variables with the TB cases corrected according to the date of diagnosis were named "tbcasesc" and "tbcasefc" respectively in the "final database 23". There were 21 cases (among cases diagnosed using either sensitive or specific criteria) that occurred after 2001 and were therefore re-categorized as non-TB cases. Hence, there were 557 TB cases diagnosed among the contacts up to 2001; however, some of these contacts had had TB in the past (categorized as previous TB). For the contacts without TB in whom the censoring date was later than 2001 (in the variable "date of event"), the date was corrected to December 31st, 2001. The dates of events with the post-2001 dates corrected were saved in a new variable named "date of event corrected" ("dteventc") in the "final database 23".
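The selection of the censoring date for contacts without TB (left the province, then death, then latest MSP date, in that order) and the correction of post-2001 dates to December 31st, 2001 can be sketched as follows. This is a simplified illustration only; the function and parameter names are hypothetical, and the later refinements using treatment dates and dates of service are not included.

```python
from datetime import date
from typing import Optional

STUDY_END = date(2001, 12, 31)  # last date covered by the databases

def censoring_date(left_province: Optional[date],
                   died: Optional[date],
                   last_msp: Optional[date]) -> Optional[date]:
    """Pick the first available source in the stated order, capped at study end."""
    for candidate in (left_province, died, last_msp):
        if candidate is not None:
            return min(candidate, STUDY_END)
    return None

# Hypothetical dates, for illustration only.
left_first = censoring_date(date(1998, 1, 10), date(1999, 6, 1), None)
capped = censoring_date(None, None, date(2003, 5, 1))
missing = censoring_date(None, None, None)
```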
After the date quality control check, the time to event was calculated by subtracting the date of contact ("contdtec", which in the "final database 23" included the reviewed date of contact for the 4 cases mentioned previously in the "final database 17c") from the date of event ("dteventc"). The time to event in years was calculated as follows in SPSS: (dteventc - contdtec) / (60 * 60 * 24 * 365.25). The time to event in years was saved in the variable named "timepr". In addition, the time to event in months was calculated as follows: (dteventc - contdtec) / (60 * 60 * 24 * 30.4375). The time to event in months was saved in the variable "timemonp".

As another quality control check, all the contacts with a TB diagnosis (using sensitive criteria, which includes all the TB diagnoses using specific criteria) and a negative difference in the variable "time" were looked up in the previous TB variable ("prevtbf"), since all of them had to be identified in this variable. All of them had already been identified as having had previous TB. Among the contacts without a TB diagnosis, there were 810 in whom the cancellation date pre-dated the contact date. In order to have the maximum follow-up for these contacts, the date treatment ended (available in the file OCR13) was imputed as the date of event corrected. In order to do this, the date the treatment ended (or, if not available, the date treatment started) from file OCR13 was extracted to a database entitled "date treatment ended (OCR13)" in a variable named "rxendsta". Duplicates were deleted by aggregating the name code by the variable "rxendsta" (maximum/latest). The database with the unique dates treatment ended (named "unique date treatment ended (OCR13)") was merged into the "final database 23" to create the "final database 23a".
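The time to event calculation given above divides a difference in seconds by the number of seconds in an average year (365.25 days) or an average month (30.4375 days, i.e. 365.25 / 12). A minimal Python sketch of the same arithmetic follows; the function name and the example dates are hypothetical.

```python
from datetime import datetime

SECONDS_PER_DAY = 60 * 60 * 24

def time_to_event(contact: datetime, event: datetime):
    """Return (years, months) from contact date to event date."""
    seconds = (event - contact).total_seconds()
    years = seconds / (SECONDS_PER_DAY * 365.25)
    months = seconds / (SECONDS_PER_DAY * 30.4375)
    return years, months

# Hypothetical dates, for illustration only: 731 days apart.
years, months = time_to_event(datetime(1995, 1, 1), datetime(1997, 1, 1))

# An event date before the contact date gives a negative time,
# which is what the quality control checks in the text look for.
neg_years, _ = time_to_event(datetime(1993, 1, 1), datetime(1987, 1, 1))
```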
There were 2 contacts with a negative time to event who had been diagnosed with TB at death in 1950 and 1987 (SMIM2001 and FISL4801); however, their contact date appears as 1993 (for both). There was also one contact with a negative age. After rechecking these subjects (with Ms. Fay Hutton, data manager at the DTB), it was found that both were erroneously recorded as contacts (and the dates did not correspond to them) and that they should not have been in the contacts database; thus they were deleted. The "final database 23a" therefore had 42,593 unique contacts recorded in the DTB between 1990 and 2000.

Next, to obtain the latest available censoring dates, only the contacts with no TB diagnosis (in the variable "conterdt") who had received some treatment and in whom the treatment dates were later than the date of event corrected (in the variable "rxenusef") were identified. Then, the pertinent dates of event corrected that predated the contact dates were replaced by the latest treatment dates (only for the contacts without TB). The new variable with the reviewed dates of events was named "dteventr".

In addition, to ensure that the maximum censoring dates were obtained for all the contacts, the latest dates of service recorded for the contacts in any of the BCLHD databases (MSP, HSP, BCCA and PHM) were obtained and saved in a database entitled "unique namecode, study ID and maximum dates (MSP, HSP, BCCA and PHM)". This database was merged with the "final database 23a" to create the "final database 24". The variable with all the dates of service from the BCLHD databases was named "dateserv". The dates of service that were later than the dates of events ("dteventr") were identified in the variable "dtesrvaf". Then, the pertinent dates of event reviewed that predated the contact dates were replaced by the latest dates of service (only for the contacts without TB).
The new variable with all the latest dates of events available in all the databases was named "dteventf". After all the date corrections, the time to event was re-calculated as described before, and saved in the variables "time" and "timemon" (in years and months respectively) in the "final database 24".

There were 600 contacts without TB in whom the time to event was negative, i.e. in whom the censoring date pre-dated the contact date (these contacts were identified in the variable "cnterdt3"). Almost all (598) of these subjects had cancelled the MSP (590) or had a medical service recorded in any of the BCLHD databases (8) before they became contacts of the TB case. Since there is no information available for these subjects after the contact date in any of the databases available, it was decided (in agreement with Dr. Marion) to exclude them from the analysis (see below).

Next, from the "final database 24" (which contained all the variables created during the process), the variables that were not of interest in the analysis were deleted. The resulting database, with only the variables of interest for all the 42,593 contacts, was entitled "final database all contacts". The 600 contacts who had no available information after the contact date in any of the databases were deleted from the "final database all contacts". The resulting database, with 41,993 contacts, was entitled "final contacts for analysis". The contacts not found in the BCLHD (8,426 contacts) were deleted and the new database, with 33,567 contacts, was entitled "final contacts for analysis BCLHD all".
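The audit of subject counts across the successive databases described above can be reproduced with a short tally. The sketch below is illustrative only: the helper name and the step labels are hypothetical, and the counts are the ones stated in the text.

```python
def apply_exclusions(start, steps):
    """Subtract each exclusion in order and keep an audit trail.

    start: number of subjects before any exclusion.
    steps: list of (label, number_removed) pairs.
    Returns a list of (label, number_removed, remaining) tuples.
    """
    remaining = start
    trail = []
    for label, removed in steps:
        remaining -= removed
        trail.append((label, removed, remaining))
    return trail


# Counts taken from the text; labels are shorthand for illustration.
audit = apply_exclusions(
    42_593,  # unique contacts in "final database all contacts"
    [
        ("no information after the contact date", 600),
        ("not found in the BCLHD", 8_426),
    ],
)

for label, removed, remaining in audit:
    print(f"{label}: removed {removed:,}, remaining {remaining:,}")
```

The remaining counts after each step should equal the sizes reported for "final contacts for analysis" and "final contacts for analysis BCLHD all".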
For the purpose of eventually analyzing the risk of developing TB in contacts with previous TB, the maximum/latest date of TB diagnosis was obtained (for all contacts with TB from the variable "actchmax", and from the OAR20, OAR23 and AR files for 11 individuals not in that variable); the dates of events for the rest of the contacts (from the variable "dteventf") were appended to the latest dates of TB diagnosis in a variable named "dtevntfm".

At this stage, the individual variables identifying high-risk persons were added to the database "final contacts for analysis BCLHD all". These variables identified injection drug users ("drugusers"), persons with recent arrival from a country with high TB prevalence ("tbctry5y"), residents or employees in high-risk settings ("hirskset"), and hospital personnel ("hirskhos"); the corresponding databases were respectively entitled "unique recent arrival from high TB country", "unique high-risk settings" and "unique hospital personnel".

At this point, the exclusion criteria were applied and contacts with HIV (104 subjects) and with previous TB (324 subjects; 421 with HIV or previous TB) were deleted from "final contacts for analysis BCLHD all". The resulting database, containing only the 33,146 contacts to be analyzed, was entitled "final contacts for analysis BCLHD".

Quality control of data. As noted in each of the sections of the database management above, the quality control of data was applied throughout the data management and included:
a) Double-checking information for individuals (contacts and TB cases) and for the variables of interest in the 3 databases provided by the DTB (OCR, OAR and AR).
b) Validation of information using databases external to the DTB, which include: MSP, HSP, BCCA, VS, PHM and the database at the Division of STD/AIDS Control of BC. The validation was performed for all the variables of interest available in these databases.
c) Deletion of duplicates during the creation of every "final database".
d) Auditing of numbers of subjects and records, which was done during the creation of every new database (and all the "final databases X") to ensure that all subjects and variables were accounted for.
e) Validation of unusual (extreme) values: with the personnel who provided the databases (Ms. Fay Hutton for the DTB and Ms. Denise Morettin for the BCLHD); with the Director of the Division of TB Control (Dr. K. Elwood); and through a search of the information in the subjects' charts.

The main quality control procedures used to correct the errors and problems found during the database management are explained in Appendix III.

Similarly, a special effort was put into the quality control for the diagnosis of TB. The name code was the means of identification used for all the contacts, because it assures confidentiality and is available for all the contacts. The alternative confidential means of identification is the study ID; it is not available for all contacts, but it is available for all the subjects with a diagnosis of TB. To ensure that the contacts who developed TB were correctly identified by the name code (since the name code may change, such as when people marry), the TB number of these contacts and the TB number of the cases diagnosed with TB were matched. To do this, the TB number of the contacts was obtained from the database OCR01 and the TB number of the cases from the active TB cases in file OCR23.
All of the TB numbers of the contacts corresponded to the TB numbers of the TB cases; the exception was 4 contacts (BENA9201, DYCK9601, LABP2101, STOC2101) who were diagnosed at death and had no TB number in the databases above. There was also one contact diagnosed with active TB who had no positive smears or cultures (in files OCR20, OCR21 and OCR23 and the corresponding OAR files), had received no treatment (in files OCR13, OCR14, OAR13 and OAR14) and had no date of diagnosis (OCR23, OAR23 and AR), but had had a positive smear in the past (file OCR20) and fibrotic changes on chest x-ray; this case was re-classified from active TB to previous TB. Hence, there was only one active TB case that was re-classified as previous TB.

A quality control check was also performed on the number of primary drugs for the treatment of TB received by the TB cases when fewer than 3 were listed in the drug summary variable from the DTB. The number of drugs was obtained from the drug summary for each individual in files OCR13 and OAR13, where a list of the drugs received is provided; the cases treated with fewer than 3 of the primary drugs for TB were then rechecked individually (and manually) in all the pertinent files. The number of drugs received by a subject was confirmed in the files OCR14 and OAR14, since they contain the names, doses and dates of each individual drug a subject received. Through this it was found that some cases that apparently received 1 drug (from the drug summary in OCR13 and OAR13) had actually received 2 or 3 drugs (in the files OCR14 or OAR14). The misclassified case and the erroneous numbers of drugs received by the TB cases were corrected in the databases "final contacts for analysis BCLHD all" and "final contacts for analysis BCLHD".

There were some contacts with recorded TST sizes of more than 70 mm and up to 100 mm.
These were rechecked (those above 90 mm were also rechecked in the subjects' charts) and all were correct but one, which seemed to be explained by a typing error (instead of 99 mm the correct TST size was 0 mm).

RESULTS

The results will be presented in 3 sections:
1) Descriptive analysis of all the contacts: prognostic factors and TB cases.
2) Assessment of the risk factors for TB development.
3) TST size and risk of developing TB.

The results for 2) and 3) above will be presented for two different definitions of TST size. According to the first definition, TST size is simply the size of the reaction from the first TST applied after exposure to the active TB case. According to the second, TST size is the maximum of the reaction sizes from the first and second TSTs applied. Although an analysis based on the results of the 2nd TST by itself would be interesting, this could not be performed due to the small number of TB cases among the contacts who had a 2nd TST (51 TB cases among 12,694 contacts with a 2nd TST). This small number precluded adjusting for the prognostic factors of interest described in the methods section. Even if such an analysis could have been performed, the results would have to be interpreted with care, since a considerable proportion of contacts had no second TST and those who did might not be representative of the contact population as a whole.

DESCRIPTIVE ANALYSIS OF ALL CONTACTS

Among the 33,146 contacts from the DTB found in the BCLHD (HIV and previous TB excluded), there were 228 diagnosed with active TB (i.e. who developed TB) during the 12-year follow-up period when sensitive criteria were used for the diagnosis of TB (TB case rate 688/100,000). When specific criteria were used for the diagnosis of TB, 183 TB cases were found during the same period (TB case rate 552/100,000).
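The TB case rates quoted above are cumulative rates per 100,000 contacts over the follow-up period. A minimal sketch of the arithmetic, using the counts given in the text (the function name is hypothetical):

```python
def rate_per_100k(cases, population):
    """Cumulative case rate per 100,000 subjects."""
    return cases / population * 100_000


CONTACTS = 33_146

sensitive_rate = rate_per_100k(228, CONTACTS)  # TB cases by sensitive criteria
specific_rate = rate_per_100k(183, CONTACTS)   # TB cases by specific criteria

print(f"Sensitive criteria: {sensitive_rate:.0f}/100,000")
print(f"Specific criteria:  {specific_rate:.0f}/100,000")
```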
The 33,146 contacts originated from 3,462 TB cases diagnosed in the DTB from the beginning of 1990 to the end of 2000, providing on average 9.6 contacts per active case. The mean and median follow-up of the 33,146 contacts was 6.2 years (SD: 3.1 years). The follow-up ranged from 0 to 13 years; a few subjects who were in contact with a TB case in 1989 were recorded in the 1990 database and followed up to 2001. However, since only 3 subjects completed 13 years of follow-up, the period of follow-up is considered to be 12 years. The follow-up was > 9 years for 20% of contacts and > 10 years for 10% of contacts. There were 204,592 person-years of follow-up.

Among the 228 TB cases diagnosed using sensitive criteria, 218 had received at least 2 of the primary drugs for TB treatment (167 cases received ≥3 drugs and 51 received 2 drugs) and 10 cases received <2 drugs (1 case had received 1 drug, and 9 had received 0 drugs). The TB case who received 1 drug was smear positive (MOTT excluded). Among the 9 contacts who received 0 drugs, 7 were diagnosed at death and 2 because of a positive smear (MOTT excluded). Thus, all of the 10 TB cases who received fewer than two drugs were diagnosed either at death or because of a positive smear or culture.

It was corroborated with Dr. K. Elwood (Director of the Division of TB Control) that some subjects diagnosed with active TB are treated with 2 drugs (especially children, since 41, or 80%, of the 51 who received 2 drugs were <15 years of age); that in all of these TB cases the reason for treatment was "chemotherapy" (treatment of active disease), as opposed to "chemoprophylaxis" (treatment of LTBI); and that the diagnosis for the TB cases treated with fewer than 2 drugs was well sustained. It was therefore decided to perform the analysis with the TB cases that were diagnosed using the sensitive criteria.

PROGNOSTIC FACTORS
The most important descriptive statistics are presented below for each prognostic variable; the summary descriptive statistics are shown in Table 12.

Table 12. Description of prognostic factors and their availability in the databases

Variable                                    Mean (SD) / %                        Data availability
Age (years)                                 35.2 (18.1)                          100%
LTBI treatment*                             No treatment or <6 m   95%           100%
                                            Treatment ≥6 m          5%
TB exposure (number of TB cases exposed)    1                      94.6%         100%
                                            2                       4.9%
                                            3                       0.5%
Ethnicity                                   Canadian-born          55.3%         99.9%
                                            Foreign-born           34.4%
                                            Aboriginal             10.3%
Type of contact                             Close household        13%           99.8%
                                            Close non-household    39%
                                            Casual                 48%
Gender                                      Female                 59%           99.7%
                                            Male                   41%
Infectivity of source case                  Positive smear         82%           96%
                                            Negative smear         18%
Immunosuppression                           No                     96%           >95%+
                                            Yes                     4%
High-risk subjects                          No                     85%           57%-93%#
                                            Yes                    15%
TST size                                    Median = 0 (range 0 to 100 mm)       86%
Previous BCG                                Negative               68%           64%
                                            Positive               32%
Geographical SE status (quintiles)          Quintile 1             25%           48%
                                            Quintile 2             22%
                                            Quintile 3             19%
                                            Quintile 4             18%
                                            Quintile 5             16%
Previous TST response                       No                     95%           N/A&
                                            Yes                     5%
Chest x-ray abnormalities                   No                     99%           N/A@
                                            Yes                     1%

* LTBI treatment = latent TB infection treatment.
+ Information searched in all the provincial health databases, thus likely valid for >95% of the contacts.
# Completeness of information varies for different high-risk factors.
& Information unspecified in most contacts, but reasonable to assume to be negative (No).
@ Completeness unknown, since the variables from which the information was obtained included other types of data.

As can be noted, there are two variables that were not in the planning phase of the study (TB exposure and chest x-ray abnormalities) and that were added during the database management because it was realized that information could be obtained for them.

Age. The mean age of all the contacts was 35 years and ranged from 0 (infants up to 1 year) to 107 years old. The distribution of the ages is shown in Figure 1.
Age differed significantly between Canadian-born contacts (mean 34.1 ± 17.9 yrs), Aboriginals (mean 29.4 ± 17.6 yrs) and foreign-born contacts (mean 38.9 ± 17.7 yrs) (F: 462; p-value < 0.0001).

Figure 1. Age distribution of contacts (x-axis: age).

Latent tuberculosis infection (LTBI) treatment. There were 31,557 contacts (95%) who received no treatment or less than 6 months of treatment (30,546, or 97% of these, received no treatment). There were 1,596 contacts (5%) who received ≥ 6 months of treatment. Hence, among the 2,607 contacts who were started on LTBI treatment, 38.8% did not complete ≥ 6 months. There was no missing information regarding LTBI treatment.

Among the 4,309 household contacts, there were 3,694 (85.7%) who did not receive, or did not complete, 6 months of treatment, and 615 (14.3%) who completed ≥ 6 months of treatment. Among the 12,734 non-household contacts, 12,203 (96%) did not receive or complete 6 months of treatment, while 531 (4%) received ≥ 6 months of treatment. Among the 16,030 casual contacts, there were 15,591 (97.3%) who did not receive or complete 6 months of treatment, and 439 (2.7%) who received ≥ 6 months of treatment. Hence, there were 4 contacts who received LTBI treatment but had missing type of contact information.

TB exposure. Among the 33,146 contacts, 31,357 were exposed to one TB source case (94.6%), 1,618 contacts were exposed to two TB cases (4.9%) and 171 (0.5%) to more than 2 TB cases.

Ethnicity. Most of the contacts (18,322 of them) were Canadian-born, followed by the foreign-born (11,383) and Aboriginals (3,426), who accounted for over 10% of all the contacts, as shown in Figure 2. This information was absent for only 15 subjects.

Figure 2. Ethnicity of contacts (categories: Canadian-born, foreign-born, Aboriginal).

Type of contact.
Regarding the closeness of the contact, it was found that 4,309 of the contacts were close household contacts (13%); 12,734 were close non-household contacts (39%); and 16,030 were casual contacts (48%), as shown in Figure 3. This information was missing for 73 contacts.

Figure 3. Type of contact (categories: close household, close non-household, casual).

Gender. There was a predominance of females among the contacts, as shown in Figure 4. There were 19,639 females and 13,411 males.

Figure 4. Gender of contacts.

Infectivity of source case. The infectivity of the source case was determined by the smear positivity. For 1,422 of the contacts (4.3%), there were no available smear results for the source case. Among the 31,724 with available information, the source case had a positive smear for 26,089 contacts (82%) and a negative smear for 5,635 contacts (18%), as shown in Figure 5.

Figure 5. Source case AFB smear (categories: positive, negative).

Immunosuppression. At the time of contact, there were 1,446 subjects (4%) with one or more immunosuppressive conditions, while for 31,700 of the contacts (96%) no immunosuppression was found in any of the databases (DTB and BCLHD). The most common immunosuppressive conditions were: diabetes mellitus in 641 subjects (1.9%), malignancy in 501 subjects (1.5%), use of corticosteroids in 146 subjects (0.4%), renal failure in 98 subjects (0.3%) and malnutrition in 34 subjects (0.1%). The use of antineoplastics, aplastic anemia and organ transplant were each present in less than 0.1% of the subjects.

Contacts were considered to be immunosuppressed when they were in contact with the source case if the immunosuppressive condition was present (recorded) in any of the databases within the following time periods:
For renal failure, diabetes mellitus and renal transplant: <182 days from the contact date.
For malnutrition and malignancy: between -182 and 182 days from the contact date.
For corticosteroids and antineoplastic drugs (prescription dates): between -60 and 60 days from the contact date.

The different time periods were selected on the assumption that they would identify whether the contact was immunosuppressed at the time of TB exposure. Renal failure, DM and renal transplant are conditions that could have an immunosuppressive effect any time after diagnosis; however, because there is usually a lapse between the first date that these diseases are present and the date of diagnosis, 6 months after the contact date was considered a reasonable time window to identify whether an individual was already immunosuppressed at the time of contact. On the other hand, malnutrition, malignancy and immunosuppressive drugs have an immunosuppressive effect only while the disease is present or the drug is being taken. For malnutrition and malignancy, a diagnosis within 6 months before or after becoming a contact was considered a reasonable time frame to identify whether a subject was immunosuppressed at the time of contact. Similarly, subjects were considered immunosuppressed at the time of contact if immunosuppressive drugs were prescribed within 60 days before or after the contact date.

Most of the immunosuppressed contacts had one or two immunosuppressive conditions (96%), but 3.6% had 3 conditions and 0.4% had 4 immunosuppressive conditions, as shown in Figure 6.

Figure 6. Immunosuppressive conditions (x-axis: number of immunosuppressive conditions, 1 to 4).

High-risk subjects. There were 4,864 contacts (15%) who were considered high-risk persons, i.e. at higher risk for development of TB, according to the published literature.
These included persons who arrived recently (<5 years) from a country with high TB prevalence, who were injection drug users, residents or employees in high-risk settings (prisons, nursing homes, homeless shelters), or hospital personnel (physicians, nurses, respiratory therapists, laboratory technicians, x-ray technicians and medical students). This information was obtained from the DTB databases and only the information about injection drug use could be validated in the BCLHD, since there is no information available in the BCLHD for the other variables. Some subjects had more than one high-risk factor.

There were 2,338 contacts with missing information regarding the country of birth or the date of arrival (7%) and complete information was available for 30,808 contacts (93%). Among these, 1,241 subjects (4%) arrived recently (<5 years) from a country with high TB prevalence and 29,567 contacts did not (96%). There were 126 contacts who were injection drug users (0.4%). Regarding residents or employees in high-risk settings (prisons, nursing homes, homeless shelters), the information was missing for 14,109 contacts (42.6%) and was available for 19,037 contacts (57.4%); among these, 906 (5%) were residents or employees in high-risk settings. The information about hospital personnel (physicians, nurses, respiratory therapists, laboratory technicians, x-ray technicians and medical students) was missing for 13,648 (41.2%) of the contacts and was available for 19,498 (58.8%) of them. Among those for whom the information was available, there were 2,797 (14%) recorded as hospital personnel.

Previous BCG. The BCG status was available for 21,049 of the contacts; among these, the BCG status was negative for 14,302 of the contacts (68%) and positive for 6,747 (32%). However, the BCG status was uncertain or missing for 12,097 (36%) of the contacts.
Contacts with uncertain or missing BCG status were older (38.5 years old ± 19) than contacts with positive BCG (34.9 years old ± 14), and both groups were older than contacts with negative BCG (32.7 years old ± 19); all the pairwise comparisons were statistically significant (p<0.001). Contacts with negative BCG were more likely to be Canadian-born non-Aboriginal (74.3% of them) than foreign-born (18.4%) or Aboriginal persons (7.3%). Contacts with positive BCG were more likely to be foreign-born (63.5% of them) than Canadian-born (19%) or Aboriginal persons (17.5%). Contacts with uncertain or missing values were intermediate between the previous 2 groups, suggesting a mixture of BCG-positive and BCG-negative contacts in this group, since 53.1% of them were Canadian-born, 36.9% were foreign-born and 10% were Aboriginal persons.

Geographical socioeconomic status. A large variation in the socioeconomic (SE) status was found within subjects across the study period. Eight percent of the contacts in this study had differences of 4 quintiles in their SE status between 1990 and 2001; 22% of the subjects had differences of ≥ 3 quintiles, and 38% had differences of ≥ 2 quintiles. Because of these large variations in the SE status for a subject across the years, it was decided to use the mode of the quintile of the SE status. There were 3,596 contacts in quintile 1; 3,425 in quintile 2; 3,100 in quintile 3; 2,992 in quintile 4; and 2,749 in quintile 5. The distribution of the contacts' geographical SE status is presented in Figure 7. Unfortunately, the information was not provided by CHSPR for 17,284 of the contacts (52%).

Figure 7. Ecologic socioeconomic status (x-axis: SE quintiles, mode).

Previous positive TST. There were 1,701 contacts (5%) with a previous positive TST (PPT), while 31,445 (95%) did not have a PPT.
For most subjects the information regarding the PPT is not specified; however (according to Dr. Elwood and Dr. FitzGerald), it is reasonable to assume that the previous TST is negative in those for whom it is not specified as positive, since it was standard practice not to document the TST size when it was 0 mm.

Chest radiograph abnormalities. An abnormal chest x-ray consistent with previous tuberculous disease ("healed primary complex") was recorded for 221 (0.7%) of the contacts, while no abnormalities were recorded in the DTB databases for 32,925 of them (99.3%). It is important to note that these contacts with chest radiograph abnormalities are subjects with previous tuberculous disease suspected on chest radiograph, but with no previous TB stated or documented (since these were excluded). Also, unlike all the other variables in this study, the chest radiograph information was obtained from 2 different variables that were not specifically designed to record exclusively the corresponding information.

TST size. There were 28,640 contacts who had a TST after being exposed to a TB case, which represents 86.4% of all contacts: 91.2% of household contacts, 85.2% of non-household contacts and 86.1% of casual contacts. As can be observed in Figure 8, the distribution of the TST size was bimodal, with the majority of contacts having a TST size of 0 mm. The median and mode TST size were both 0 mm and the size ranged from 0 to 100 mm.

Figure 8. TST size (1st applied after contact) (x-axis: TST size in mm, 0 to 100).

Among the contacts with a TST, 76% had 0 mm and the remaining 24% had from 1 to 100 mm.

Table 13 shows the distribution of the TST size for all contacts (of the first TST applied after contact) according to the categories of interest.

Table 13.
Distribution of the TST size among contacts (1st TST)

TST size category    Frequency    Percent
0 mm                 21,670       75.7%
1-4 mm                  446        1.5%
5-9 mm                1,143        4%
10-14 mm              2,294        8%
≥ 15 mm               3,087       10.8%
Total                28,640      100%

Figure 9 shows graphically the distribution of the categorized TST size in all contacts.

Figure 9. TST size distribution (categorized) (x-axis: TST size in mm; categories: 0 mm, 1-4 mm, 5-9 mm, 10-14 mm, >=15 mm).

TB CASES

As mentioned before, among the 33,146 contacts of interest, there were 228 who developed TB during the 12-year follow-up period (TB rate 688/100,000). Among these, 120 TB cases had a negative smear or culture (52.6%) and 108 (47.4%) had a positive smear and/or culture: 49 of them had a positive smear (21.5%) and 103 had a positive culture for MTB (45.2%).

Among the 228 TB cases, 5 were diagnosed at death and no classification was recorded for them. Of the remaining 223 TB cases, 204 (91.5%) had pulmonary TB, 8 (3.6%) had pleural TB and 11 (4.9%) had extra-pulmonary TB, as shown in Figure 10.

Figure 10. TB cases classification (categories: pulmonary, pleural, extra-pulmonary).

Figures 11 to 13 show the laboratory results for the pulmonary TB cases. Among the 204 pulmonary TB cases, 97 (47.5%) had a positive smear and/or culture, as shown in Figure 11.

Figure 11. Pulmonary TB cases confirmed microbiologically (positive smear or MTB culture: no, yes).

There were 47 TB cases (23%) who had a positive smear, as observed in Figure 12; and 93 (45.6%) had a positive culture for MTB, as shown in Figure 13.

Figure 12. Pulmonary TB cases with positive smear (positive smear: no, yes).

Figure 13. Pulmonary TB cases with positive culture (MTB positive culture: no, yes).

Of the 107 pulmonary TB cases without a positive smear or culture, 81 (76%) were in children <15 years of age.
Among the 90 TB cases that occurred in children <15 years old, only 7 (7.8%) were smear or culture positive, while the remaining 83 TB cases (92.2%) were diagnosed clinically through a positive TST, abnormal chest x-ray and symptoms (if present). The percentage of patients having bacteriologic proof of disease is lower than in a general population of TB cases. It is likely that greater surveillance of contacts results in a higher likelihood of TB diagnosis at an earlier stage and therefore with no bacteriological proof.

The proportion of positive smear and culture results seems to be too low in the pulmonary TB cases diagnosed using sensitive criteria; however, when the same estimates were performed in the pulmonary TB cases diagnosed using specific criteria (as described in the methods section), the positive smear and culture results were modestly higher: there were 97 (63.8%) positive smear and/or MTB culture cases among 152 pulmonary TB cases. Of the TB cases diagnosed using specific criteria, 47 (31%) were smear positive and 93 (61.2%) were culture positive for MTB.

The rest of the analysis presented below refers only to the TB cases diagnosed using sensitive criteria (as specified initially). The occurrence of TB (using sensitive criteria) among the contacts during the 12 years of follow-up is shown in the Kaplan-Meier survival curve in Figure 14. A few contacts were followed up for 13 years; they were exposed in 1989 but recorded in the 1990 database and followed up until 2001.

Figure 14. Survival curve. TB occurrence, all contacts.

There is a sharp increase in TB risk during the first 2 years (especially during the first year) of follow-up; the increased TB risk continues up to just before 10 years of follow-up, and then plateaus.

As can be appreciated in Table 14, most of the TB cases occurred within 2 years after exposure, and all of them had occurred by 10 years after exposure.

Table 14.
Occurrence of TB among contacts after exposure to the source case

Time after exposure    1 year       2 years      5 years      10 years
Number of TB cases     169 (74%)    186 (82%)    208 (92%)    228 (100%)

There were no TB cases beyond 10 years after exposure among the 383 contacts who were followed up to 12 years.

RISK FACTORS FOR TB DEVELOPMENT

The univariate analysis of TB occurrence among contacts according to the risk factors of interest is presented next. The distribution of each variable among the contacts who developed TB is provided first, followed by the risk of developing TB obtained from Cox's proportional hazards model, and by the survival curves (survival free of TB) for each of the variables from the Kaplan-Meier survival analysis. The scale of the Y-axis was modified as required in order to show graphically the difference in TB rates among the groups of interest.

The assessment of the proportional hazards assumption through the log (-log) of the survival function versus time for each of the variables is provided in Appendix IV; no clear violation of the proportional hazards assumption was observed in these plots, or in the plots of the log (-log survival function) versus log time (not included for reasons of space and simplicity of presentation). Because the survival plots were found to be a more useful tool for the practical assessment of the proportional hazards assumption in this study, they were included in this section. On the other hand, when the statistical approach to assess the proportional hazards assumption was applied, which involves comparing the model containing the interaction term [covariate x (log time - mean log time)] to the model with only the covariate (prognostic factor) of interest, all the interaction terms were significant; even when assessing only the coefficients of the interaction terms, in 2/3 of them the hazard ratio was >3.
Because of the discrepancy between the statistical method and the 2 graphical methods, and since the statistical results can be explained by the large number of subjects in the study and the large number of comparisons performed (18), it was decided to use the graphical methods for the final assessment of the proportional hazards assumption.

Age. The median age of the contacts who developed TB was 21 yrs and it ranged from 0 years (infants < 1 year old) to 101 years. The majority of the TB cases were children and young adults, with fewer subjects after 35 years of age, as observed in Figure 15; most of the cases (70%) presented at or before 35 years of age.

Figure 15. Age distribution of TB cases (x-axis: age in years).

The age was divided into quartiles, and the TB rate was calculated for each quartile of age, as shown in Table 15.

Table 15. Number of TB cases by age (in quartiles)

                     Quartile 1       Quartile 2      Quartile 3      Quartile 4
                     0-21 yrs         22-35 yrs       36-47 yrs       48-101 yrs
Contacts/TB cases    8,248/112        8,289/43        8,287/29        8,286/44
TB rate              1,352/100,000    519/100,000     350/100,000     531/100,000

There was a very high TB rate among the contacts ≤21 years of age, with a gradual decline in the ages 22-47 and a slight peak afterwards (χ² = 73.8; p-value < 0.001).

Next, age was categorized into more groups, especially in the younger ages where the highest TB rate occurred, to try to better determine the age groups at highest risk of developing TB. The categories of age (in years) used were the following: 0 (infants), 1-5, 6-10, 11-15, 16-20, 21-30, 31-40, 41-50, 51-60 and > 60 yrs old.

The observed TB rates during the 12-yr follow-up period according to the above age categories are depicted in Figure 16.

Figure 16. TB rate by age groups in contacts of TB cases (12-yr follow-up).
(Figure 16 x-axis: age, categorized.)

Because of the distribution of age, one Cox regression model was created with age and another with the natural logarithm of age. It was observed that the model including the logarithm of age had a better fit than the model containing age, as presented in Table 16.

Table 16. Cox regression. Only age in the model.

Variable                B         Hazard ratio (95% CI)      Wald test    P-value
Age (a)                 -0.029    0.971 (0.964 to 0.979)     48.8         <0.001
Logarithm of age (b)    -0.479    0.620 (0.569 to 0.675)     118.7        <0.001

a: -2 LL = 4,645.5; b: -2 LL = 4,622.5.

In spite of the results observed in the univariate analysis, when age and log age were added separately to the final Cox model containing all the other prognostic factors (see below the model with purposeful selection of variables), it was observed that the model with age (likelihood ratio test of age added to the model = 119.7) had a better fit than the model with log age (likelihood ratio test of log age added to the model = 90.2); hence, age was used in the subsequent analyses.

Based on the high risk observed in the contacts 0-10 yrs of age, the contacts were categorized into 2 groups for analysis: a) contacts 0-10 years old, and b) contacts > 10 years old. There were 75 TB cases among the 2,714 contacts 0-10 years of age (TB rate 2,763/100,000 during the observation period); seventy-two of them (96%) were diagnosed clinically and only 3 (4%) microbiologically through a positive smear or culture. There were 153 TB cases among the 30,432 contacts >10 years of age (TB rate 503/100,000); the diagnosis was clinical in 48 of them (31.4%) and confirmed microbiologically in 105 (68.6%). The difference in TB rates between the two age groups (0-10 years old and >10 years old) was statistically significant (χ² = 186.4; p-value < 0.001).
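The comparison of TB rates between the two age groups can be reproduced from the counts given above. The sketch below assumes that the reported statistic is a Pearson chi-square for a 2 x 2 table without continuity correction; the text does not state which variant was used, and the function name is hypothetical.

```python
def pearson_chi_square_2x2(cases_a, total_a, cases_b, total_b):
    """Pearson chi-square (no continuity correction) comparing two proportions."""
    a, b = cases_a, total_a - cases_a  # group A: cases, non-cases
    c, d = cases_b, total_b - cases_b  # group B: cases, non-cases
    n = total_a + total_b
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Counts from the text: 75 TB cases among 2,714 contacts 0-10 years old,
# 153 TB cases among 30,432 contacts >10 years old.
chi_square_age = pearson_chi_square_2x2(75, 2_714, 153, 30_432)
print(f"Chi-square (age groups) = {chi_square_age:.1f}")
```

Under this assumption the result agrees with the reported value to one decimal place, which suggests (but does not prove) that the uncorrected statistic was the one reported.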
The hazard rate for TB development was more than 5 times as high for contacts 0-10 years old as for contacts >10 years old, as observed in Table 17.

Table 17. Cox regression. Only age (categorized) in the model.
Variable | B | Hazard ratio (95% CI) | P-value
0-10 years old* | 1.703 | 5.49 (4.16 to 7.24) | <0.001
*Compared to contacts >10 years old. Model -2 LL = 4,585.7

The Kaplan-Meier survival curve of TB occurrence according to the age categories of the contacts is presented in Figure 17. This curve was also used to check the proportional hazards assumption by comparing the survival curves of age categorized as 0-10 yrs old and >10 yrs old. The K-M survival curves did not cross over, as observed in Figure 17; the log (-log) of the survival function versus time graph did not suggest violations of the proportional hazards assumption either (see Appendix IV).

Figure 17. Survival curve. TB occurrence by age groups (x-axis: time in years).

As observed, the risk of TB is much higher for contacts 0-10 years old during the entire observation period. It is important to note that among the contacts 0-10 years old the TB cases occurred soon after exposure to the source case and all of them occurred within 3 years after exposure (in fact the last TB case was 2.9 years after contact with the source case).

Gender. The distribution of gender among the 228 TB cases is shown in Figure 18; as can be observed, there was a predominance of males (130 males and 98 females).

Figure 18. Gender distribution of TB cases (bar chart; categories: female, male).

There were 98 TB cases among the 19,639 female contacts (TB rate 499/100,000 during the observation period); there were 130 TB cases among the 13,412 male contacts (TB rate 969/100,000). The difference in TB rates was statistically significant (χ² = 25.7; p-value < 0.001).
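The hazard ratios in the Cox regression tables are the exponential of the coefficient B, i.e. HR = exp(B). The short sketch below illustrates this with the values from Table 17. The standard error is not reported in the table, so here it is back-calculated from the published confidence interval; that step is an assumption of this example rather than part of the original analysis, and the function names are hypothetical.

```python
import math


def hazard_ratio(b):
    """Hazard ratio implied by a Cox regression coefficient B."""
    return math.exp(b)


def se_from_ci(lower, upper, z=1.96):
    """Approximate standard error of B recovered from a 95% CI of the HR."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Values as printed in Table 17 (contacts 0-10 years old vs. >10 years old).
b = 1.703
ci_lower, ci_upper = 4.16, 7.24

hr = hazard_ratio(b)
se = se_from_ci(ci_lower, ci_upper)   # back-calculated, not reported in the table
wald_z = b / se                       # Wald statistic on the z scale
rebuilt_lower = math.exp(b - 1.96 * se)
rebuilt_upper = math.exp(b + 1.96 * se)

print(f"HR = exp({b}) = {hr:.2f}")
print(f"approx. SE(B) = {se:.3f}, Wald z = {wald_z:.1f}")
print(f"rebuilt 95% CI: {rebuilt_lower:.2f} to {rebuilt_upper:.2f}")
```

The same relationship can be used to sanity-check any row of the Cox regression tables in this chapter, bearing in mind that the printed B values are rounded.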
The hazard rate for TB development in male contacts was almost double that in female contacts, as observed in Table 18.

Table 18. Cox regression. Only gender in the model.
Variable | B | Hazard ratio (95% CI) | P-value
Males* | 0.675 | 1.96 (1.51 to 2.55) | <0.001
*Compared to females. Model -2 LL = 4,671.3

The survival curve of TB occurrence in males and females is shown in Figure 19.

Figure 19. Survival curve. TB occurrence by gender (x-axis: time in years).

It can be observed that during the entire follow-up period, the TB rate was higher in males than in females.

LTBI treatment. Most TB cases (218 of the 228 TB cases) had not received adequate LTBI treatment (no treatment or <6 months of treatment), while only 10 (4.4%) had received treatment ≥ 6 months, as observed in Figure 20.

Figure 20. LTBI treatment of TB cases (bar chart; categories: no LTBI treatment (<6 m or none) vs. LTBI treatment (≥6 m)).

There were 218 TB cases among the 31,557 contacts who received < 6 months or no treatment (TB rate 691/100,000), and 10 TB cases among the 1,596 contacts who received ≥ 6 months of treatment (TB rate 627/100,000). This difference in TB rates was not statistically significant (χ² = 0.084; p-value = 0.77).

The hazard rate for TB development was slightly lower for contacts who received LTBI treatment ≥ 6 months than for contacts who received LTBI treatment < 6 months or did not receive any treatment, as observed in Table 19; however, the difference in hazard rates was not statistically significant.

Table 19. Cox regression. Only LTBI treatment in the model.
Variable | B | Hazard ratio (95% CI) | P-value
LTBI treatment ≥ 6 months* | -0.093 | 0.911 (0.48 to 1.72) | 0.77
*Compared to LTBI treatment < 6 months and no treatment. Model -2 LL = 4,698.2

The K-M survival curve according to the LTBI treatment is shown in Figure 21.

Figure 21. Survival curve.
TB occurrence by LTBI treatment (x-axis: time in years).

There seems to be a lower risk of TB development among the contacts who received LTBI treatment during the first 6 years after contact, but the effect disappears afterwards. This could be explained by re-infection in contacts who completed 6 months of LTBI treatment, since 8.5% of them were exposed to ≥ 2 TB cases, compared to 5.5% of contacts who received < 6 months or no LTBI treatment (χ² = 30.8; p < 0.001).

Since the risk of TB may be different between contacts who received no LTBI treatment (e.g. because there was no indication) and contacts who received < 6 months of treatment (in whom it was indicated but the subjects did not complete it), the analysis was repeated with the following 3 groups of contacts according to the duration of the LTBI treatment: a) no LTBI treatment; b) LTBI treatment < 6 months; c) LTBI treatment ≥ 6 months.

There were 30,546 (92%) contacts who received no LTBI treatment, 1,004 (3%) who received < 6 months and 1,596 (5%) who received ≥ 6 months, as shown in Figure 22. Among the 228 TB cases, 203 subjects had received no LTBI treatment (89%), 15 (6.6%) had received < 6 months and 10 had received ≥ 6 months of treatment.

Figure 22. LTBI treatment of TB cases (detailed) (bar chart; categories: no treatment, treatment < 6 m, treatment ≥ 6 m).

There were 203 TB cases among contacts who did not receive treatment (TB rate 665/100,000), 15 TB cases among contacts who received < 6 months (TB rate 1,494/100,000), and 10 TB cases among contacts with ≥ 6 months of LTBI treatment (TB rate 627/100,000). The difference between the 3 groups was clinically and statistically significant (χ² = 9.9; p-value = 0.007).

The hazard was more than twice as high in the contacts who received less than 6 months of LTBI treatment as in those who did not receive any treatment, as shown in Table 20.
Table 20. Cox regression. Only LTBI treatment in the model (0, <6 m and ≥6 m).
Variable | B | Hazard ratio (95% CI) | P-value
LTBI treatment < 6 months* | 0.845 | 2.33 (1.38 to 3.94) | 0.002
LTBI treatment ≥ 6 months* | -0.058 | 0.944 (0.50 to 1.78) | 0.859
*Compared to no LTBI treatment. Model -2 LL = 4,690.2

The survival curve for the contacts according to the duration of LTBI treatment is presented in Figure 23.

Figure 23. Survival curve. TB occurrence by LTBI treatment (detailed) (x-axis: time in years).

As can be seen, the contacts who received at least 6 months of LTBI treatment had a TB rate in the long term similar to that of those who received no treatment. The contacts who received less than 6 months of treatment had a very high TB rate, with a different pattern from that observed in the entire group (steep negative linear trend during the first 8 years of follow-up). However, it is interesting to note that the TB rate of the contacts who received any treatment was lower than the TB rate of the contacts who did not receive treatment in the first 2 years of follow-up.

It is important to note, however, that the line for the contacts who received no treatment is crossed by the lines for the contacts who received < 6 months (at 2 years) and for the contacts who received ≥ 6 months (at 6 years). Thus, the proportional hazards assumption between the contacts who received no treatment and the other 2 groups does not hold.

The similar overall TB rate between the contacts who received ≥ 6 months of treatment and those who received no treatment could be partly due to a lower risk in the latter group (if treatment was not offered because they were deemed non-infected); this is supported by the distribution of TST size (in the categories of interest): 79.4% of the contacts who did not receive treatment had a TST size of 0 mm; 1.5% had 1-4 mm and 19.1% had a TST size ≥ 5 mm.
On the other hand, in the group of contacts who received ≥ 6 months of treatment, 76.2% had a TST size ≥ 5 mm and 23.8% had between 0 and 4 mm (98% of these had chest radiographic abnormalities compatible with previous TB, 99.3% were injection drug users and 88.3% were immigrants who had arrived recently from a country with high TB prevalence).

In order to assess the best way to categorize LTBI treatment for further analysis, no LTBI treatment and treatment ≥ 6 months were collapsed into a single group (since the overall risk of TB in the 2 categories is similar) and this category was compared to the category of treatment < 6 months. When this was performed, the hazard ratio for the < 6 months category was the same and this model did not improve on the fit of the model with 3 categories (model -2 LL = 4,690.3); furthermore, the lines crossed as observed in the previous graph for no LTBI treatment and LTBI treatment < 6 months.

The other alternative, which is to divide the contacts into groups according to the LTBI treatment and the times when the curves crossed over for the different LTBI treatment groups, could serve the purpose of an exact description of risk for each period divided by the crossovers, but not the purpose of an overall assessment of the role of LTBI treatment with all the other prognostic factors in a prognostic model.

Hence, for all the subsequent analyses, it was decided to use the categories: no LTBI treatment; LTBI treatment < 6 months; and LTBI treatment ≥ 6 months. This is because, even though for the individual groups and time periods the hazard ratio does not describe the risk of TB accurately (due to the crossing of the survival curves), it describes appropriately the risk of TB for each of the LTBI treatment groups during the overall follow-up period; i.e.
that the risk of TB for the contacts with no LTBI treatment and those with LTBI treatment ≥ 6 months is similar; and that the risk for the contacts who received treatment < 6 months is much higher than that of the previous 2 groups.

Number of TB cases exposed. The distribution of the number of TB source cases for the contacts who developed TB is shown in Figure 24. Among the 228 contacts who developed TB, 203 (89%) were exposed to 1 TB source case; 23 subjects (10%) were exposed to 2 TB cases, and only 2 subjects (1%) were exposed to >2 TB cases.

Figure 24. TB cases. Number of TB cases exposed.

There were 203 TB cases among the 31,357 contacts exposed to one TB source case (TB rate 647/100,000 during the observation period) and 25 TB cases among the 1,789 contacts exposed to ≥ 2 TB source cases (TB rate 1,397/100,000). The difference between the two groups was statistically significant (χ² = 13.9; p-value < 0.001).

The risk of TB development was twice as high for contacts exposed to ≥ 2 TB cases as for those exposed to only 1 TB case during the observation period (Table 21). The risk was also increased in contacts exposed to more than 2 TB cases, but due to the small number of outcomes this result was not statistically significant (and the point estimate may be unreliable).

Table 21. Cox regression. Only number of TB cases exposed in the model.
Variable | B | Hazard ratio (95% CI) | P-value
Exposed to ≥2 TB cases* | 0.706 | 2.02 (1.34 to 3.07) | 0.001
*Compared to being exposed to 1 TB case. Model -2 LL = 4,689.0

The survival curve for TB development according to the "intensity" of the exposure to TB is shown in Figure 25.

Figure 25. Survival curve. TB occurrence by TB exposure.

It can be noted that the risk of TB is higher in contacts exposed to ≥ 2 TB cases than in those exposed to 1 TB case throughout the observation period.

Ethnicity.
Among the 228 TB cases, 100 were Canadian-born (44%), 73 were Aboriginal people (32%) and 55 were foreign-born (24%), as shown in Figure 26.

Figure 26. Ethnicity of TB cases (bar chart; categories: Canadian-born (non-Aboriginal), Aboriginal, foreign-born).

One hundred cases of TB developed among the 18,322 Canadian-born contacts (TB rate 546/100,000); 73 TB cases among the 3,426 Aboriginal people (TB rate 2,131/100,000); and 55 TB cases among 11,383 foreign-born contacts (TB rate 483/100,000). The difference between the groups was statistically significant (χ² = 11.8; p-value < 0.001).

The risk of TB for Aboriginal contacts was almost 4 times that for Canadian-born non-Aboriginal contacts and the difference was statistically significant, while the risk for the foreign-born was basically the same as the risk for the Canadian-born, as observed in Table 22.

Table 22. Cox regression. Only ethnicity in the model.
Variable | B | Hazard ratio (95% CI) | P-value
Aboriginal* | 1.356 | 3.88 (2.87 to 5.25) | <0.001
Foreign-born* | -0.105 | 0.900 (0.65 to 1.25) | 0.531
*Compared to Canadian-born. Model -2 LL = 4,619.7

The survival curve for ethnicity is shown in Figure 27.

Figure 27. Survival curve. TB occurrence by ethnicity (x-axis: time in years).

As can be noted in the survival curve, the TB risk was higher for Aboriginal people than for Canadian-born non-Aboriginal and foreign-born contacts throughout the observation period, while the risk for the last 2 groups was similar during the entire period.

Type of contact. Among the 228 TB cases, there were 102 close household contacts (44.7%), 85 close non-household contacts (37.3%) and 41 casual contacts (18%). The distribution of the type of contact is shown in Figure 28.

Figure 28. TB cases. Type of contact.
(Bar chart; x-axis: contact type: close household, close non-household, casual.)

Among the 4,309 close household contacts, 102 developed TB (TB rate 2,367/100,000 during the observation period); 85 of the 12,734 close non-household contacts developed TB (TB rate 668/100,000); and 41 of the 16,030 casual contacts developed TB (TB rate 256/100,000). The observed differences were statistically significant (χ² = 221.3; p-value < 0.0001).

The risk of TB in household contacts was almost 10 times that in casual contacts; the risk in close non-household contacts was more than twice that in casual contacts. Both groups had a statistically significant increase in risk, as shown in Table 23.

Table 23. Cox regression. Only contact type in the model.
Variable | B | Hazard ratio (95% CI) | P-value
Close non-household* | 0.969 | 2.64 (1.82 to 3.83) | <0.001
Close household* | 2.26 | 9.57 (6.66 to 13.75) | <0.001
*Compared to casual contact. Model -2 LL = 4,527.6

The Kaplan-Meier survival curve in Figure 29 shows the risk of TB according to the type of contact.

Figure 29. Survival curve. TB occurrence by closeness of contact (curves: casual, non-household, household; x-axis: time in years).

The risk of TB increases from casual to close non-household to household contacts throughout the observation period, and it is much higher for the household contacts than for the other two groups. The risk of TB in the close non-household contacts is higher than in the casual contacts throughout the observation period.

Infectivity of source case. Among the 226 TB cases for whom the smear results of the source were available, the smear was positive in 198 of them (88%), as shown in Figure 30.

Figure 30. TB cases. Infectivity of source case (bar chart; x-axis: source case AFB smear results: positive, negative).

However, if the 2 source cases with no smear results are considered as negative, the percent of source cases with a positive smear is 87%.
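The survival curves in this section are Kaplan-Meier estimates. A minimal sketch of the product-limit calculation follows, using made-up follow-up times: the data, the variable names and the helper `log_minus_log` are hypothetical and are not taken from the DTB databases. The log(-log) transform is the quantity plotted against time when the proportional hazards assumption is checked graphically.

```python
import math


def kaplan_meier(times, events):
    """Product-limit estimate of S(t).

    times  -- follow-up time for each subject (years)
    events -- 1 if the subject developed TB at that time, 0 if censored
    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    ordered = sorted(zip(times, events))
    at_risk = len(ordered)
    survival = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i][0]
        tb_cases = 0
        leaving = 0
        # Group all subjects whose follow-up ends at the same time t.
        while i < len(ordered) and ordered[i][0] == t:
            tb_cases += ordered[i][1]
            leaving += 1
            i += 1
        if tb_cases:
            survival *= 1 - tb_cases / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def log_minus_log(survival):
    """log(-log S(t)); defined only for 0 < S(t) < 1."""
    return math.log(-math.log(survival))


# Hypothetical cohort of 8 contacts followed for up to 12 years.
times = [1, 2, 2, 3, 5, 12, 12, 12]
events = [1, 1, 0, 1, 0, 0, 0, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t:>2} yrs  S(t) = {s:.3f}  log(-log S) = {log_minus_log(s):.3f}")
```

Computing one such curve per group (for example per contact type) and checking whether the log(-log) curves stay roughly parallel mirrors the graphical check described in the text.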
Among the 26,089 contacts in whom the source case had a positive smear, 198 developed TB (TB rate 759/100,000 during the observation period); among the 5,635 contacts in whom the source case did not have a positive smear (because the smear was negative or no sample was available for smear), 28 developed TB (TB rate 497/100,000). This difference was statistically significant (χ² = 4.5; p-value = 0.034).

The risk of TB in contacts whose TB source case had a positive smear was 1.6 times that of contacts whose source case did not have a positive smear. This increased risk was statistically significant, as observed in Table 24.

Table 24. Cox regression. Only smear results of TB source case in the model.
Variable | B | Hazard ratio (95% CI) | P-value
Smear positive* | 0.466 | 1.59 (1.07 to 2.37) | 0.021
*Compared to contacts in which the source did not have a positive smear. Model -2 LL = 4,631.7

The survival curve in Figure 31 shows the risk of TB according to the smear results of the TB source case.

Figure 31. Survival curve. TB occurrence by smear results of TB source.

As observed in Figure 31, the TB risk in contacts whose TB source case had a positive smear was higher throughout the observation period.

Immunosuppression. Among the 228 TB cases, there were 211 (92.5%) who did not have an immunosuppressive condition at the time of exposure to the TB source, while 17 did (7.5%), as shown in Figure 32.

Figure 32. TB cases. Immunosuppressed at the time of contact (bar chart; categories: no, yes).

The distribution of the immunosuppressive conditions is shown in Figure 33: 13 of the contacts who developed TB had a history of alcohol excess (5.7%), 6 had malignancy (2.6%), 5 had malnutrition (2.2%), 4 had diabetes mellitus (1.8%) and 2 were using corticosteroids (1%) at the time of exposure to the TB source case.

Figure 33. TB cases. Immunosuppressive conditions at the time of contact.
(Bar chart; x-axis: immunosuppressive conditions at the time of contact: none, malignancy, DM, alcohol (OH), malnutrition, corticosteroids.)

Some subjects had more than one immunosuppressive condition: 11 of the 17 TB cases who were immunosuppressed had one immunosuppressive condition (65%), while the other 6 subjects (35%) had more than 1 immunosuppressive condition.

Among the 31,700 contacts who were not immunosuppressed at the time of exposure to the source case, 211 developed TB (TB rate 666/100,000), while 17 developed TB among the 1,446 immunosuppressed contacts (TB rate 1,176/100,000). The difference between the groups was statistically significant (χ² = 5.3; p-value = 0.022).

The risk of TB in contacts who were immunosuppressed was almost twice as high as the risk for contacts who were not immunosuppressed. This difference in risk was statistically significant, as observed in Table 25.

Table 25. Cox regression. Only immunosuppression in the model.
Variable | B | Hazard ratio (95% CI) | P-value
Being immunosuppressed* | 0.645 | 1.91 (1.16 to 3.12) | 0.011
*Compared to contacts who were not immunosuppressed. Model -2 LL = 4,692.8

The survival curve in Figure 34 shows the risk of TB according to the immune status of the contacts.

Figure 34. Survival curve. TB occurrence by immune status (curves: non-immunosuppressed, immunosuppressed; x-axis: time in years).

The risk of TB development was higher for the immunosuppressed contacts throughout the observation period.

When the type of immunosuppressive condition was considered, the following results were observed.
Among 34 contacts with malnutrition, 5 developed TB during the observation period (TB rate 14,706/100,000); among 292 contacts with alcoholism, 13 developed TB (TB rate 4,452/100,000); among 146 receiving corticosteroids at the time of the exposure, 2 developed TB (TB rate 1,370/100,000); among 501 with malignancy, 6 developed TB (TB rate 1,198/100,000); and among 641 contacts with diabetes mellitus, 4 developed TB (TB rate 624/100,000). None of the 98 contacts with renal failure, the 28 contacts with carcinoma in situ, the 13 contacts who received antineoplastic medications (at the time of contact), the 6 contacts who had an organ transplant, or the 7 contacts with aplastic anemia developed TB during the observation period. The difference between groups was statistically significant (χ² = 164; p-value < 0.0001).

The risk of TB in contacts with malnutrition was overall the highest, being 28 times the risk of non-immunosuppressed contacts, as shown in Table 26. The risk of TB for contacts with alcoholism was 7.5 times the risk of non-immunosuppressed contacts and statistically significant. The risk of TB was also increased in contacts with malignancy and in those who received corticosteroids around the time of the TB exposure; however, these increases were not statistically significant. The risk of TB among contacts with DM (type I and type II, since no differentiation is made in the databases) was not different from that in non-immunosuppressed contacts, and the same was observed for those with renal failure. Contacts with organ transplant, aplastic anemia and those receiving antineoplastics did not have an increased risk of developing TB compared to the non-immunosuppressed contacts (there were no TB cases in these groups); however, the number of subjects at risk in these 3 groups was very small (≤ 13 subjects each).
Table 26 presents the hazard ratios only for the immunosuppressive conditions for which TB cases were observed during the follow-up period, and for which the hazard ratio calculations could therefore be performed.

Table 26. Cox regression. Only immunosuppressive conditions in the model.
Variable | B | Hazard ratio (95% CI) | P-value
Malnutrition* | 3.345 | 28.37 (11.67 to 68.96) | <0.001
Alcoholism* | 2.02 | 7.52 (4.29 to 13.18) | <0.001
Corticosteroids* | 0.83 | 2.29 (0.57 to 9.24) | 0.243
Malignancy* | 0.73 | 2.08 (0.92 to 4.70) | 0.076
DM* | 0.04 | 1.04 (0.39 to 2.81) | 0.931
*Compared to contacts who were not immunosuppressed. Model -2 LL = 4,640.7

The Kaplan-Meier survival curve in Figure 35 shows the risk of TB according to the immunosuppressive condition of the contacts.

Figure 35. Survival curve. TB occurrence by immunosuppressive conditions (x-axis: time in years; one curve groups organ transplant, aplastic anemia, antineoplastics and CA in situ).

As observed, the increased TB risk for contacts with malnutrition and alcoholism is present throughout the observation period. The line of contacts with DM overlaps the line of contacts with no immunosuppression (none). The line for renal failure, organ transplant, aplastic anemia, antineoplastics and carcinoma in situ remains flat since there were no TB cases in contacts with these immunosuppressive conditions.

High-risk subjects. As shown in Figure 36, among the 228 TB cases, 25 of them (11%) were high-risk: injection drug users, recent arrivals (<5 y) from a country with high TB prevalence, residents or employees in high-risk settings and hospital personnel.

Figure 36. TB cases. High-risk subjects.

Among the 25 high-risk subjects who developed TB, 5 (20%) were injection drug users; 15 (60%) were recent arrivals from a country with high TB prevalence (CHTB); and 6 (24%) were residents or employees in a high-risk setting (one of these subjects was an injection drug user).
There were no hospital personnel among the high-risk subjects who developed TB.

Among the 28,282 contacts who were not high-risk, 203 developed TB (TB rate 718/100,000); 25 developed TB among the 4,864 high-risk contacts (TB rate 514/100,000). The difference between the groups was not statistically significant (χ² = 2.52; p-value = 0.112).

However, when the individual high-risk factors were considered, it was found that among the 126 contacts who were injection drug users, 5 developed TB during the observation period (TB rate 3,968/100,000), which was statistically significant (Fisher's exact test p-value = 0.002). Among the 1,241 contacts who were recent arrivals from a CHTB, 15 developed TB (TB rate 1,209/100,000), which was statistically significant (χ² = 4.38; p-value = 0.036). Among 906 residents or employees in high-risk settings, 6 developed TB (TB rate 662/100,000); this result was not statistically significant (χ² = 0.76; p-value = 0.382). There were no TB cases among the 2,797 contacts who were hospital personnel during the observation period.

The risk of TB in contacts who were injection drug users was more than 5 times the risk for contacts who were not high-risk. This difference in risk was statistically significant, as observed in Table 27. Contacts who were recent arrivals from a CHTB had 1.7 times the risk of developing TB compared with contacts who were not high-risk, and this was also statistically significant. Being a resident or employee of a high-risk setting did not increase the risk of developing TB. The hospital personnel risk is not provided since no contact in this group developed TB.

Table 27. Cox regression. Only high-risk in the model.
Variable | B | Hazard ratio (95% CI) | P-value
Injection drug users* | 1.73 | 5.64 (2.32 to 13.71) | <0.001
Recent arrival from CHTB* | 0.536 | 1.71 (1.01 to 2.89) | 0.045
Resident/employee of high-risk setting* | -0.219 | 0.803 (0.33 to 1.95) | 0.803
*Compared to contacts who were not high-risk.
Model -2 LL = 4,647.7

The survival curve for the risk of TB according to the high-risk status of the contacts is shown in Figure 37.

Figure 37. Survival curve. TB occurrence in high-risk subjects (x-axis: time in years).

The high risk of TB for the injection drug users and for subjects who were recent arrivals (<5 years) from a CHTB was observed throughout the observation period. The rate of TB development for the residents or employees in high-risk settings is similar to that of the contacts who were not high-risk, and the curves overlap after 8 years of follow-up. The line for the hospital personnel remains flat since there were no TB cases in this group.

TST size. The TST size (of the first TST applied after exposure) among the 201 contacts who developed TB (and for whom the TST results were available) did not have a normal distribution, as observed in Figure 38; it is bimodal and skewed to the right. The median TST size was 15 mm and the range was from 0 to 80 mm.

Figure 38. TB cases. Distribution of TST size (1st applied) (histogram; x-axis: TST size in mm).

Because of the observed distribution of TST size, it was decided to do the analysis with TST size categorized as: 0 mm, 1-4 mm, 5-9 mm, 10-14 mm and ≥ 15 mm. The distribution of the categorized TST size among the contacts who developed TB is shown in Figure 39.

Among the 228 contacts who developed TB, the TST size was available for 201 (88%) of them. Among these, there were 54 subjects (27%) with a TST size of 0 mm after contact; 1 subject (0.5%) in the 1-4 mm category; 6 subjects (3%) in the 5-9 mm category; 37 subjects (18.5%) in the 10-14 mm category and 103 subjects (51%) in the ≥ 15 mm category. Hence, 27.5% of the contacts who developed TB had a TST size <5 mm (27% had 0 mm), while 72.5% had a TST size ≥ 5 mm. The distribution of TST size categories among the contacts who developed TB is ascending for those who had some response in the TST.

Figure 39. TB cases.
Distribution of TST size categorized (1st applied) (bar chart; categories: 0 mm, 1-4 mm, 5-9 mm, 10-14 mm, ≥15 mm).

Among the 21,670 contacts who had a TST of 0 mm, 54 developed TB during the observation period (TB rate 249/100,000); among the 446 contacts in the TST size category 1-4 mm, 1 developed TB (TB rate 224/100,000); among the 1,143 in the 5-9 mm category, 6 developed TB (TB rate 525/100,000); among the 2,294 in the 10-14 mm TST size category, 37 developed TB (TB rate 1,613/100,000); and 103 of the 3,087 contacts in the TST size category ≥ 15 mm developed TB (TB rate 3,337/100,000). The differences in TB rate among these groups were statistically significant (χ² = 400.5; p-value < 0.001).

The risk of TB development during the observation period increased progressively starting with those who had a TST size ≥ 5 mm. However, the increase was not statistically significant for those in the 5-9 mm category; it was statistically significant for all the larger TST size categories, as observed in Table 28.

Table 28. Cox regression. Only TST size (first applied) in the model.
Variable | B | Hazard ratio (95% CI) | P-value
1-4 mm* | -0.101 | 0.904 (0.125 to 6.54) | 0.921
5-9 mm* | 0.758 | 2.13 (0.92 to 4.96) | 0.078
10-14 mm* | 1.89 | 6.62 (4.35 to 10.05) | <0.001
≥ 15 mm* | 2.63 | 13.9 (9.99 to 19.30) | <0.001
*Compared to contacts with TST size = 0 mm. Model -2 LL = 3,817.4

The survival curve for TB occurrence according to the TST size categories is shown in Figure 40.

Figure 40. Survival curve. TB occurrence by TST size (1st TST) (x-axis: time in years).

As observed in the survival curve, the TB risk for contacts with a TST of 0 mm and those with 1-4 mm is practically the same during the observation period; because of this, and because there was only 1 TB case in the 1-4 mm group, these 2 categories will be collapsed into one (0-4 mm) for subsequent analysis.
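The TB rates and the overall test of association quoted above for the first TST can be recomputed from the counts in the text. The sketch below assumes that the reported χ² is a plain Pearson chi-square on the 5 x 2 table (TST category by TB / no TB), which the text does not state explicitly; the function and variable names are illustrative.

```python
def pearson_chi_square(cases, totals):
    """Pearson chi-square for an r x 2 table (TB vs. no TB by category).

    Returns the statistic and its degrees of freedom (r - 1).
    """
    overall_risk = sum(cases) / sum(totals)
    chi2 = 0.0
    for observed, n in zip(cases, totals):
        expected_cases = n * overall_risk
        expected_non_cases = n * (1 - overall_risk)
        chi2 += (observed - expected_cases) ** 2 / expected_cases
        chi2 += ((n - observed) - expected_non_cases) ** 2 / expected_non_cases
    return chi2, len(cases) - 1


# Counts quoted in the text for the first TST applied after exposure.
categories = ["0 mm", "1-4 mm", "5-9 mm", "10-14 mm", ">=15 mm"]
contacts = [21670, 446, 1143, 2294, 3087]
tb_cases = [54, 1, 6, 37, 103]

rates = [round(c / n * 100_000) for c, n in zip(tb_cases, contacts)]
chi2, df = pearson_chi_square(tb_cases, contacts)

for label, rate in zip(categories, rates):
    print(f"{label:>8}: {rate:,}/100,000")
print(f"Pearson chi-square = {chi2:.1f} on {df} degrees of freedom")
```

Under that assumption the statistic comes out close to the reported 400.5, and the per-category rates match those given in the text.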
The TB risk for contacts in the 5-9 mm category started to increase after 6 years of follow-up, while the risk for the 10-14 mm and ≥ 15 mm categories increases sharply in the first 1-2 years, then remains the same after 4 years for the 10-14 mm group, and after 8 years for the ≥ 15 mm group.

Since contacts in British Columbia have a second TST performed when the first one is not positive, or if it is done immediately after the time of contact, the risk for these contacts was also assessed. Among the 28,640 contacts who had a TST after becoming exposed to the TB source, there were 12,694 (44%) who had a second TST, and 51 TB cases among them (0.4%).

Because of the small number of contacts with a 2nd TST who developed TB, it was decided to perform all the subsequent analyses with the maximum size of the 1st or 2nd TST applied to contacts. The following analysis therefore considers the maximum TST size obtained from either the first or second TST applied after the contacts became exposed to the TB source.

The distribution of the TST size (maximum of 1st or 2nd TST applied) among the 201 contacts who developed TB (and for whom the TST results were available) tends to be more normal than the one obtained from the first TST, as observed in Figure 41; however, it is still skewed to the right. The median TST size was 16 mm, the mode was 15 mm and the range was from 0 to 80 mm.

Figure 41. TB cases. Distribution of TST size (maximum of 1st or 2nd TST) (histogram; x-axis: TST size in mm).

The maximum TST size of the first and second TST applied after exposure was also categorized as before. The distribution of the TST size categories for the 201 contacts who developed TB and for whom the TST size was available is shown in Figure 42. There were 19 subjects (9.5%) with a TST size of 0 mm, which is a considerable reduction (from 27%) in the proportion of subjects with no TST response compared to the first TST applied.
There was still only 1 subject (0.5%) in the 1-4 mm TST size category; there were 5 subjects (2.5%) in the 5-9 mm category; 43 subjects (21.5%) in the 10-14 mm category and 133 subjects (66%) in the ≥ 15 mm TST size category. Thus, most of the subjects who had 0 mm in the first TST applied after exposure had a substantial increase in the TST response and were in the 10-14 mm and ≥ 15 mm categories.

Figure 42. TB cases. Distribution of TST size categorized (1st/2nd TST) (bar chart; categories: 0 mm, 1-4 mm, 5-9 mm, 10-14 mm, ≥15 mm).

Among the 28,640 contacts with TST results available, there were 892 contacts (3%) who had 0 mm in the first TST and ≥ 5 mm in the second TST. None of these contacts had had a previous positive TST (stated or documented in the DTB databases) and only 8 (0.9%) of them had a chest x-ray with abnormalities compatible with previous tuberculous disease. Thus, it seems that the majority of these contacts (99.1%) converted (to positive) when the second TST was applied.

Among the 20,682 contacts who had a TST of 0 mm, 19 developed TB during the observation period (TB rate 92/100,000); only 1 of the 397 contacts in the 1-4 mm TST size category developed TB (TB rate 252/100,000); among the 1,215 contacts in the 5-9 mm category, 5 developed TB (TB rate 412/100,000); among the 2,798 contacts in the 10-14 mm category, 43 developed TB (TB rate 1,537/100,000); and 133 of the 3,548 contacts in the TST size category ≥ 15 mm developed TB (TB rate 3,749/100,000). The differences in TB rate among these groups were statistically significant (χ² = 613.6; p-value < 0.001) and the χ² was much larger than the one obtained with the first TST applied.

The risk of TB development during the observation period increased progressively starting in the contacts who had a TST size ≥ 5 mm, and the increase is statistically significant, as observed in Table 29.
The risk of TB for the contacts in the 1-4 mm TST size category does not differ significantly from that of the contacts who had no response to the TST.

Table 29. Cox regression. Only TST size (maximum of 1st or 2nd) in the model.
Variable | B | Hazard ratio (95% CI) | P-value
1-4 mm* | 1.01 | 2.74 (0.37 to 20.50) | 0.325
5-9 mm* | 1.51 | 4.55 (1.70 to 12.19) | 0.003
10-14 mm* | 2.84 | 17.10 (9.97 to 29.34) | <0.001
≥ 15 mm* | 3.75 | 42.35 (26.18 to 68.48) | <0.001
*Compared to contacts with TST size of 0 mm. Model -2 LL = 3

The survival curve for TB occurrence according to the TST size categories for the maximum size of the 1st or 2nd TST is shown in Figure 43.

Figure 43. Survival curve. TB occurrence by TST size (1st or 2nd TST). [x-axis: Time (years).]

As observed in the curve, the risk of TB for the contacts in the 1-4 mm TST size category is slightly increased compared to contacts with a TST size of 0 mm throughout the observation period. The risk of TB for contacts in the 5-9 mm category is higher from the beginning compared to the contacts with 0 mm, but increases more after 6 years of follow-up. The risk for the 10-14 mm and ≥ 15 mm categories increases sharply in the first 1-2 years and continues to increase for up to 8 years of follow-up, after which it plateaus.

Previous BCG.
Among the 228 contacts who developed TB, 41 had BCG (18%) and 120 did not have BCG (52.6%). For the remaining 67 TB cases (29.4%) the BCG status was unknown, as shown in Figure 44.

Figure 44. TB cases. Distribution of BCG status. [x-axis: BCG status (negative, positive, uncertain).]

Among the 14,302 contacts who did not have BCG, 120 developed TB (TB rate 839/100,000); among the 6,747 contacts with previous BCG, 41 developed TB (TB rate 608/100,000); there were 67 TB cases among the 12,097 contacts for whom the BCG status was uncertain (TB rate 554/100,000). The difference in TB rates between the groups was statistically significant (χ² = 8.6; p-value = 0.014).
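The χ² comparison of TB rates by BCG status can be checked from the counts given in the text. The sketch below assumes the statistic is an ordinary Pearson χ² on the 3 x 2 table of BCG group by TB outcome (the thesis does not state which variant was used); the function name `pearson_chi2` is illustrative.

```python
import math

# (TB cases, contacts) by BCG status, as quoted in the text.
groups = {
    "no BCG": (120, 14_302),
    "BCG": (41, 6_747),
    "uncertain": (67, 12_097),
}


def pearson_chi2(table):
    """Pearson chi-square for a groups-by-outcome (TB / no TB) table."""
    total_cases = sum(cases for cases, _ in table.values())
    total_n = sum(n for _, n in table.values())
    chi2 = 0.0
    for cases, n in table.values():
        expected_cases = n * total_cases / total_n
        expected_non_cases = n - expected_cases
        chi2 += (cases - expected_cases) ** 2 / expected_cases
        chi2 += ((n - cases) - expected_non_cases) ** 2 / expected_non_cases
    return chi2


chi2 = pearson_chi2(groups)

# Three groups and two outcomes give 2 degrees of freedom; for df = 2 the
# upper-tail probability of the chi-square distribution is exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi-square = {chi2:.1f}, p-value = {p_value:.3f}")
```

The closed-form tail probability only holds for 2 degrees of freedom; comparisons with more groups would need a general χ² distribution function.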
The risk of TB was slightly reduced in contacts with BCG and also in contacts with uncertain BCG status; however, only the reduction in the latter group was statistically significant, as shown in Table 30.

Table 30. Cox regression. Only BCG status in the model.
Variable | B | Hazard ratio (95% CI) | P-value
BCG positive* | -0.308 | 0.735 (0.51 to 1.047) | 0.088
BCG uncertain* | -0.396 | 0.673 (0.50 to 0.91) | 0.010
*Compared to contacts with no BCG. Model -2 LL = 4,690.5

The survival curve for the occurrence of TB according to the BCG status is shown in Figure 45.

Figure 45. Survival curve. TB occurrence by BCG status. [x-axis: Time (years).]

As observed in the survival curve, the risk of TB for contacts with no BCG and those with BCG was the same up to 6 years of follow-up; after this period the risk decreased slightly for the BCG group; however, the confidence intervals overlapped. The risk of TB for contacts with uncertain BCG status was higher from the beginning of the follow-up and remained high throughout the observation period.

Geographical SE status.
Among the 219 contacts who developed TB and for whom the geographical SE status was available, the distribution of the SE quintiles, shown in Figure 46, was as follows: 64 of the subjects (29.2%) were in the 1st quintile; 61 (27.9%) were in the 2nd quintile; 51 (23.3%) were in the 3rd quintile; 30 (13.7%) were in the 4th quintile; and 13 (5.9%) were in the 5th SE quintile. The 1st quintile is the lowest SE status and the 5th quintile the highest.

Figure 46. TB cases.
Distribution of geographical socioeconomic status. [Bar chart; x-axis: geographical SE quintile.]

Among the 7,912 contacts who were in the 1st SE quintile, 64 developed TB during the observation period (TB rate 809/100,000); among the 7,001 contacts in the 2nd quintile, 61 developed TB (TB rate 871/100,000); among the 6,280 in the 3rd quintile, 51 developed TB (TB rate 812/100,000); among the 5,921 in the 4th quintile, 30 developed TB (TB rate 507/100,000); and only 13 contacts among the 5,159 in the 5th quintile developed TB (TB rate 252/100,000). The difference in TB rates among groups was statistically significant (χ² = 24; p-value < 0.001).

As observed in Table 31, the risk of TB is very similar for the contacts in the 1st, 2nd and 3rd SE quintiles, and the risk for these 3 groups is more than 3 times that of the contacts in the 5th quintile. Contacts in the 4th quintile have an intermediate risk between those in the 5th quintile and those in the first 3 quintiles. The differences in TB risk of the contacts in quintiles 1 to 4 compared with those in the 5th quintile are statistically significant.

Table 31. Cox regression. Only SE status in the model.
Variable | B | Hazard ratio (95% CI) | P-value
1st Quintile* | 1.18 | 3.26 (1.80 to 5.92) | <0.001
2nd Quintile* | 1.25 | 3.50 (1.92 to 6.37) | <0.001
3rd Quintile* | 1.174 | 3.24 (1.76 to 5.95) | <0.001
4th Quintile* | 0.700 | 2.01 (1.05 to 3.86) | 0.035
*Compared to contacts in the 5th quintile. Model -2 LL = 4,476.2

The survival curves of the different SE quintiles are presented in Figure 47.

Figure 47. Survival curve. TB occurrence by SE status (quintiles). [x-axis: Time (years).]

As observed, the curves of the contacts in the 1st, 2nd and 3rd quintiles overlap (and cross each other) for most of the follow-up time.
Because of the similar risks among the contacts in the 1st to 3rd quintiles and because of the crossover of their survival curves (the proportional hazards assumption does not hold for these 3 groups), it was decided to aggregate these 3 groups into a single one for the final analysis. When the geographical SE status was transformed into 3 categories, among the 21,193 contacts in the 1st, 2nd and 3rd quintiles there were 176 subjects who developed TB during the observation period (TB rate 830/100,000). The TB cases among contacts in the 4th and 5th quintiles remained the same. The hazard ratios from the Cox model for the geographical SE status with the 3 categories are presented in Table 32.

Table 32. Cox regression. Only SE status (3 categories) in the model.
Variable | B | Hazard ratio (95% CI) | P-value
1st, 2nd and 3rd Quintiles* | 1.20 | 3.33 (1.90 to 5.85) | <0.001
4th Quintile* | 0.700 | 2.01 (1.05 to 3.86) | 0.035
*Compared to contacts in the 5th quintile. Model -2 LL = 4,476.4

The Kaplan-Meier survival curve in Figure 48 shows the occurrence of TB for the 3 geographical SE categories.

Figure 48. Survival curve. TB occurrence by SE status (reduced quintiles). [x-axis: Time (years).]

The risk of TB for contacts in the 4th SE quintile, and especially for those in the 1st to 3rd quintiles, is higher than the risk for contacts in the 5th quintile throughout the follow-up period.

Previous TST response.
Among the 228 contacts who developed TB, there were 11 (4.8%) who had a previous TST response (documented as > 0 mm in the DTB databases, or stated by the contact). The remaining 217 TB cases (95.2%) did not have a previous TST response. Among the 31,445 contacts with no previous TST response, 217 developed TB during the observation period (TB rate 690/100,000); 11 of the 1,701 contacts with a previous TST response developed TB (TB rate 647/100,000).
This difference was not statistically significant (χ² = 0.045; p-value = 0.833). The risk of TB for contacts with a previous TST response was very similar to the risk in contacts with no previous TST response, as observed in Table 33.

Table 33. Cox regression. Only previous TST response in the model.
Variable | B | Hazard ratio (95% CI) | P-value
Previous TST response* | -0.069 | 0.934 (0.509 to 1.71) | 0.824
*Compared to contacts with no previous TST response. Model -2 LL = 4,698.2

The survival curves for the occurrence of TB in contacts with and without a previous TST response are shown in Figure 49.

Figure 49. Survival curve. TB occurrence by previous TST response.

As observed in the figure, the line of the contacts with a previous TST response crosses the line of those with no previous TST response, and, as mentioned before, the overall risk during the observation period for the 2 groups is very similar. Because of this, and also because the reliability of the information regarding previous TST response in the DTB databases is unknown, this variable will not be included in the final model.

Chest radiograph abnormalities compatible with previous tuberculous disease.
Among the 228 contacts who developed TB, 2 (1%) had radiological abnormalities compatible with previous tuberculous disease in the chest x-ray, while the remaining 226 (99%) did not. Of the 32,925 contacts who did not have chest radiograph abnormalities (suggestive of previous tuberculous disease), 226 developed TB during the observation period (TB rate 686/100,000); among the 221 with radiological abnormalities compatible with previous tuberculous disease in the chest x-ray, 2 developed TB (TB rate 905/100,000). This difference in TB rates was not statistically significant (Fisher's Exact test p-value = 0.666).
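With only 2 TB cases in the abnormal chest x-ray group, an exact test is the natural choice. The sketch below computes a two-sided Fisher's exact p-value from the counts in the text, assuming the common definition that sums the probabilities of all tables no more likely than the observed one (the thesis does not say which two-sided definition its software used); the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)

    def prob(k: int) -> float:
        # Probability that the top-left cell equals k given fixed margins.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    k_min = max(0, row1 + col1 - total)
    k_max = min(row1, col1)
    probs = [prob(k) for k in range(k_min, k_max + 1)]
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in probs if p <= p_obs * (1 + 1e-9))


# Rows: abnormal chest x-ray (2 TB, 219 no TB); no abnormality (226 TB, 32,699 no TB).
p_value = fisher_exact_two_sided(2, 219, 226, 32_699)
print(f"Fisher's exact two-sided p-value = {p_value:.3f}")
```

Under this definition the result should land close to the reported 0.666; small differences are possible if a different two-sided rule was used.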
The number of contacts with radiographic abnormalities compatible with previous TB was very small, as was the number of TB cases that developed among them. This is partly explained by the absence in the DTB databases of a variable specifically designed to collect this information (as there is for all the other prognostic factors), and partly by the fact that most contacts with radiological abnormalities compatible with previous TB were excluded from the analysis along with the contacts who had had TB previously; in fact, most contacts with such radiological abnormalities are within the group of contacts who had TB previously. The risk of TB in the contacts with chest radiographic abnormalities suggestive of previous tuberculous disease was no different from that of the contacts with no abnormalities, as observed in Table 34. This finding can also be explained by the small number of TB cases in contacts with radiographic abnormalities, as explained previously.

Table 34. Cox regression. Only abnormal chest x-ray in the model.
Variable | B | Hazard ratio (95% CI) | P-value
Old TB in chest x-ray* | 0.317 | 1.37 (0.34 to 5.52) | 0.655
*Compared to contacts with no abnormalities in chest x-ray. Model -2 LL = 4,698.1

The survival curve of the occurrence of TB according to the presence of chest x-ray abnormalities is shown in Figure 50.

Figure 50. Survival curve. TB occurrence by old TB in chest x-ray.

As seen in Figure 50, the lines of those with chest x-ray abnormalities suggestive of previous tuberculous disease and those without abnormalities cross just before 2 years of follow-up. The risk seems to be higher in the initial 2 years for contacts with no radiological abnormalities; however, the risk is inverted afterwards. It should be noted that there were only 2 contacts who developed TB in the group with chest x-ray abnormalities.
Because of the similar overall rates of TB between the contacts with and without radiological abnormalities, the crossover of the survival curves, the small number of TB cases among contacts with chest x-ray abnormalities, the small number of contacts with chest x-ray abnormalities, and the uncertainty of its reliability (availability), this variable was not included in the final model.

The results obtained from the univariate analysis applying the Cox model to each of the prognostic factors of interest are summarized in Table 35.

Table 35. Cox regression (univariate analysis).
Variable | Reference group | B | Hazard ratio (95% CI) | P-value
Age | | -0.029 | 0.971 (0.964 to 0.979) | <0.001*
Males | Females | 0.675 | 1.96 (1.51 to 2.55) | <0.001*
LTBI treatment < 6 months | No LTBI treatment | 0.845 | 2.33 (1.38 to 3.94) | 0.002*
LTBI treatment ≥ 6 months | No LTBI treatment | -0.058 | 0.944 (0.50 to 1.78) | 0.859
Exposed to ≥ 2 TB cases | Exposed to 1 TB case | 0.706 | 2.02 (1.34 to 3.07) | 0.001*
Aboriginal | Canadian-born | 1.356 | 3.88 (2.87 to 5.25) | <0.001*
Foreign-born | Canadian-born | -0.105 | 0.900 (0.65 to 1.25) | 0.531
Close non-household | Casual contacts | 0.969 | 2.64 (1.82 to 3.83) | <0.001*
Close household | Casual contacts | 2.26 | 9.57 (6.66 to 13.75) | <0.001*
Smear positive | Non-smear positive | 0.466 | 1.59 (1.07 to 2.37) | 0.021*
Malnutrition | Non-immunosuppressed | 3.345 | 28.37 (11.67 to 68.96) | <0.001*
Alcoholism | Non-immunosuppressed | 2.02 | 7.52 (4.29 to 13.18) | <0.001*
Corticosteroids | Non-immunosuppressed | 0.83 | 2.29 (0.57 to 9.24) | 0.243
Malignancy | Non-immunosuppressed | 0.73 | 2.08 (0.92 to 4.70) | 0.076
DM | Non-immunosuppressed | 0.04 | 1.04 (0.39 to 2.81) | 0.931
Injection drug users | Non-high-risk | 1.73 | 5.64 (2.32 to 13.71) | <0.001*
Recent arrival from CHTB | Non-high-risk | 0.536 | 1.71 (1.01 to 2.89) | 0.045*
Resident/employee of high-risk setting | Non-high-risk | -0.219 | 0.803 (0.33 to 1.95) | 0.803
TST size 1-4 mm | TST = 0 mm (1st TST) | -0.101 | 0.904 (0.125 to 6.54) | 0.921
TST size 5-9 mm | TST = 0 mm (1st TST) | 0.758 | 2.13 (0.92 to 4.96) | 0.078
TST size 10-14 mm | TST = 0 mm (1st TST) | 1.89 | 6.62 (4.35 to 10.05) | <0.001*
TST size ≥ 15 mm | TST = 0 mm (1st TST) | 2.63 | 13.9 (9.99 to 19.30) | <0.001*
BCG positive | No BCG | -0.308 | 0.735 (0.51 to 1.047) | 0.088
BCG uncertain | No BCG | -0.396 | 0.673 (0.50 to 0.91) | 0.010*
1st Quintile | 5th quintile | 1.18 | 3.26 (1.80 to 5.92) | <0.001*
2nd Quintile | 5th quintile | 1.25 | 3.50 (1.92 to 6.37) | <0.001*
3rd Quintile | 5th quintile | 1.174 | 3.24 (1.76 to 5.95) | <0.001*
4th Quintile | 5th quintile | 0.700 | 2.01 (1.05 to 3.86) | 0.035*
Previous TST response | No previous TST response | -0.069 | 0.934 (0.509 to 1.71) | 0.824
Old TB in chest x-ray | No chest x-ray abnormalities | 0.317 | 1.37 (0.34 to 5.52) | 0.655
*Statistically significant.

ASSESSMENT OF THE RISK FACTORS FOR TB DEVELOPMENT.

The prognostic models to assess the risk factors that best predict the development of TB were constructed using the following model building approaches in a sequential fashion (in order to obtain the final model):
A) Model including all the variables of interest.
B) Model using forward stepwise and backward stepwise selection methods.
C) Model with purposeful selection of variables (final model), which in summary includes a univariate analysis of each variable (presented previously) followed by the two multivariate analyses described previously. Variables significant (p<0.05) in both analyses (univariate and multivariate), variables of clinical interest, and confounders (variables that produce a change > 20% in other covariates when removed from the model) will be included in the model.

For each of the types of models built we used two TST sizes: a) the maximum TST size of either the first or second TST applied after contact, which was a better predictor of the development of TB than the 1st TST; and b) the TST size obtained from the first TST applied after contact.

A) Model including all variables in the model. Using the maximum of 1st or 2nd TST results.
All the prognostic factors assessed in the univariate analysis were included in this model. The results obtained from simultaneously entering in the model all the prognostic factors of interest are presented in Table 36.

Table 36. Cox regression model with all variables included (1st or 2nd TST).
Variable | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper
Age* | -.060 | .006 | .000 | .942 | .931 | .953
Gender | .124 | .172 | .471 | 1.132 | .808 | 1.585
Canadian-born | | | .000# | | |
Aboriginal | .371 | .226 | .101 | 1.449 | .930 | 2.257
Foreign-born* | -1.577 | .277 | .000 | .207 | .120 | .356
No immunosuppressive conditions | | | .002# | | |
Carcinoma in situ | -14.274 | 11791.504 | .999 | .000 | .000 | NC
Chronic renal failure | -12.360 | 953.081 | .990 | .000 | .000 | NC
Antineoplastics | -14.235 | 24471.660 | 1.000 | .000 | .000 | NC
Organ transplant | -13.712 | 21524.331 | .999 | .000 | .000 | NC
Aplastic anemia | -14.534 | 38423.585 | 1.000 | .000 | .000 | NC
Diabetes mellitus | .735 | .603 | .223 | 2.085 | .640 | 6.793
Malignancy* | 1.673 | .540 | .002 | 5.330 | 1.850 | 15.354
Corticosteroids | 1.415 | .739 | .055 | 4.116 | .968 | 17.511
OH* | 1.102 | .450 | .014 | 3.010 | 1.246 | 7.272
Malnutrition* | 3.384 | 1.032 | .001 | 29.479 | 3.902 | 222.682
BCG negative | | | .000# | | |
BCG positive* | -1.460 | .257 | .000 | .232 | .140 | .384
BCG unknown* | -.476 | .228 | .037 | .621 | .397 | .971
Exposed to ≥ 2 TB cases | .566 | .335 | .091 | 1.761 | .914 | 3.394
Casual contact | | | .000# | | |
Household contact* | 1.902 | .253 | .000 | 6.702 | 4.082 | 11.003
Non-household contact* | .924 | .255 | .000 | 2.518 | 1.527 | 4.154
LTBI Rx ≥ 6 months | | | .000# | | |
No LTBI Rx* | 3.589 | .467 | .000 | 36.210 | 14.501 | 90.419
LTBI Rx < 6 months* | 1.816 | .550 | .001 | 6.145 | 2.089 | 18.074
TB source smear positive* | .491 | .258 | .057 | 1.634 | .985 | 2.709
TST size 0 mm | | | .000# | | |
TST size 1-4 mm* | 2.019 | 1.046 | .054 | 7.531 | .969 | 58.555
TST size 5-9 mm* | 2.677 | .593 | .000 | 14.549 | 4.547 | 46.549
TST size 10-14 mm* | 4.250 | .369 | .000 | 70.101 | 34.018 | 144.458
TST size ≥ 15 mm* | 5.101 | .337 | .000 | 164.168 | 84.810 | 317.783
Previous positive response+ | | | | | |
SE status quintile 5 | | | .076# | | |
SE status quintile 1 | .748 | .442 | .091 | 2.112 | .888 | 5.026
SE status quintile 2* | 1.031 | .445 | .020 | 2.805 | 1.174 | 6.703
SE status quintile 3* | 1.132 | .446 | .011 | 3.102 | 1.294 | 7.439
SE status quintile 4 | .744 | .465 | .110 | 2.104 | .846 | 5.235
Injection drug user* | 1.768 | .540 | .001 | 5.858 | 2.034 | 16.870
Recent arrival from CHTB* | .719 | .366 | .050 | 2.052 | 1.001 | 4.207
Resident/employee high-risk setting* | -1.468 | .725 | .043 | .230 | .056 | .954
Hospital personnel | -11.963 | 156.974 | .939 | .000 | .000 | 2.6E+128
Old TB in x-ray | -.264 | 1.013 | .795 | .768 | .106 | 5.594
B = Coefficient. SE = Standard error. *Statistically significant (p<0.05). #P-value for the variable with all its categories. +Not calculable (all contacts with PPR had missing values in some other variable). NC: not calculable (tends to infinity). Model LR = 702.8.

Since this initial multivariate analysis showed that the hazard ratios of the immunosuppressive conditions for which there were no TB cases during the follow-up period (renal failure, organ transplant, aplastic anemia, carcinoma in situ and antineoplastic medications) were not statistically significant, and since they did not produce a significant change in the coefficients of other variables when removed from the model, it was decided to perform the subsequent analysis with a variable including only the immunosuppressive conditions for which there were TB cases during the follow-up period.

Similarly, since the univariate analysis of the 5 SE status quintiles showed that the survival curves of quintiles 1 to 3 crossed each other, that the coefficients of quintiles 1 to 3 were similar in the initial model containing all 5 quintiles, and that SE status did not substantially change the coefficients of the other variables in the model when categorized in 3 groups (quintiles 1-3, quintile 4 and quintile 5), it was decided to perform the subsequent analysis with SE status categorized in these 3 groups, as shown in Table 37.
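The hazard ratio and confidence limit columns of Table 36 appear to be derived from the coefficient B and its standard error in the usual way for Cox regression output. The following Python sketch assumes a Wald-type interval, exp(B ± 1.96 × SE); the thesis does not state the interval method, and the helper name `hazard_ratio_ci` is illustrative.

```python
import math


def hazard_ratio_ci(b: float, se: float, z: float = 1.96):
    """Return (hazard ratio, lower limit, upper limit) from a Cox coefficient.

    Assumes a Wald-type 95% interval: exp(b - z * se) to exp(b + z * se).
    """
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


# Age row of Table 36: B = -.060, SE = .006.
hr, lower, upper = hazard_ratio_ci(-0.060, 0.006)
print(f"Age: HR = {hr:.3f} ({lower:.3f} to {upper:.3f})")
```

Because the published B and SE are rounded to three decimals, rows with large coefficients reproduce the tabulated hazard ratios only approximately (for example, the household contact row gives about 6.70 rather than 6.702).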
The rationale for not including the above variables with all their original categories (groups) in subsequent analyses is that there is no compelling reason for including variables that were not found to increase the risk of TB and which, in addition, do not improve the predictive capability of the model (in predicting the development of TB); on the other hand, reducing the number of parameters in the model would improve it by decreasing overfitting.

Table 37. Cox regression model with all relevant variables included (1st or 2nd TST).
Variable | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper
Age* | -.059 | .006 | .000 | .942 | .931 | .953
Gender | .135 | .171 | .428 | 1.145 | .819 | 1.600
Canadian-born | | | .000# | | |
Aboriginal | .290 | .221 | .189 | 1.336 | .867 | 2.060
Foreign-born* | -1.580 | .277 | .000 | .206 | .120 | .355
No immunosuppressive conditions | | | .000# | | |
Diabetes mellitus | .755 | .602 | .210 | 2.127 | .654 | 6.915
Malignancy* | 1.670 | .540 | .002 | 5.314 | 1.843 | 15.317
Corticosteroids | 1.426 | .737 | .053 | 4.162 | .981 | 17.662
OH* | 1.096 | .452 | .015 | 2.992 | 1.235 | 7.250
Malnutrition* | 3.472 | 1.029 | .001 | 32.196 | 4.284 | 241.989
BCG negative | | | .000# | | |
BCG positive* | -1.451 | .257 | .000 | .234 | .142 | .387
BCG unknown* | -.482 | .228 | .034 | .618 | .395 | .965
Exposed to ≥ 2 TB cases | .598 | .330 | .070 | 1.819 | .952 | 3.477
Casual contact | | | .000# | | |
Household contact* | 1.908 | .253 | .000 | 6.737 | 4.107 | 11.052
Non-household contact* | .948 | .255 | .000 | 2.580 | 1.565 | 4.254
LTBI Rx ≥ 6 months | | | .000# | | |
No LTBI Rx* | 3.589 | .467 | .000 | 36.186 | 14.490 | 90.369
LTBI Rx < 6 months* | 1.822 | .551 | .001 | 6.183 | 2.101 | 18.195
TB source smear positive | .477 | .258 | .064 | 1.611 | .972 | 2.670
TST size 0 mm | | | .000# | | |
TST size 1-4 mm | 1.981 | 1.046 | .058 | 7.250 | .932 | 56.365
TST size 5-9 mm* | 2.665 | .593 | .000 | 14.363 | 4.489 | 45.957
TST size 10-14 mm* | 4.224 | .369 | .000 | 68.289 | 33.155 | 140.652
TST size ≥ 15 mm* | 5.096 | .337 | .000 | 163.287 | 84.334 | 316.154
SE status quintile 5 | | | .067# | | |
SE status quintiles 1-3* | .947 | .423 | .025 | 2.577 | 1.125 | 5.902
SE status quintile 4 | .749 | .465 | .107 | 2.115 | .850 | 5.261
Injection drug user* | 1.688 | .538 | .002 | 5.410 | 1.883 | 15.545
Recent arrival from CHTB | .694 | .365 | .057 | 2.003 | .979 | 4.097
Resident/employee high-risk setting | -1.404 | .722 | .052 | .246 | .060 | 1.011
Hospital personnel | -11.797 | 140.265 | .933 | .000 | .000 | 1.9E+114
Old TB in x-ray | -.303 | 1.013 | .765 | .738 | .101 | 5.378
B = Coefficient. SE = Standard error. *Statistically significant (p<0.05). #P-value for the variable with all its categories. Model LR = 699.4.

The results obtained from the stepwise methods are presented in Tables 38 and 39 below.

B) Model using forward stepwise. Using the maximum of 1st or 2nd TST results.

Table 38. Cox regression model using forward stepwise (1st or 2nd TST).
Variable | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper
Age* | -.060 | .006 | .000 | .941 | .931 | .953
Canadian-born | | | .000# | | |
Aboriginal | .246 | .218 | .259 | 1.279 | .834 | 1.960
Foreign-born* | -1.485 | .243 | .000 | .226 | .141 | .365
No immunosuppressive conditions | | | .003# | | |
Diabetes mellitus | .701 | .600 | .243 | 2.015 | .622 | 6.530
Malignancy* | 1.670 | .537 | .002 | 5.314 | 1.853 | 15.237
Corticosteroids | 1.351 | .726 | .063 | 3.861 | .931 | 16.016
OH* | 1.127 | .452 | .013 | 3.087 | 1.274 | 7.484
Malnutrition* | 3.329 | 1.024 | .001 | 27.922 | 3.752 | 207.766
BCG negative | | | .000# | | |
BCG positive* | -1.443 | .255 | .000 | .236 | .143 | .389
BCG unknown* | -.487 | .226 | .031 | .614 | .394 | .958
Casual contact | | | .000# | | |
Household contact* | 1.815 | .248 | .000 | 6.141 | 3.778 | 9.982
Non-household contact* | .915 | .254 | .000 | 2.497 | 1.518 | 4.108
LTBI Rx ≥ 6 months | | | .000# | | |
No LTBI Rx* | 3.568 | .467 | .000 | 35.459 | 14.194 | 88.585
LTBI Rx < 6 months* | 1.820 | .551 | .001 | 6.171 | 2.098 | 18.154
TST size 0 mm | | | .000# | | |
TST size 1-4 mm | 1.951 | 1.045 | .062 | 7.039 | .907 | 54.611
TST size 5-9 mm* | 2.729 | .592 | .000 | 15.313 | 4.801 | 48.848
TST size 10-14 mm* | 4.331 | .366 | .000 | 76.000 | 37.096 | 155.707
TST size ≥ 15 mm* | 5.208 | .335 | .000 | 182.742 | 94.806 | 352.243
SE status quintile 5 | | | .019# | | |
SE status quintiles 1-3* | .998 | .422 | .018 | 2.714 | 1.188 | 6.201
SE status quintile 4 | .770 | .464 | .097 | 2.161 | .871 | 5.363
Injection drug user* | 1.731 | .536 | .001 | 5.644 | 1.975 | 16.124
Resident/employee high-risk setting | -1.336 | .719 | .063 | .263 | .064 | 1.076
B = Coefficient. SE = Standard error. *Statistically significant (p<0.05). #P-value for the variable with all its categories. Model LR = 690.1.

B) Model using backward stepwise. Using the maximum of 1st or 2nd TST results.

Table 39. Cox regression model using backward stepwise (1st or 2nd TST).
Variable | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper
Age* | -.059 | .006 | .000 | .942 | .931 | .953
Canadian-born | | | .000# | | |
Aboriginal | .266 | .218 | .224 | 1.304 | .850 | 2.001
Foreign-born* | -1.589 | .277 | .000 | .204 | .119 | .352
No immunosuppressive conditions | | | .003# | | |
Diabetes mellitus | .748 | .601 | .213 | 2.113 | .651 | 6.861
Malignancy* | 1.670 | .539 | .002 | 5.315 | 1.848 | 15.287
Corticosteroids | 1.399 | .737 | .058 | 4.051 | .956 | 17.171
OH* | 1.100 | .453 | .015 | 3.003 | 1.237 | 7.290
Malnutrition* | 3.450 | 1.029 | .001 | 31.486 | 4.189 | 236.644
BCG negative | | | .000# | | |
BCG positive* | -1.448 | .256 | .000 | .235 | .142 | .388
BCG unknown* | -.480 | .227 | .035 | .619 | .396 | .966
Exposed to ≥ 2 TB cases | .604 | .330 | .067 | 1.830 | .959 | 3.492
Casual contact | | | .000# | | |
Household contact* | 1.915 | .252 | .000 | 6.784 | 4.136 | 11.127
Non-household contact* | .947 | .255 | .000 | 2.578 | 1.564 | 4.248
LTBI Rx ≥ 6 months | | | .000# | | |
No LTBI Rx* | 3.593 | .467 | .000 | 36.343 | 14.550 | 90.776
LTBI Rx < 6 months* | 1.809 | .551 | .001 | 6.105 | 2.075 | 17.962
TB source smear positive | .472 | .258 | .067 | 1.603 | .968 | 2.656
TST size 0 mm | | | .000# | | |
TST size 1-4 mm | 1.945 | 1.045 | .063 | 6.997 | .902 | 54.297
TST size 5-9 mm* | 2.658 | .594 | .000 | 14.267 | 4.457 | 45.665
TST size 10-14 mm* | 4.241 | .368 | .000 | 69.454 | 33.749 | 142.937
TST size ≥ 15 mm* | 5.099 | .337 | .000 | 163.877 | 84.647 | 317.267
SE status quintile 5 | | | .029# | | |
SE status quintiles 1-3* | .957 | .422 | .023 | 2.604 | 1.138 | 5.961
SE status quintile 4 | .756 | .465 | .104 | 2.131 | .857 | 5.298
Injection drug user* | 1.726 | .538 | .001 | 5.616 | 1.958 | 16.108
Recent arrival from CHTB | .686 | .365 | .060 | 1.986 | .971 | 4.064
Resident/employee high-risk setting | -1.358 | .720 | .059 | .257 | .063 | 1.054
B = Coefficient. SE = Standard error. *Statistically significant (p<0.05). #P-value for the variable with all its categories. Model LR = 699.4.
The variables excluded from both stepwise models were gender, previous positive TST, previous tuberculous disease in chest x-rays and hospital personnel. In addition, the variable "resident or employee of a high-risk setting" was not statistically significant either in the stepwise models or in the model including all the variables.

C) Final model (model with purposeful selection of variables).

Based on the results from the previous univariate and multivariate analyses, and taking into consideration the clinical relevance, the prognostic factors of interest included in this model were the following:
• Age.
• Gender.
• LTBI treatment (no treatment, treatment < 6 months and treatment ≥ 6 months).
• Number of TB cases the contact was exposed to (1 and ≥ 2 TB cases).
• Ethnicity (Canadian-born, foreign-born and Aboriginal people).
• Closeness of contact (household, close non-household and casual).
• BCG status (positive, negative and uncertain).
• Infectivity of the TB source case (smear positive vs. smear negative).
• Immunosuppression at the time of TB exposure (malnutrition, alcoholism, use of corticosteroids, malignancy and DM).
• High-risk subjects (injection drug users and recent arrivals from a country with high TB prevalence).
• TST size (0 mm, 1-4 mm, 5-9 mm, 10-14 mm and ≥ 15 mm).
• Geographical SE status in quintiles (1st to 3rd, 4th and 5th quintiles).

The results from this model are presented in Table 40.

Table 40. Final Cox regression model (1st or 2nd TST).
Robust variance estimation.
Variable | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper
Age* | -.0566 | .005 | <.0001 | .945 | .933 | .958
Gender | .246 | .149 | .100 | 1.279 | .954 | 1.714
Canadian-born | | | <.001# | | |
Aboriginal* | .512 | .193 | .008 | 1.668 | 1.144 | 2.433
Foreign-born* | -1.402 | .231 | <.0001 | .246 | .156 | .387
No immunosuppressive conditions | | | <.001# | | |
Diabetes mellitus | .577 | .590 | .330 | 1.781 | .561 | 5.660
Malignancy* | 1.310 | .559 | .019 | 3.708 | 1.240 | 11.086
Corticosteroids* | 1.520 | .629 | .016 | 4.573 | 1.333 | 15.695
OH* | 1.069 | .410 | .009 | 2.913 | 1.304 | 6.509
Malnutrition* | 3.134 | .408 | <.0001 | 22.970 | 10.321 | 51.123
BCG negative | | | <.001# | | |
BCG positive* | -1.416 | .222 | <.0001 | .243 | .157 | .375
BCG unknown | -.407 | .208 | .050 | .666 | .443 | 1.000
Exposed to ≥ 2 TB cases* | .804 | .234 | .0006 | 2.234 | 1.412 | 3.533
Casual contact | | | <.001# | | |
Household contact* | 1.835 | .226 | <.0001 | 6.263 | 4.024 | 9.748
Non-household contact* | .838 | .212 | <.0001 | 2.312 | 1.525 | 3.505
LTBI Rx ≥ 6 months | | | <.001# | | |
No LTBI Rx* | 3.419 | .409 | <.0001 | 30.546 | 13.700 | 68.109
LTBI Rx < 6 months* | 1.713 | .482 | .0004 | 5.544 | 2.155 | 14.260
TB source smear positive* | .666 | .231 | .004 | 1.946 | 1.239 | 3.058
TST size 0 mm | | | <.001# | | |
TST size 1-4 mm | 1.786 | 1.050 | .089 | 5.966 | 0.762 | 46.681
TST size 5-9 mm* | 2.903 | .519 | <.0001 | 18.229 | 6.597 | 50.372
TST size 10-14 mm* | 4.662 | .354 | <.0001 | 105.891 | 52.936 | 211.818
TST size ≥ 15 mm* | 5.488 | .325 | <.0001 | 241.836 | 127.943 | 457.115
SE status quintile 5 | | | .072# | | |
SE status quintiles 1-3* | .876 | .394 | .026 | 2.402 | 1.110 | 5.194
SE status quintile 4 | .733 | .427 | .086 | 2.081 | .901 | 4.809
Injection drug use* | 1.137 | .550 | .039 | 3.120 | 1.061 | 9.150
Recent arrival from CHTB | .571 | .342 | .095 | 1.770 | 0.906 | 3.460
B = Coefficient. SE = Standard error. *Statistically significant (p<0.05). #P-value for the variable with all its categories. Model LR test = 1,024 on 25 df; p-value = 0.

Except for the 2 categories of SE status (which had a correlation of 0.867), no other variables had a correlation between the regression coefficients > 0.80.
The previously highly significant effect found in the univariate analysis for gender and alcoholism was substantially modified when adjusting for the other factors in the multivariate model; the risk of TB in contacts with alcoholism was still statistically significant, but the risk associated with being male was no longer statistically significant. On the other hand, the apparently similar risk of TB observed in the univariate analysis for those who did not receive LTBI treatment and for those who completed ≥ 6 months was clearly modified in the multivariate analysis; contacts who did not receive LTBI treatment had a 30 times higher risk of TB compared with contacts who completed ≥ 6 months of treatment. Similarly, the apparently similar risk of TB among Canadian-born and foreign-born contacts observed initially in the univariate analysis changed in the multivariate analysis, which showed a highly significant reduction in the risk of developing TB if contacts of TB cases are foreign-born compared to contacts who are Canadian-born. Aboriginal people had a statistically significant increased risk of TB compared to Canadian-born contacts in the multivariate analysis, but the hazard ratio was lower than that observed in the univariate analysis (the hazard ratio went from 3.88 in the univariate analysis to 1.67 in the multivariate analysis). Also, the small protective effect of BCG against the development of TB observed in the univariate analysis clearly increased after adjusting for the other prognostic factors in the multivariate analysis (hazard ratio from 0.735 in the univariate analysis to 0.243 in the multivariate analysis) and became highly statistically significant. The large predictive effect for developing TB observed for younger age, TST size ≥ 5 mm, malnutrition and type of contact (especially for household contacts) in the univariate analysis remained after adjusting for all the other variables in the model.
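The size of the shift between univariate and adjusted estimates described above can be made concrete as a percent change in the hazard ratio. The sketch below uses only the hazard ratios quoted in the text; the helper `pct_change` is illustrative, and this is not the thesis's 20% confounder criterion, which was applied to coefficients when a variable was removed from the model.

```python
def pct_change(crude: float, adjusted: float) -> float:
    """Percent change from the univariate (crude) to the adjusted hazard ratio."""
    return (adjusted - crude) / crude * 100


# (univariate HR, multivariate HR) as quoted in the text.
hazard_ratios = {
    "Aboriginal vs. Canadian-born": (3.88, 1.67),
    "BCG positive vs. no BCG": (0.735, 0.243),
}

changes = {name: pct_change(crude, adj) for name, (crude, adj) in hazard_ratios.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.0f}% after adjustment")
```

Both hazard ratios fall by more than half after adjustment, but in opposite senses: the excess risk for Aboriginal contacts shrinks toward 1, whereas the BCG hazard ratio moves further below 1.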
The effects of TST ≥ 5 mm, not having received LTBI treatment and malnutrition were especially important, with risks of TB approximately 20 times or higher in those who had any of these risk factors compared to those who did not.

The other variables that were clinically and statistically significant predictors of TB were: use of corticosteroids, malignancy, infectivity of the source case (being smear positive), lower SE status, injection drug use and being exposed to ≥ 2 TB cases. The following factors were not significant in the univariate analysis nor when adjusted for other variables in any of the multivariate analyses carried out: previous positive TST, previous tuberculous disease in chest x-rays, hospital personnel, recent arrival from a country with high TB prevalence and being a resident or employee of high-risk settings.

The model building approaches used previously for the results of the 1st or 2nd TST were applied in the same fashion to the results of the 1st TST applied to the contacts after exposure to the TB source. The results are presented in Tables 41 to 44.

A) Model including all variables in the model. Using the results of the 1st TST applied after exposure.

Table 41.
Cox regression model with all variables included (1st TST) Variable B SE P-value Hazard ratio 95% CI for hazard ratio Lower Upper Age* -.054 .006 .000 .947 .936 .958 Gender .115 .171 .502 1.121 .803 1.567 Canadian-born .000" Aboriginal .423 .220 .055 1.526 .992 2.349 Foreign-born* -1.422 .280 .000 .241 .139 .418 No immunosuppressive conditions .000" Diabetes mellitus .823 .601 .171 2.278 .701 7.406 Malignancy* 1.591 .538 .003 4.911 1.711 14.092 Corticosteroids 1.381 .737 .061 3.980 .938 16.886 OH* 1.118 .444 .012 3.060 1.280 7.311 Malnutrition* 3.700 1.035 .000 40.433 5.322 307.189 BCG negative .000* BCG positive* -1.360 .259 .000 .257 .154 .426 BCG unknown* -.497 .224 .026 .608 .392 .944 Exposed to > 2 TB cases* .822 .328 .012 2.274 1.195 4.328 Casual contact .000* Household contact* 2.204 .253 .000 9.062 5.514 14.892 Non-household contact* 1.029 .255 .000 2.798 1.696 4.616 LTBI Rx > 6 months .000" No LTBI Rx* 3.329 .472 .000 27.897 11.061 70.361 LTBI Rx < 6 months* 1.881 .553 .001 6.559 2.221 19.372 TB source smear positive* .818 .256 .001 2.265 1.373 3.738 TST size 0 mm .000* TST size 1-4 mm .715 1.017 .482 2.045 .279 15.016 TST size 5-9 mm* 1.697 .492 .001 5.457 2.079 14.325 TST size 10-14 mm* 3.185 .283 .000 24.175 13.878 42.110 TST size > 15 mm* 3.784 .235 .000 43.974 27.726 69.743 Previous positive response+ SE status quintile 5 .090* SE status quintiles 1-3* .922 .423 .029 2.513 1.097 5.760 SE status quintile 4 .814 .465 .080 2.258 .907 5.622 Injection drug user* 1.748 .529 .001 5.742 2.037 16.187 Recent arrival from CHTB* .737 .364 .043 2.091 1.025 4.263 Resident/employee high-risk setting -1.156 .727 .112 .315 .076 1.309 Hospital personnel -12.297 176.653 .945 .000 .000 1.1+145 Old TB in x-ray -.116 1.013 .909 .891 .122 6.485 B = Coefficient. SE = Standard error. 'Statistically significant (p<0.05). #P-value for the variable with all its categories. +Not calculable (all contacts with PPR had missing values in some other variable). Model L R = 565.9. 
The results obtained from the stepwise methods are presented in Tables 42 and 43.

B) Model using forward stepwise. Using the results of the 1st TST applied after exposure.

Table 42. Cox regression model using forward stepwise (1st TST)

| Variable | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| Age* | -.056 | .006 | .000 | .946 | .935 | .957 |
| Canadian-born | | | .000# | | | |
| Aboriginal* | .432 | .217 | .047 | 1.540 | 1.006 | 2.359 |
| Foreign-born* | -1.385 | .279 | .000 | .250 | .145 | .432 |
| No immunosuppressive conditions | | | .002# | | | |
| Diabetes mellitus | .816 | .600 | .174 | 2.261 | .698 | 7.329 |
| Malignancy* | 1.598 | .539 | .003 | 4.944 | 1.721 | 14.207 |
| Corticosteroids* | 1.450 | .737 | .049 | 4.263 | 1.006 | 18.063 |
| Alcoholism (OH)* | 1.167 | .442 | .008 | 3.211 | 1.349 | 7.642 |
| Malnutrition* | 3.782 | 1.032 | .000 | 43.904 | 5.814 | 331.542 |
| BCG negative | | | .000# | | | |
| BCG positive* | -1.389 | .257 | .000 | .249 | .151 | .413 |
| BCG unknown* | -.503 | .223 | .024 | .605 | .391 | .936 |
| Exposed to ≥ 2 TB cases* | .866 | .328 | .008 | 2.378 | 1.250 | 4.523 |
| Casual contact | | | .000# | | | |
| Household contact* | 2.230 | .253 | .000 | 9.296 | 5.658 | 15.276 |
| Non-household contact* | 1.048 | .255 | .000 | 2.852 | 1.730 | 4.704 |
| LTBI Rx ≥ 6 months | | | .000# | | | |
| No LTBI Rx* | 3.339 | .472 | .000 | 28.201 | 11.189 | 71.078 |
| LTBI Rx < 6 months* | 1.845 | .552 | .001 | 6.328 | 2.144 | 18.676 |
| TB source smear positive* | .819 | .255 | .001 | 2.268 | 1.375 | 3.740 |
| TST size 0 mm | | | .000# | | | |
| TST size 1-4 mm | .606 | 1.016 | .551 | 1.833 | .250 | 13.434 |
| TST size 5-9 mm* | 1.632 | .490 | .001 | 5.113 | 1.957 | 13.361 |
| TST size 10-14 mm* | 3.224 | .281 | .000 | 25.135 | 14.500 | 43.571 |
| TST size ≥ 15 mm* | 3.817 | .234 | .000 | 45.466 | 28.723 | 71.969 |
| Injection drug user* | 1.606 | .520 | .002 | 4.981 | 1.798 | 13.798 |
| Recent arrival from CHTB* | .782 | .363 | .031 | 2.187 | 1.073 | 4.456 |

B = Coefficient. SE = Standard error. *Statistically significant (p<0.05). #P-value for the variable with all its categories. Model LR = 556.7; d.f. 23.

B) Model using backward stepwise. Using the results of the 1st TST applied after exposure.

Table 43.
Cox regression model using backward stepwise (1st TST)

| Variable | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| Age* | -.054 | .006 | .000 | .947 | .936 | .958 |
| Canadian-born | | | .000# | | | |
| Aboriginal | .404 | .218 | .064 | 1.497 | .976 | 2.296 |
| Foreign-born* | -1.430 | .280 | .000 | .239 | .138 | .414 |
| No immunosuppressive conditions | | | .002# | | | |
| Diabetes mellitus | .825 | .601 | .170 | 2.282 | .703 | 7.409 |
| Malignancy* | 1.594 | .537 | .003 | 4.922 | 1.718 | 14.104 |
| Corticosteroids | 1.363 | .737 | .064 | 3.906 | .922 | 16.557 |
| Alcoholism (OH)* | 1.122 | .445 | .012 | 3.071 | 1.284 | 7.345 |
| Malnutrition* | 3.693 | 1.035 | .000 | 40.165 | 5.287 | 305.156 |
| BCG negative | | | .000# | | | |
| BCG positive* | -1.365 | .258 | .000 | .255 | .154 | .424 |
| BCG unknown* | -.497 | .224 | .026 | .608 | .392 | .944 |
| Exposed to ≥ 2 TB cases* | .825 | .328 | .012 | 2.281 | 1.199 | 4.337 |
| Casual contact | | | .000# | | | |
| Household contact* | 2.211 | .253 | .000 | 9.124 | 5.553 | 14.992 |
| Non-household contact* | 1.027 | .255 | .000 | 2.792 | 1.693 | 4.605 |
| LTBI Rx ≥ 6 months | | | .000# | | | |
| No LTBI Rx* | 3.332 | .472 | .000 | 27.983 | 11.091 | 70.598 |
| LTBI Rx < 6 months* | 1.871 | .553 | .001 | 6.494 | 2.198 | 19.185 |
| TB source smear positive* | .819 | .255 | .001 | 2.267 | 1.374 | 3.741 |
| TST size 0 mm | | | .000# | | | |
| TST size 1-4 mm | .690 | 1.016 | .497 | 1.993 | .272 | 14.611 |
| TST size 5-9 mm* | 1.706 | .492 | .001 | 5.507 | 2.099 | 14.447 |
| TST size 10-14 mm* | 3.205 | .282 | .000 | 24.661 | 14.195 | 42.843 |
| TST size ≥ 15 mm* | 3.790 | .235 | .000 | 44.250 | 27.911 | 70.154 |
| SE status quintile 5 | | | .044# | | | |
| SE status quintiles 1-3* | .921 | .423 | .029 | 2.512 | 1.096 | 5.757 |
| SE status quintile 4 | .813 | .465 | .080 | 2.256 | .906 | 5.615 |
| Injection drug user* | 1.773 | .528 | .001 | 5.887 | 2.094 | 16.556 |
| Recent arrival from CHTB* | .737 | .364 | .043 | 2.089 | 1.024 | 4.261 |
| Resident/employee high-risk setting | -1.134 | .727 | .119 | .322 | .077 | 1.336 |

B = Coefficient. SE = Standard error. *Statistically significant (p<0.05). #P-value for the variable with all its categories. Model -2 log likelihood = 2,192.4.

The variables excluded from both stepwise models were previous TST response, previous tuberculous disease on chest radiographs and hospital personnel.
In addition, being a resident or employee of a high-risk setting was not statistically significant in any of the previous models.

C) Final model (model with purposeful selection of variables). Using the results of the 1st TST applied after exposure.

Based on the results of the previous univariate and multivariate analyses, and taking clinical relevance into consideration, the prognostic factors of interest included in this model were the same as those used for the maximum size of the 1st 2nd TST applied:

• Age.
• Gender.
• LTBI treatment (no treatment, treatment < 6 months and treatment ≥ 6 months).
• Number of TB cases the contact was exposed to (1 and ≥ 2 TB cases).
• Ethnicity (Canadian-born, foreign-born and Aboriginal people).
• Closeness of contact (household, close non-household and casual).
• BCG status (positive, negative and uncertain).
• Infectivity of the TB source case (smear positive vs. smear negative).
• Immunosuppression at the time of TB exposure (malnutrition, alcoholism, use of corticosteroids, malignancy and DM).
• High-risk subjects (injection drug users and recent arrivals from a country with high TB prevalence).
• TST size (0 mm, 1-4 mm, 5-9 mm, 10-14 mm and ≥ 15 mm).
• Geographical SE status in quintiles (1st to 3rd, 4th and 5th quintiles).

Hence, the same variables used in the 1st 2nd TST model are used for the 1st TST; the results of this model are presented in Table 44 below.

Table 44. Final Cox regression model (1st TST). Robust variance estimation

| Variable | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| Age* | -.053 | .007 | <.0001 | .948 | .935 | .961 |
| Gender | .186 | .153 | .230 | 1.204 | .892 | 1.626 |
| Canadian-born | | | <.001# | | | |
| Aboriginal* | .614 | .197 | .002 | 1.848 | 1.255 | 2.721 |
| Foreign-born* | -1.345 | .247 | <.0001 | .260 | .161 | .423 |
| No immunosuppressive conditions | | | <.001# | | | |
| Diabetes mellitus | .611 | .610 | .320 | 1.842 | .557 | 6.087 |
| Malignancy* | 1.255 | .569 | .027 | 3.508 | 1.149 | 10.708 |
| Corticosteroids* | 1.352 | .658 | .040 | 3.867 | 1.065 | 14.033 |
| Alcoholism (OH)* | .987 | .410 | .016 | 2.685 | 1.201 | 5.999 |
| Malnutrition* | 3.686 | .566 | <.0001 | 39.894 | 13.144 | 121.185 |
| BCG negative | | | <.001# | | | |
| BCG positive* | -1.207 | .233 | <.0001 | .299 | .189 | .473 |
| BCG unknown* | -.429 | .212 | .043 | .651 | .430 | .987 |
| Exposed to ≥ 2 TB cases* | .948 | .244 | .0001 | 2.580 | 1.600 | 4.160 |
| Casual contact | | | <.001# | | | |
| Household contact* | 2.200 | .226 | <.0001 | 9.027 | 5.799 | 14.049 |
| Non-household contact* | .907 | .217 | <.0001 | 2.476 | 1.620 | 3.785 |
| LTBI Rx ≥ 6 months | | | <.001# | | | |
| No LTBI Rx* | 3.275 | .435 | <.0001 | 26.446 | 11.280 | 62.003 |
| LTBI Rx < 6 months* | 1.757 | .487 | .0003 | 5.794 | 2.232 | 15.044 |
| TB source smear positive* | .941 | .240 | <.0001 | 2.562 | 1.602 | 4.099 |
| TST size 0 mm | | | <.001# | | | |
| TST size 1-4 mm | .451 | 1.035 | .660 | 1.570 | .206 | 11.950 |
| TST size 5-9 mm* | 1.621 | .559 | .004 | 5.059 | 1.692 | 15.124 |
| TST size 10-14 mm* | 3.539 | .288 | <.0001 | 34.445 | 19.573 | 60.618 |
| TST size ≥ 15 mm* | 4.150 | .249 | <.0001 | 63.426 | 38.953 | 103.274 |
| SE status quintile 5 | | | .031# | | | |
| SE status quintiles 1-3* | 1.000 | .393 | .011 | 2.719 | 1.258 | 5.877 |
| SE status quintile 4 | .826 | .430 | .055 | 2.284 | 0.983 | 5.308 |
| Injection drug use* | 1.241 | .525 | .018 | 3.460 | 1.237 | 9.670 |
| Recent arrival from CHTB | .593 | .340 | .082 | 1.810 | 0.929 | 3.520 |

B = Coefficient. SE = Standard error. *Statistically significant (p<0.05). #P-value for the variable with all its categories. Model LR test = 808 on 25 df; p-value = 0.

Except for the 2 categories of SE status (which had a correlation of 0.865), no other pair of regression coefficients had a correlation > 0.80.

The likelihood ratio statistic was clearly larger for the final model using the 1st 2nd TST (1,024) than for the final model using the 1st TST (808); this suggests that using the 1st 2nd TST results in a better fit and a better predictive model than using the 1st TST.
This is also supported by the fact that the increased risk of TB for contacts in the TST size categories ≥ 5 mm is better defined in the final model using the 1st 2nd TST; the increased risk of TB for the TST size category 5-9 mm was much higher when the 1st 2nd TST measures were used (hazard ratio = 18.2) than when the 1st TST results were used (hazard ratio = 5.1). Similarly, a much higher risk of TB was also observed for the TST categories of 10-14 mm and ≥ 15 mm when the results of the 1st 2nd TST were used. On the other hand, the variables that were significant in the final model using the 1st 2nd TST were also significant in the final model using the 1st TST and vice versa.

As observed in the final model using the 1st 2nd TST, in the final model using the 1st TST the highly significant effects found in the univariate analysis for gender and alcoholism were substantially modified when adjusting for the other factors in the multivariate model; the risk of TB in contacts with alcoholism was still statistically significant, but the risk associated with being male was no longer statistically significant. Similarly, the apparently similar risk of TB observed in the univariate analysis for those who did not receive LTBI treatment and for those who completed ≥ 6 months was clearly modified in the multivariate analysis; contacts that did not receive LTBI treatment had 26 times the risk of TB compared with contacts that completed ≥ 6 months of treatment. Likewise, the apparently similar risk of TB among Canadian-born and foreign-born contacts observed initially in the univariate analysis changed in the multivariate analysis, which showed a highly significant reduction in the risk of developing TB in foreign-born contacts compared to Canadian-born non-Aboriginal contacts.
Aboriginal people had a statistically significant increased risk of TB compared to Canadian-born non-Aboriginal people in the multivariate analysis, but the hazard ratio was lower than that observed in the univariate analysis (the hazard ratio went from 3.88 in the univariate analysis to 1.85 in the multivariate analysis). Also, the small protective effect of BCG against the development of TB observed in the univariate analysis clearly increased after adjusting for the other prognostic factors in the multivariate analysis (hazard ratio from 0.735 in the univariate analysis to 0.299 in the multivariate analysis) and was highly statistically significant.

The large predictive effect for developing TB observed in the univariate analysis for younger age, TST size ≥ 5 mm, malnutrition and type of contact (especially for household contacts) remained after adjusting for all the other variables in the model. The effects of TST ≥ 10 mm, not having received LTBI treatment and malnutrition were especially important, with risks of TB more than 25 times higher in those who had any of these risk factors compared to those who did not.

The other variables that were clinically and statistically significant predictors of TB were: use of corticosteroids at the time of the exposure, malignancy, infectivity of the source case (being smear positive), lower SE status, injection drug use and exposure to ≥ 2 TB cases.

The following factors were not significant in the univariate analysis nor when adjusted for other variables in any of the multivariate analyses carried out: previous positive TST, previous tuberculous disease in chest x-rays, hospital personnel, recent arrival from a country with high TB prevalence, and being a resident or employee of high-risk settings.
TST SIZE AND RISK OF DEVELOPING TB

In order to determine the best TST cut-off size for predicting the development of TB in different groups of clinical interest, the hazard ratios of TB were estimated for the TST size categories currently used in practice (0-4 mm, 5-9 mm, 10-14 mm and ≥ 15 mm), adjusted for other prognostic factors (see below) using the Cox proportional hazards model. The risks of developing TB for the following subgroups of interest will be presented:

• Type of contact: household contacts, close non-household contacts and casual contacts.
• Immunosuppressed / non-immunosuppressed.
• Prior BCG vaccination / no prior BCG vaccination.
• Contacts 0-10 yrs old / contacts >10 yrs old.
• The 3 "ethnic" groups: Canadian-born, Aboriginal people and foreign-born.

The TB rates and hazard ratios were estimated for the contacts that did not receive LTBI treatment, who may better represent those in whom the decision to prescribe LTBI treatment has to be made in clinical practice. The subgroup analysis for the contacts that received LTBI treatment was not feasible due to the small number of TB cases (with TST results) in this group (21 TB cases).

The results of the analysis will be presented separately for two TST results:

a) For the TST size from the 1st TST applied after contact with the TB case; and
b) For the maximum size of the 1st or 2nd TST (1st 2nd TST) applied after exposure.

Given the number of TB cases observed (180 among those with TST results available) for the contacts that did not receive LTBI treatment during the observation period, "overfitting" would likely occur if we included the 23 parameters that were found to be significant factors for TB in the final models presented in the previous section; i.e. applying, for instance, one Cox model for each of the types of contact (household, non-household and casual) would be equivalent to having 69 parameters, which would very likely result in biased coefficients and/or large standard errors due to "overfitting" [67].

To prevent this problem, only the variables that were statistically significant and those that are useful or easily assessed in the majority of the contacts and in a variety of scenarios (industrialized and non-industrialized countries) were included in a "reduced" model. One strategy also used to obtain this "reduced" model was to dichotomize the variables when pertinent, i.e. variables that, when dichotomized, did not change in a (clinically) significant way the coefficients or the statistical significance of other variables in the model.

In order to achieve the above, the following procedures were performed:

TST was categorized in the 4 groups previously mentioned, which cover the categories used in the Canadian and American guidelines [18, 26]. To do so, the categories 0 mm and 1-4 mm were collapsed into a single category (0-4 mm); after collapsing, there was no significant difference (>20% change) in the coefficients of the same or the other variables compared to the final model described before, for either the 1st TST or the maximum of the 1st and 2nd TST.

Similarly, when immunosuppression was classified as "yes" or "no", the coefficients of the variables in the final model did not change significantly (>20%), with the exception of the coefficient of the TST size category 5-9 mm, which increased approximately 30%; however, the conclusions for the other variables in the model and for the model itself remained unchanged.

When BCG status was dichotomized, the coefficients of the other variables in the model did not change significantly (>20%) and the conclusions regarding the variables in the model and the model itself remained unchanged.
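The overfitting concern raised above can be made concrete by counting outcome events per estimated parameter. The sketch below uses the figures quoted in the text (180 TB cases, 23 parameters, 69 parameters). The threshold of about 10 events per parameter is a commonly quoted rule of thumb and is an assumption here; the thesis does not state which criterion it applied. The names `events_per_parameter`, `max_parameters` and `RULE_OF_THUMB` are hypothetical.

```python
# Assumed rule of thumb (not taken from the thesis): aim for roughly
# 10 or more outcome events per estimated parameter in a Cox model.
RULE_OF_THUMB = 10


def events_per_parameter(events, parameters):
    """Number of outcome events available for each estimated parameter."""
    return events / parameters


def max_parameters(events, threshold=RULE_OF_THUMB):
    """Largest parameter count that still meets the assumed threshold."""
    return events // threshold


# Figures quoted in the text: 180 TB cases among untreated contacts with TST results
scenarios = {
    "final model, one fit": (180, 23),
    "final model, one fit per contact type": (180, 69),
}

for label, (events, params) in scenarios.items():
    epp = events_per_parameter(events, params)
    verdict = "meets" if epp >= RULE_OF_THUMB else "falls below"
    print(f"{label}: {epp:.1f} events per parameter ({verdict} the assumed threshold)")

print("parameters supportable with 180 events:", max_parameters(180))
```

Under this assumed threshold, both scenarios fall short, which is in line with the decision to reduce the number of parameters before fitting subgroup models.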
When the "reduced" model, with the variables as specified before, was compared with the original final model presented in the previous section, some variables changed > 20% in their coefficients; however, the conclusions regarding their clinical and statistical significance remained unchanged. The variables included in the "reduced" model were:

• Age (in years).
• Gender (male/female).
• TST size (0-4 mm, 5-9 mm, 10-14 mm and ≥ 15 mm).
• Closeness of contact (household, close non-household and casual).
• BCG status (positive vs. negative); and,
• Immunosuppressed at the time of TB exposure ("yes" vs. "no": "yes" if malnutrition, alcoholism, use of corticosteroids, malignancy or DM were present; "no" otherwise).

From the final model obtained in the previous section, several variables were excluded in order to have the number of parameters needed to prevent "overfitting". Being exposed to ≥ 2 TB cases was excluded because of the small number of contacts expected in clinical practice to be exposed to ≥ 2 TB cases without having received LTBI treatment; the smear results in the TB case were excluded from the "reduced" model because of their variable sensitivity in different settings, which ranges from 50 to 80% [17]; ethnicity, SE status and injection drug use were excluded on the basis of having little or no applicability in other (non-Canadian or even non-BC) settings (ethnicity and geographical SE status), or being present in a small proportion of subjects (injection drug use).

The "reduced" model above had 8 parameters (when doing subgroup analyses), an important reduction compared to the 23 parameters of the original model; applying a separate Cox model to up to 3 different groups would now make "overfitting" unlikely. In addition, if any of the Cox models from the subgroup analyses presented numerical instability created by "monotone likelihood" [68], i.e. implausibly large standard errors due to the occurrence of a zero frequency of the outcome (TB) during the observation period in one of the categories of the variables in the model, that variable was dropped from the model in order to obtain reliable estimates of the coefficients of the TST size categories. Another common explanation of large standard errors is multicollinearity, which was not observed even in the original model with all the variables included.

The analyses presented ahead used the "reduced" model; however, the analyses were also performed using a "full" model (which included ethnicity, SE status and injection drug use). The "full" model frequently contained variables with large standard errors, which did not occur when the "reduced" model was used (except when it was applied to some of the "ethnic" subgroups, as mentioned in the corresponding section). When compared with the "full" model, the "reduced" model almost invariably resulted in smaller hazard ratios for the corresponding TST size categories; however, the conclusions remained the same, i.e. the direction and the statistical significance of the hazard ratio for the corresponding TST size category did not change.

ASSESSMENT USING THE RESULTS OF THE 1st TST

The results of the Cox model applied to all the contacts that did not receive LTBI treatment are presented in Table 45.

Table 45.
Cox regression model for contacts with no LTBI Rx (1st TST). Robust variance estimation

| Variable | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| Age* | -.070 | .008 | <.0001 | .932 | .918 | .947 |
| Gender | .251 | .149 | .093 | 1.285 | .959 | 1.720 |
| Immunosuppression present* | 1.309 | .284 | <.0001 | 3.703 | 2.124 | 6.456 |
| BCG positive* | -1.541 | .227 | <.0001 | .214 | .137 | .334 |
| Casual contact (reference category) | | | | | | |
| Household contact* | 2.005 | .212 | <.0001 | 7.424 | 4.897 | 11.256 |
| Non-household contact* | 1.092 | .219 | <.0001 | 2.982 | 1.939 | 4.584 |
| TST size 0-4 mm (reference category) | | | | | | |
| TST size 5-9 mm* | 1.909 | .442 | .0002 | 6.747 | 2.838 | 16.038 |
| TST size 10-14 mm* | 3.337 | .287 | <.0001 | 28.130 | 16.035 | 49.350 |
| TST size ≥ 15 mm* | 4.163 | .243 | <.0001 | 64.246 | 39.871 | 103.522 |

B = Coefficient. SE = Standard error. *Statistically significant (p<0.05). Model LR test = 684; df: 9; p-value = 0.

All the variables included in the model, with the exception of gender, were statistically significant; however, gender was included to adjust for any potential effect it may have.

There were 33,146 contacts that were found in all databases (DTB, MSP, HSP, etc.), and for whom the information regarding several of the prognostic variables could be cross-validated; of these, 28,640 (86.4%) had a TST performed with 5 TU of PPD-S after becoming contacts of the TB case, and 201 (0.7%) of them developed TB during the 12-year follow-up period. Among the contacts mentioned above, there were 30,546 contacts that did not receive LTBI treatment, and 26,542 (86.9%) of them had a TST after becoming contacts of the TB case; 180 (0.68%) of them developed TB during the 12-year follow-up period.

The results of the analyses performed will be presented ahead in 3 steps:

a) The TB rates for each of the TST size categories;
b) The estimate of the hazard ratio obtained for the TST size categories from a univariate analysis using the Cox model; the TST size categories 5-9 mm, 10-14 mm and ≥ 15 mm were compared with the 0-4 mm category.
c) The estimation of the adjusted hazard ratio for the TST size categories, i.e. when all the other prognostic factors were included in the Cox model: age, gender, immunosuppression, type of contact and previous BCG vaccination status.

All contacts.

The results presented below correspond to all the contacts that did not receive LTBI treatment. The number of TB cases and the TB rates for the different TST size categories are presented in Table 46. The TB rates are progressively higher for larger TST sizes, with a more important increase starting in the 10-14 mm TST size category.

Table 46. Number of TB cases and the TB rates for the different TST sizes (1st TST). Contacts with no LTBI treatment received

| TST size | Total at risk | TB cases* | TB rate/100,000 |
|---|---|---|---|
| 0-4 mm | 21,474 | 49 | 228 |
| 5-9 mm | 1,016 | 6 | 591 |
| 10-14 mm | 1,875 | 34 | 1,813 |
| ≥ 15 mm | 2,177 | 91 | 4,180 |
| Total | 26,542 | 180 | 678 |

The results of the univariate analysis of the hazard ratio for the TST size are presented in Table 47. The risk of TB is significantly higher in all the TST size categories compared to the 0-4 mm category.

Table 47. TB risk according to the TST size from the 1st TST (univariate analysis). Contacts with no LTBI treatment

| TST size | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| 0-4 mm (reference category) | | | | | | |
| 5-9 mm* | .960 | .433 | .026 | 2.611 | 1.118 | 6.095 |
| 10-14 mm* | 2.091 | .223 | <.001 | 8.089 | 5.223 | 12.528 |
| ≥ 15 mm* | 2.943 | .177 | <.001 | 18.979 | 13.410 | 26.860 |

B = Coefficient. SE = Standard error. *Statistically significant (p<0.05) compared to the 0-4 mm category.

The results of the multivariate analysis are presented in Table 48. After adjusting for all the other prognostic factors, the risk of TB increased markedly with larger TST sizes and, for every TST size category, was much higher than that observed in the univariate analysis. The risk of TB is significantly higher in all categories compared to the 0-4 mm category.
Table 48. TB risk according to the TST size from the 1st TST (multivariate analysis). Contacts with no LTBI treatment

| TST size | B | SE | P-value | Hazard ratio+ | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| 0-4 mm (reference category) | | | | | | |
| 5-9 mm* | 1.903 | .438 | <.001 | 6.703 | 2.842 | 15.813 |
| 10-14 mm* | 3.334 | .242 | <.001 | 28.053 | 17.442 | 45.120 |
| ≥ 15 mm* | 4.159 | .197 | <.001 | 64.022 | 43.492 | 94.242 |

B = Coefficient. SE = Standard error. +Hazard ratio adjusted for age, gender, immunosuppression, type of contact and previous BCG vaccination. *Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 51.

Figure 51. TST size and risk of TB. Contacts with no LTBI treatment (1st TST).
*Adjusted for age, gender, immunosuppression, type of contact and previous BCG vaccination.

Type of contact

The results presented below correspond to the subgroup analysis for contacts that did not receive LTBI treatment, based on the closeness of the contacts to the TB case.

Household contacts

The TB rate for all the TST size categories was high and increased progressively with larger TST sizes, as can be seen in Table 49. Compared to the TB rates in all the contacts that did not receive LTBI treatment, the household contacts had very high TB rates in all TST size categories, including the TST size category 0-4 mm.

Table 49. Number of TB cases and the TB rates for the different TST sizes (1st TST). Household contacts

| TST size | Total at risk | TB cases* | TB rate/100,000 |
|---|---|---|---|
| 0-4 mm | 2,072 | 21 | 1,014 |
| 5-9 mm | 185 | 4 | 2,162 |
| 10-14 mm | 335 | 15 | 4,478 |
| ≥ 15 mm | 405 | 45 | 11,111 |
| Total | 2,997 | 85 | 2,836 |

*Number of TB cases developed during the 12-year follow-up period. … p-value < 0.001.
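The TB rates in Tables 46 and 49 are simple proportions scaled to 100,000 contacts. As a minimal sketch, the code below recomputes them from the tabulated counts and also derives crude rate ratios against the 0-4 mm category. The rate ratios are an illustration added here and are not reported in the thesis; they are unadjusted and are not the Cox hazard ratios. The function names are hypothetical.

```python
def rate_per_100k(cases, at_risk):
    """TB cases per 100,000 contacts at risk, rounded to a whole number."""
    return round(cases / at_risk * 100_000)


def crude_rate_ratio(cases, at_risk, ref_cases, ref_at_risk):
    """Unadjusted ratio of two TB rates (illustrative, not a hazard ratio)."""
    return (cases / at_risk) / (ref_cases / ref_at_risk)


# (total at risk, TB cases) copied from Tables 46 and 49
table_46_all_contacts = {
    "0-4 mm": (21_474, 49),
    "5-9 mm": (1_016, 6),
    "10-14 mm": (1_875, 34),
    ">=15 mm": (2_177, 91),
}
table_49_household = {
    "0-4 mm": (2_072, 21),
    "5-9 mm": (185, 4),
    "10-14 mm": (335, 15),
    ">=15 mm": (405, 45),
}

for title, table in [("All contacts", table_46_all_contacts),
                     ("Household contacts", table_49_household)]:
    ref_at_risk, ref_cases = table["0-4 mm"]
    print(title)
    for size, (at_risk, cases) in table.items():
        rate = rate_per_100k(cases, at_risk)
        ratio = crude_rate_ratio(cases, at_risk, ref_cases, ref_at_risk)
        print(f"  {size}: {rate} per 100,000 (crude ratio vs 0-4 mm: {ratio:.1f})")
```

Because follow-up time and the other prognostic factors are ignored, the crude ratios are only a rough cross-check against the univariate hazard ratios.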
In univariate analysis the risk of developing TB among household contacts increased progressively after a TST size of 10 mm, as shown in Table 50; the TST size category of 5-9 mm was not significantly different from the TST size category of 0-4 mm.

Table 50. TB risk according to the TST size of the 1st TST (univariate analysis). Household contacts

| TST size | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| 0-4 mm (reference category) | | | | | | |
| 5-9 mm | .761 | .546 | .163 | 2.140 | .735 | 6.234 |
| 10-14 mm* | 1.512 | .338 | <.001 | 4.534 | 2.337 | 8.795 |
| ≥ 15 mm* | 2.464 | .264 | <.001 | 11.756 | 7.003 | 19.736 |

B = Coefficient. SE = Standard error. *Statistically significant (p<0.05) compared to the 0-4 mm category.

When adjusting for the other prognostic factors in the multivariate analysis, the risk of TB with larger TST sizes increased markedly, as shown in Table 51; all the TST size categories are now highly significantly associated with an increased risk of developing TB when compared to the TST size category of 0-4 mm.

Table 51. TB risk according to the TST size of the 1st TST (multivariate analysis). Household contacts

| TST size | B | SE | P-value | Hazard ratio+ | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| 0-4 mm (reference category) | | | | | | |
| 5-9 mm* | 2.299 | .560 | <.001 | 9.963 | 3.322 | 29.879 |
| 10-14 mm* | 3.232 | .370 | <.001 | 25.340 | 12.280 | 52.289 |
| ≥ 15 mm* | 4.059 | .294 | <.001 | 57.938 | 32.560 | 103.095 |

B = Coefficient. SE = Standard error. +Hazard ratio adjusted for age, gender, immunosuppression and previous BCG vaccination. *Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 52.

Figure 52. TST size and risk of TB. Household contacts (1st TST). (Curves by TST size, 1st TST; x-axis: Time (years).)
*Adjusted for age, gender, immunosuppression and previous BCG vaccination.
Non-household contacts

The TB rate for all the TST size categories increased progressively with larger TST sizes, as can be seen in Table 52. Compared to the TB rates in the household contacts, the non-household contacts had much lower TB rates in general and in every TST size category.

Table 52. Number of TB cases and the TB rates for the different TST sizes (1st TST). Non-household contacts

| TST size | Total at risk | TB cases* | TB rate/100,000 |
|---|---|---|---|
| 0-4 mm | 8,545 | 19 | 222 |
| 5-9 mm | 338 | 1 | 296 |
| 10-14 mm | 604 | 11 | 1,821 |
| ≥ 15 mm | 722 | 34 | 4,709 |
| Total | 10,209 | 65 | 634 |

*Number of TB cases developed during the 12-year follow-up period. … 226.5; df: 3; p-value < 0.001.

In univariate analysis the risk of developing TB among non-household contacts increased progressively after a TST size of 10 mm, as shown in Table 53; the TST size category of 5-9 mm was not significantly different from the TST size category of 0-4 mm.

Table 53. TB risk according to the TST size of the 1st TST (univariate analysis). Non-household contacts

| TST size | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| 0-4 mm (reference category) | | | | | | |
| 5-9 mm | .309 | 1.026 | .763 | 1.363 | .182 | 10.179 |
| 10-14 mm* | 2.120 | .379 | <.001 | 8.329 | 3.964 | 17.502 |
| ≥ 15 mm* | 3.097 | .286 | <.001 | 22.132 | 12.623 | 38.804 |

B = Coefficient. SE = Standard error. *Statistically significant (p<0.05) compared to the 0-4 mm category.

When adjusting for the other prognostic factors in the multivariate analysis, the risk of TB with TST sizes of 10 mm or larger increased significantly compared to the TST size category of 0-4 mm, as shown in Table 54. Even after adjusting for all the other prognostic factors, the TST size category of 5-9 mm did not reach statistical significance.

Table 54. TB risk according to the TST size of the 1st TST (multivariate analysis).
Non-household contacts

| TST size | B | SE | P-value | Hazard ratio+ | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| 0-4 mm (reference category) | | | | | | |
| 5-9 mm | 1.252 | 1.030 | .224 | 3.497 | .464 | 26.346 |
| 10-14 mm* | 3.379 | .406 | <.001 | 29.348 | 13.233 | 65.085 |
| ≥ 15 mm* | 4.398 | .316 | <.001 | 81.248 | 43.721 | 150.984 |

B = Coefficient. SE = Standard error. +Hazard ratio adjusted for age, gender, immunosuppression and previous BCG vaccination. *Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 53.

Figure 53. TST size and risk of TB. Non-household contacts (1st TST). (x-axis: Time (years).)
*Adjusted for age, gender, immunosuppression and previous BCG vaccination.

Casual contacts

The TB rate for all the TST size categories increased progressively with larger TST sizes, as can be seen in Table 55. Compared to the TB rates in the household contacts and in the non-household contacts, the casual contacts had lower TB rates in general and in every TST size category.

Table 55. Number of TB cases and the TB rates for the different TST sizes (1st TST). Casual contacts

| TST size | Total at risk | TB cases* | TB rate/100,000 |
|---|---|---|---|
| 0-4 mm | 10,820 | 9 | 83 |
| 5-9 mm | 490 | 1 | 204 |
| 10-14 mm | 930 | 8 | 860 |
| ≥ 15 mm | 1,045 | 12 | 1,148 |
| Total | 13,285 | 30 | 226 |

*Number of TB cases developed during the 12-year follow-up period.

In univariate analysis the risk of developing TB among casual contacts increased progressively after a TST size of 10 mm, as shown in Table 56; the TST size category of 5-9 mm was not significantly different from the TST size category of 0-4 mm.

Table 56. TB risk according to the TST size of the 1st TST (univariate analysis).
Casual contacts

| TST size | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| 0-4 mm (reference category) | | | | | | |
| 5-9 mm | .900 | 1.054 | .393 | 2.460 | .312 | 19.417 |
| 10-14 mm* | 2.360 | .486 | <.001 | 10.593 | 4.087 | 27.457 |
| ≥ 15 mm* | 2.648 | .441 | <.001 | 14.132 | 5.954 | 33.542 |

B = Coefficient. SE = Standard error. *Statistically significant (p<0.05) compared to the 0-4 mm category.

When adjusting for the other prognostic factors in the multivariate analysis, the risk of TB with TST sizes of 10 mm or larger increased markedly compared to the TST size category of 0-4 mm, as shown in Table 57. Even after adjusting for all the other prognostic factors, the TST size category of 5-9 mm did not reach statistical significance.

Table 57. TB risk according to the TST size of the 1st TST (multivariate analysis). Casual contacts

| TST size | B | SE | P-value | Hazard ratio+ | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| 0-4 mm (reference category) | | | | | | |
| 5-9 mm | 1.486 | 1.063 | .162 | 4.420 | .551 | 35.463 |
| 10-14 mm* | 3.179 | .524 | <.001 | 24.023 | 8.606 | 67.064 |
| ≥ 15 mm* | 3.580 | .500 | <.001 | 35.889 | 13.475 | 95.587 |

B = Coefficient. SE = Standard error. +Hazard ratio adjusted for age, gender, immunosuppression and previous BCG vaccination. *Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 54.

Figure 54. TST size and risk of TB. Casual contacts (1st TST).
*Adjusted for age, gender, immunosuppression and previous BCG vaccination.

Immunosuppression status

The results presented below correspond to the subgroup analysis for contacts that did not receive LTBI treatment, based on whether the contacts were immunosuppressed.

Non-immunosuppressed contacts

The TB rate for all the TST size categories was high and increased progressively with larger TST sizes, as can be seen in Table 58.

Table 58.
Number of TB cases and the TB rates for the different TST sizes (1st TST). Non-immunosuppressed contacts

| TST size | Total at risk | TB cases* | TB rate/100,000 |
|---|---|---|---|
| 0-4 mm | 20,599 | 43 | 209 |
| 5-9 mm | 966 | 5 | 518 |
| 10-14 mm | 1,767 | 32 | 1,811 |
| ≥ 15 mm | 2,014 | 83 | 4,121 |
| Total | 25,346 | 163 | 643 |

*Number of TB cases developed during the 12-year follow-up period. … 480.1; df: 3; p-value < 0.001.

In univariate analysis the risk of developing TB among non-immunosuppressed contacts increased significantly after a TST size of 10 mm, as shown in Table 59, although the hazard ratio for the TST size category of 5-9 mm was close to reaching statistical significance.

Table 59. TB risk according to the TST size of the 1st TST (univariate analysis). Non-immunosuppressed contacts

| TST size | B | SE | P-value | Hazard ratio | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| 0-4 mm (reference category) | | | | | | |
| 5-9 mm | .916 | .473 | .052 | 2.500 | .990 | 6.313 |
| 10-14 mm* | 2.178 | .233 | <.001 | 8.826 | 5.585 | 13.948 |
| ≥ 15 mm* | 3.018 | .188 | <.001 | 20.441 | 14.143 | 29.543 |

B = Coefficient. SE = Standard error. *Statistically significant (p<0.05) compared to the 0-4 mm category.

When adjusting for the other prognostic factors in the multivariate analysis, the risk of TB increased markedly with larger TST sizes, as shown in Table 60; after adjusting for the other prognostic variables, all the TST size categories were highly significantly associated with an increased risk of developing TB when compared to the TST size category of 0-4 mm.

Table 60. TB risk according to the TST size of the 1st TST (multivariate analysis). Non-immunosuppressed contacts

| TST size | B | SE | P-value | Hazard ratio+ | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| 0-4 mm (reference category) | | | | | | |
| 5-9 mm* | 1.936 | .478 | <.001 | 6.934 | 2.718 | 17.689 |
| 10-14 mm* | 3.476 | .253 | <.001 | 32.334 | 19.709 | 53.046 |
| ≥ 15 mm* | 4.306 | .207 | <.001 | 74.107 | 49.422 | 111.121 |

B = Coefficient. SE = Standard error. +Hazard ratio adjusted for age, gender, type of contact and previous BCG vaccination.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 55.

Figure 55. TST size and risk of TB. Non-immunosuppressed contacts (1st TST).
*Adjusted for age, gender, type of contact and previous BCG vaccination.

Immunosuppressed contacts

The TB rate for all the TST size categories was high and increased progressively with larger TST sizes, as can be seen in Table 61. Compared to the non-immunosuppressed contacts, the immunosuppressed contacts had higher TB rates in total and for every TST size category. The TB rates for the TST size categories 0-4 mm and 5-9 mm were approximately 3 to 4 times higher in the immunosuppressed contacts than in the non-immunosuppressed contacts. However, the TB rate in the immunosuppressed contacts was statistically significantly higher than in the non-immunosuppressed contacts only in the TST size category 0-4 mm (Fisher's exact test p-value = 0.014); the Fisher's exact test for the 5-9 mm TST size category was not statistically significant, owing to the small number of immunosuppressed contacts and TB cases in this category (p-value = 0.262).

Table 61. Number of TB cases and the TB rates for the different TST sizes (1st TST). Immunosuppressed contacts

TST size    Total at risk   TB cases*   TB rate/100,000
0-4 mm      875             6           686
5-9 mm      50              1           2,000
10-14 mm    108             2           1,852
≥ 15 mm     163             8           4,908
Total       1,196           17          1,421

*Number of TB cases developed during the 12-year follow-up period.

In univariate analysis the risk of developing TB among these contacts increased progressively from a TST size of 10 mm, as shown in Table 62; the TST size category of 5-9 mm was not significantly different from the TST size category of 0-4 mm.

Table 62. TB risk according to the TST size of the 1st TST (univariate analysis).
Immunosuppressed contacts

TST size                       B       SE      P-value   Hazard ratio   95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm                         1.065   1.080   .324      2.900          .349           24.089
10-14 mm                       .982    .817    .229      2.670          .539           13.229
≥ 15 mm*                       1.969   .540    .000      7.163          2.485          20.644

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

When adjusting for the other prognostic factors in the multivariate analysis, the risk of TB increased with larger TST sizes, as shown in Table 63. However, only the increased risk of TB in the ≥ 15 mm TST size category was statistically significant compared to the 0-4 mm category.

Table 63. TB risk according to the TST size of the 1st TST (multivariate analysis). Immunosuppressed contacts

TST size                       B       SE      P-value   Hazard ratio+  95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm                         .970    1.102   .379      2.638          .304           22.878
10-14 mm                       1.542   .885    .081      4.673          .825           26.480
≥ 15 mm*                       2.404   .615    < .001    11.067         3.314          36.958

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for age, gender, type of contact and previous BCG vaccination.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 56.

Figure 56. TST size and risk of TB. Immunosuppressed contacts (1st TST).
*Adjusted for age, gender, type of contact and BCG vaccination.

The 24 contacts who were immunosuppressed and developed TB were distributed by type of contact as follows: 8 (33%) were household contacts; 10 (42%) were close non-household contacts; and 6 (25%) were casual contacts.
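The hazard ratios and confidence limits reported in tables such as Table 63 can be recovered from the coefficients B and their standard errors. The following is a minimal Python sketch, assuming the usual Cox-model relationships HR = exp(B) and 95% CI = exp(B ± 1.96 × SE). The function name `hazard_ratio_ci` is illustrative only and is not part of the original analysis; small differences from the tabulated values are expected because B and SE are reported to three decimals.

```python
import math


def hazard_ratio_ci(b, se, z=1.96):
    """Return (hazard ratio, lower limit, upper limit) for a Cox coefficient.

    Assumes a Wald-type interval built on the log-hazard scale; this is an
    assumption, not stated in the source, but it agrees with the tables.
    """
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


# Table 63, TST size >= 15 mm, immunosuppressed contacts: B = 2.404, SE = .615
hr, lower, upper = hazard_ratio_ci(2.404, 0.615)
print(f"HR = {hr:.3f}, 95% CI = {lower:.3f} to {upper:.3f}")
```

The wide intervals for the 5-9 mm and 10-14 mm categories follow directly from their large standard errors, which in turn reflect the small number of TB cases in those categories.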
BCG status

The results presented below correspond to the subgroup analysis for contacts that did not receive LTBI treatment, based on whether the contacts had received BCG. Contacts with uncertain BCG status are included with the contacts who did not receive BCG.

Contacts with no previous BCG

The TB rate for all the TST size categories increased progressively with larger TST sizes, as can be seen in Table 64.

Table 64. Number of TB cases and the TB rates for the different TST sizes (1st TST). No previous BCG

TST size    Total at risk   TB cases*   TB rate/100,000
0-4 mm      19,563          40          204
5-9 mm      508             4           787
10-14 mm    869             32          3,682
≥ 15 mm     1,143           73          6,387
Total       22,083          149         675

*Number of TB cases developed during the 12-year follow-up period. χ² = 738.4; df: 3; p-value < 0.001.

In univariate analysis the risk of developing TB among these contacts increased progressively from a TST size of 5 mm, as shown in Table 65; all the TST size categories were significantly different from the TST size category of 0-4 mm.

Table 65. TB risk according to the TST size of the 1st TST (univariate analysis). No previous BCG

TST size                       B       SE      P-value   Hazard ratio   95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm*                        1.356   .524    .010      3.882          1.389          10.851
10-14 mm*                      2.920   .237    .000      18.541         11.647         29.513
≥ 15 mm*                       3.493   .197    .000      32.889         22.365         48.365

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

When adjusting for the other prognostic factors in the multivariate analysis, the risk of TB increased substantially with larger TST sizes, as shown in Table 66; every TST size category was highly significantly associated with an increased risk of developing TB when compared to the TST size category of 0-4 mm.

Table 66. TB risk according to the TST size of the 1st TST (multivariate analysis).
No previous BCG

TST size                       B       SE      P-value   Hazard ratio+  95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm*                        1.921   .528    < .001    6.826          2.427          19.197
10-14 mm*                      3.841   .251    < .001    46.585         28.462         76.250
≥ 15 mm*                       4.335   .214    < .001    76.352         50.204         116.117

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for age, gender, type of contact and immunosuppression.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 57.

Figure 57. TST size and risk of TB. Contacts with no previous BCG (1st TST).
*Adjusted for age, gender, type of contact and immunosuppression.

Contacts with previous BCG

The TB rates for the TST size categories 5-9 mm and 10-14 mm in contacts with BCG were lower than the TB rates of the 0-4 mm and ≥ 15 mm categories, as shown in Table 67. It is also interesting to note that, compared with contacts with no or unknown previous BCG, the TB rate in contacts with previous BCG was much higher for the TST size category of 0-4 mm, while the TB rates in all the other categories were significantly lower than the corresponding ones for the contacts with no previous BCG. However, the overall TB rate was very similar in BCG-positive and BCG-negative contacts.

Table 67. Number of TB cases and the TB rates for the different TST sizes (1st TST). Previous BCG

TST size    Total at risk   TB cases*   TB rate/100,000
0-4 mm      1,911           9           471
5-9 mm      508             2           394
10-14 mm    1,006           2           199
≥ 15 mm     1,034           18          1,741
Total       4,459           31          695

*Number of TB cases developed during the 12-year follow-up period. χ² = 22.0; df: 3; p-value = 0.002.

In univariate analysis the risk of developing TB among these contacts, in line with the observed TB rates, was smaller for contacts in the TST categories 5-9 mm and 10-14 mm than in the 0-4 mm category; however, these risk reductions were not statistically significant.
The risk of TB for contacts with TST sizes ≥ 15 mm was higher than in all the other categories, and statistically significant when compared to the 0-4 mm category, as shown in Table 68.

Table 68. TB risk according to the TST size of the 1st TST (univariate analysis). Previous BCG

TST size                       B       SE      P-value   Hazard ratio   95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm                         -.172   .782    .826      .842           .182           3.899
10-14 mm                       -.856   .782    .273      .425           .092           1.966
≥ 15 mm*                       1.323   .408    .001      3.757          1.688          8.362

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

When adjusting for the other prognostic factors in the multivariate analysis, the risk of TB for the TST categories 5-9 mm and 10-14 mm was no different from the risk in the 0-4 mm category. The TST size category ≥ 15 mm had a higher risk of TB than all the other categories, and this was statistically significant compared to the 0-4 mm category, as shown in Table 69.

Table 69. TB risk according to the TST size of the 1st TST (multivariate analysis). Previous BCG

TST size                       B       SE      P-value   Hazard ratio+  95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm                         .354    .790    .654      1.425          .303           6.706
10-14 mm                       -.355   .789    .653      .701           .149           3.293
≥ 15 mm*                       2.292   .446    < .001    9.892          4.124          23.728

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for age, gender, type of contact and immunosuppression.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 58.

Figure 58. TST size and risk of TB. Contacts with previous BCG (1st TST).
*Adjusted for age, gender, type of contact and immunosuppression.

Age of contacts.
The results presented below correspond to the subgroup analysis for contacts that did not receive LTBI treatment, based on the age groups of the contacts. The figure shows the TB rate for the different age groups in these contacts.

[Figure: TB rate by age group. Age categories: 0 yrs, 1-5 yrs, 6-10 yrs, 11-15 yrs, 16-20 yrs, 21-30 yrs, 31-40 yrs, 41-50 yrs, 51-60 yrs, >60 yrs.]

Based on the observed risk for the different age groups, the contacts were categorized in 2 groups for the analysis: a) contacts 0-10 years old, and b) contacts >10 years old.

Contacts 0-10 yrs old

The TB rate for all the TST size categories was very high and increased progressively with larger TST sizes, as can be seen in Table 70. An important increase in the TB rate was already observed in the TST size category 5-9 mm.

Table 70. Number of TB cases and the TB rates for the different TST sizes (1st TST). Contacts 0-10 yrs old

TST size    Total at risk   TB cases*   TB rate/100,000
0-4 mm      2,110           17          806
5-9 mm      18              1           5,556
10-14 mm    33              14          42,424
≥ 15 mm     51              34          66,667
Total       2,212           66          2,984

*Number of TB cases developed during the 12-year follow-up period. χ² = 926.8; df: 3; p-value < 0.001.

In univariate analysis the risk of developing TB among these contacts increased progressively from a TST size of 5 mm, as shown in Table 71; however, the hazard ratio for the 5-9 mm category was not statistically significant, probably because of the small number of contacts and TB cases in that category. The hazard ratios for all the larger TST size categories were statistically significant.

Table 71. TB risk according to the TST size from the 1st TST (univariate analysis). Contacts 0-10 yrs old

TST size                       B       SE      P-value   Hazard ratio   95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm                         1.981   1.029   .054      7.251          0.965          54.483
10-14 mm*                      4.275   .362    < .001    71.862         35.360         146.045
≥ 15 mm*                       5.019   .301    < .001    151.241        83.914         272.587

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

When adjusting for the other prognostic factors in the multivariate analysis, the risk of TB increased substantially with larger TST sizes, as shown in Table 72. Nonetheless, the hazard ratio for the 5-9 mm category was not statistically significant. The risk of TB in all the larger TST size categories was statistically significant. After adjusting for all the other factors in the multivariate analysis, the risk of TB for all the TST size categories (except the 5-9 mm category) increased when compared to the risks estimated in the univariate analysis.

Table 72. TB risk according to the TST size from the 1st TST (multivariate analysis). Contacts 0-10 yrs old

TST size                       B       SE      P-value   Hazard ratio+  95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm                         1.791   1.055   .090      5.998          0.758          47.430
10-14 mm*                      4.514   .388    < .001    91.278         42.651         195.346
≥ 15 mm*                       5.174   .312    < .001    176.534        95.739         325.512

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for gender, type of contact, immunosuppression and previous BCG vaccination.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 59.

Figure 59. TST size and risk of TB. Contacts 0-10 years old (1st TST). Legend, TST size (1st TST): ≥ 15 mm, 10-14 mm, 5-9 mm, 0-4 mm.
*Adjusted for gender, type of contact, immunosuppression and previous BCG vaccination.

Note that all the TB cases in contacts 0-10 yrs old occurred within 3 years (more specifically, within 2.9 years) after exposure, and almost all within the first year.

The 66 children 0-10 yrs old who developed TB were distributed by type of contact as follows: 35 (53%) were household contacts; 26 (39.4%) were close non-household contacts; and only 5 (7.6%) were casual contacts.
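The TB rates per 100,000 in the rate tables are simple proportions of TB cases among the contacts at risk. As a minimal Python sketch, the counts from Table 70 (contacts 0-10 yrs old, 1st TST) can be turned into rates as follows; the helper name `tb_rate_per_100k` and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
def tb_rate_per_100k(cases, at_risk):
    """Crude TB rate per 100,000 contacts over the whole follow-up period."""
    return 100_000 * cases / at_risk


# Counts taken from Table 70: (total at risk, TB cases) per TST size category.
table_70 = {
    "0-4 mm": (2110, 17),
    "5-9 mm": (18, 1),
    "10-14 mm": (33, 14),
    ">=15 mm": (51, 34),
}

rates = {size: tb_rate_per_100k(cases, n) for size, (n, cases) in table_70.items()}

# Overall rate uses the pooled counts, not the average of the category rates.
total_at_risk = sum(n for n, _ in table_70.values())
total_cases = sum(cases for _, cases in table_70.values())
rates["Total"] = tb_rate_per_100k(total_cases, total_at_risk)

for size, rate in rates.items():
    print(f"{size:>9}: {rate:,.0f} per 100,000")
```

Because only 18 children fall in the 5-9 mm category, a single TB case moves that rate by more than 5,000 per 100,000, which is consistent with the wide confidence interval reported for that category.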
Contacts >10 yrs old

The TB rate for all the TST size categories was much lower than that observed in children 0-10 yrs old, as shown in Table 73. In contacts >10 yrs of age, the increase in the TB rate was observed from the TST size category 5-9 mm.

Table 73. Number of TB cases and the TB rates for the different TST sizes (1st TST). Contacts >10 yrs old

TST size    Total at risk   TB cases*   TB rate/100,000
0-4 mm      19,364          32          165
5-9 mm      998             5           501
10-14 mm    1,842           20          1,086
≥ 15 mm     2,126           57          2,681
Total       24,330          114         469

*Number of TB cases developed during the 12-year follow-up period. p-value < 0.001.

In univariate analysis the risk of developing TB among these contacts increased from a TST size of 5 mm, as shown in Table 74, and was progressively higher with larger TST sizes.

Table 74. TB risk according to the TST size from the 1st TST (univariate analysis). Contacts >10 yrs old

TST size                       B       SE      P-value   Hazard ratio   95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm*                        1.119   .481    .020      3.063          1.194          7.862
10-14 mm*                      1.898   .285    < .001    6.671          3.816          11.663
≥ 15 mm*                       2.816   .221    < .001    16.711         10.838         25.766

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

When adjusting for the other prognostic factors in the multivariate analysis, the risk of TB increased significantly from TST sizes of 5 mm, as shown in Table 75. All the TST size categories had a much lower (adjusted) TB risk than in the contacts 0-10 yrs old.

Table 75. TB risk according to the TST size from the 1st TST (multivariate analysis). Contacts >10 yrs old

TST size                       B       SE      P-value   Hazard ratio+  95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm*                        1.165   .486    .017      3.205          1.237          8.306
10-14 mm*                      1.965   .295    < .001    7.138          4.003          12.727
≥ 15 mm*                       2.796   .231    < .001    16.379         10.406         25.782

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for gender, type of contact, immunosuppression and previous BCG vaccination.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 60.

Figure 60. TST size and risk of TB. Contacts >10 years old (1st TST).
*Adjusted for gender, type of contact, immunosuppression and previous BCG vaccination.

Ethnicity.

The results presented below correspond to the subgroup analysis for contacts that did not receive LTBI treatment, based on the ethnicity of the contacts. The results are presented for the following "ethnic groups": a) Canadian-born contacts; b) Aboriginal contacts; and c) foreign-born contacts.

Canadian-born contacts

The TB rate increased with larger TST sizes across all the TST size categories, as can be seen in Table 76. An important increase in the TB rates was observed in the TST size categories 5-9 mm and 10-14 mm.

Table 76. Number of TB cases and the TB rates for the different TST sizes (1st TST). Canadian-born contacts

TST size    Total at risk   TB cases*   TB rate/100,000
0-4 mm      14,662          23          157
5-9 mm      255             2           784
10-14 mm    366             19          5,191
≥ 15 mm     525             42          8,000
Total       15,808          86          544

*Number of TB cases developed during the 12-year follow-up period. χ² = 726.4; df: 3; p-value < 0.001.

In univariate analysis the risk of developing TB among these contacts increased progressively from a TST size of 5 mm, as shown in Table 77; the hazard ratios for all the TST size categories were statistically significant.

Table 77. TB risk according to the TST size from the 1st TST (univariate analysis).
Canadian-born contacts

TST size                       B       SE      P-value   Hazard ratio   95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm*                        1.618   .737    .028      5.043          1.189          21.391
10-14 mm*                      3.526   .310    < .001    33.986         18.510         62.400
≥ 15 mm*                       3.990   .259    < .001    54.077         32.522         89.918

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

When adjusting for the other prognostic factors, the risk of TB increased substantially with larger TST sizes, as shown in Table 78. The risk of TB for all the TST size categories in the multivariate analysis was considerably higher than the risks estimated in the univariate analysis. In the reduced model applied, BCG was excluded because of its large standard errors. The models with and without BCG reached the same conclusions, i.e. that the hazard ratios for all the TST size categories were clinically and statistically significant.

Table 78. TB risk according to the TST size from the 1st TST (multivariate analysis). Canadian-born contacts

TST size                       B       SE      P-value   Hazard ratio+  95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm*                        2.263   .740    .002      9.615          2.254          41.018
10-14 mm*                      4.531   .324    < .001    92.872         49.207         175.285
≥ 15 mm*                       5.041   .282    < .001    154.592        89.022         268.460

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for age, gender, type of contact and immunosuppression.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 61.

Figure 61. TST size and risk of TB. Canadian-born contacts (1st TST).
*Adjusted for age, gender, type of contact and immunosuppression.

Aboriginal contacts

The TB rate was very high in all TST size categories for Aboriginal contacts and increased progressively with larger TST sizes.
The overall TB rate was much higher than the rate observed in the Canadian-born contacts, especially in the TST size categories 0-4 mm and 5-9 mm, as can be seen in Table 79.

Table 79. Number of TB cases and the TB rates for the different TST sizes (1st TST). Aboriginal contacts

TST size    Total at risk   TB cases*   TB rate/100,000
0-4 mm      1,743           17          975
5-9 mm      70              1           1,429
10-14 mm    127             7           5,512
≥ 15 mm     208             26          12,500
Total       2,148           51          2,374

*Number of TB cases developed during the 12-year follow-up period. p-value < 0.001.

In univariate analysis the risk of developing TB among these contacts increased significantly from a TST size of 10 mm, as shown in Table 80; the risk of TB in the 5-9 mm category did not differ from that in the 0-4 mm category. The hazard ratios for all TST size categories are significantly lower than the corresponding ones for Canadian-born non-Aboriginal contacts, largely because of the already high TB rate observed in the 0-4 mm TST size category in Aboriginal people.

Table 80. TB risk according to the TST size from the 1st TST (univariate analysis). Aboriginal contacts

TST size                       B       SE      P-value   Hazard ratio   95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm                         .389    1.029   .705      1.476          .196           11.092
10-14 mm*                      1.785   .449    < .001    5.962          2.472          14.381
≥ 15 mm*                       2.639   .312    < .001    14.004         7.596          25.816

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

When adjusting for the other prognostic factors in the multivariate analysis, the risk of TB increased substantially with larger TST sizes, as shown in Table 81. However, the TST size category of 5-9 mm still did not reach statistical significance after adjusting for all the other factors.

Table 81. TB risk according to the TST size from the 1st TST (multivariate analysis).
Aboriginal contacts

TST size                       B       SE      P-value   Hazard ratio+  95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm                         1.321   1.041   .204      3.747          .487           28.804
10-14 mm*                      3.502   .485    < .001    33.195         12.826         85.911
≥ 15 mm*                       4.042   .342    < .001    56.917         29.093         111.352

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for age, gender, type of contact, BCG status and immunosuppression.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 62.

Figure 62. TST size and risk of TB. Aboriginal contacts (1st TST).
*Adjusted for age, gender, type of contact, BCG status and immunosuppression.

Foreign-born contacts

The TB rate increased progressively with larger TST size categories in foreign-born contacts, as can be seen in Table 82. The overall TB rate and the TB rate for the TST size category 0-4 mm were similar to those observed in Canadian-born contacts, but the rates were much lower for each of the TST size categories ≥ 5 mm. Compared to Aboriginal people, foreign-born contacts had a much lower overall TB rate, as well as lower rates for each TST size category.

Table 82. Number of TB cases and the TB rates for the different TST sizes (1st TST). Foreign-born contacts

TST size    Total at risk   TB cases*   TB rate/100,000
0-4 mm      5,060           9           178
5-9 mm      690             3           435
10-14 mm    1,382           8           579
≥ 15 mm     1,442           23          1,595
Total       8,574           43          502

*Number of TB cases developed during the 12-year follow-up period.

In univariate analysis the risk of developing TB among these contacts increased progressively from a TST size of 5 mm, as shown in Table 83; however, compared to the TST size category of 0-4 mm, the 5-9 mm category did not reach statistical significance.
The hazard ratios for all the TST categories were much lower than the corresponding ones for Canadian-born non-Aboriginal contacts; they were also lower than those for Aboriginal people (with the exception of the 5-9 mm category).

Table 83. TB risk according to the TST size from the 1st TST (univariate analysis). Foreign-born contacts

TST size                       B       SE      P-value   Hazard ratio   95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm                         .894    .667    .180      2.445          .662           9.030
10-14 mm*                      1.181   .486    .015      3.257          1.257          8.442
≥ 15 mm*                       2.206   .393    < .001    9.079          4.201          19.621

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

For this multivariate analysis model, the variable "immunosuppression" was excluded because of its implausibly large standard errors. The conclusions with and without it in the model were the same.

When adjusting for the other prognostic factors in the multivariate analysis, the risk of TB increased with larger TST sizes, as shown in Table 84. Even after adjusting for all the other factors in the multivariate analysis, the TST size category of 5-9 mm was not significantly different from the 0-4 mm category.

Table 84. TB risk according to the TST size from the 1st TST (multivariate analysis). Foreign-born contacts

TST size                       B       SE      P-value   Hazard ratio+  95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm                         .817    .678    .228      2.265          .599           8.561
10-14 mm*                      1.194   .506    .018      3.301          1.226          8.893
≥ 15 mm*                       2.233   .416    < .001    9.324          4.127          21.065

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for age, gender, type of contact and BCG status.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 63.

Figure 63. TST size and risk of TB. Foreign-born contacts (1st TST).
*Adjusted for age, gender, type of contact and BCG status.

Injection drug users.

The analysis of the risk of TB according to the use of injection drugs could not be performed because there were no TB cases in some of the TST size categories, as shown in Table 85.

Table 85. Number of TB cases and the TB rates for the different TST sizes (1st TST). Injection drug users

TST size    Total at risk   TB cases*   TB rate/100,000
0-4 mm      75              0           0
5-9 mm      5               0           0
10-14 mm    5               1           20,000
≥ 15 mm     9               3           33,333
Total       94              4           4,255

*Number of TB cases developed during the 12-year follow-up period. χ² = 25.3; df: 3; p-value = 0.001.

ASSESSMENT USING THE MAXIMUM SIZE OF THE 1ST OR 2ND TST

The results of the Cox model applied to all the contacts that did not receive LTBI treatment are presented in Table 86.

Table 86. Reduced Cox regression model (1st or 2nd TST). Robust variance estimate

Variable                              B        SE      P-value    Hazard ratio   95% CI lower   95% CI upper
Age*                                  -.074    .008    < .0001    .928           .914           .942
Gender*                               .320     .148    .030       1.377          1.031          1.839
Immunosuppression present*            1.365    .287    < .0001    3.916          2.229          6.880
BCG positive*                         -1.701   .208    < .0001    .182           .121           .275
Casual contact (reference category)
Household contact*                    1.772    .214    < .0001    5.882          3.866          8.950
Non-household contact*                1.084    .218    < .0001    2.957          1.927          4.538
TST size 0-4 mm (reference category)
TST size 5-9 mm*                      2.660    .508    < .0001    14.296         5.275          38.746
TST size 10-14 mm*                    4.256    .332    < .0001    70.533         36.767         135.306
TST size ≥ 15 mm*                     5.273    .299    < .0001    195.025        108.482        350.607

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05).
Model LR test=894; 9 df; p-value=0.

All the variables included in the model were statistically significant. When the maximum size of either the 1st or the 2nd TST is used, the risk of TB is greatly increased in all the TST size categories compared to the results obtained from the 1st TST.
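In a Cox model without interaction terms, coefficients add on the log-hazard scale, so the hazard ratios of different variables multiply. The sketch below illustrates this with the Table 86 coefficients, assuming that the reduced model contains only the nine main effects listed (which is consistent with its 9 degrees of freedom) and that the proportional hazards assumption holds. The function name `relative_hazard` and the example profile are hypothetical and are not results reported in the source.

```python
import math


def relative_hazard(coefficients):
    """Hazard relative to the reference profile for a set of Cox coefficients.

    Valid only for a main-effects model (no interaction terms), which is
    assumed here rather than confirmed by the source.
    """
    return math.exp(sum(coefficients))


# Table 86 coefficients (maximum of the 1st or 2nd TST).
B_HOUSEHOLD = 1.772   # household contact vs. casual contact
B_TST_15_PLUS = 5.273  # TST size >= 15 mm vs. 0-4 mm

# Illustrative profile: household contact with TST >= 15 mm, compared with a
# casual contact with TST 0-4 mm and otherwise identical covariates.
combined = relative_hazard([B_HOUSEHOLD, B_TST_15_PLUS])
product_of_hrs = math.exp(B_HOUSEHOLD) * math.exp(B_TST_15_PLUS)

print(f"combined relative hazard: {combined:.0f}")
print(f"product of the two hazard ratios: {product_of_hrs:.0f}")
```

The combined figure is a model-based extrapolation for illustration only; it has no confidence interval here, since that would require the covariance between the two coefficients, which the table does not report.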
The analyses below for the maximum of the 1st or 2nd TST (1st 2nd TST) are presented in the same fashion as those for the 1st TST.

All contacts.

The results presented below correspond to all the contacts that did not receive LTBI treatment and had a 2nd TST applied after exposure. The number of TB cases and the TB rates for the different TST size categories are presented in Table 87. The TB rates are progressively higher for larger TST sizes. Compared to the results of the 1st TST, there is an important reduction in the TB rate observed for the TST size category of 0-4 mm in the 1st 2nd TST; this is mainly explained by the redistribution of contacts from the 0-4 mm category to the larger categories, especially to the ≥ 15 mm TST size category.

Table 87. Number of TB cases and the TB rates for the different TST sizes (1st or 2nd TST). Contacts with no LTBI treatment received

TST size    Total at risk   TB cases*   TB rate/100,000
0-4 mm      20,773          18          87
5-9 mm      1,102           5           454
10-14 mm    2,211           39          1,764
≥ 15 mm     2,456           118         4,805
Total       26,542          180         678

*Number of TB cases developed during the 12-year follow-up period. χ² = 768.3; df: 3; p-value < 0.001.

The results of the univariate analysis of the hazard ratio for the TST size are presented in Table 88. The risk of TB is significantly higher in all the TST size categories compared to the 0-4 mm category.

Table 88. TB risk according to the TST size from the 1st or 2nd TST (univariate analysis). Contacts with no LTBI treatment

TST size                       B       SE      P-value   Hazard ratio   95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm*                        1.667   .506    .001      5.296          1.966          14.264
10-14 mm*                      3.031   .285    < .001    20.720         11.853         36.219
≥ 15 mm*                       4.053   .253    < .001    57.550         35.046         94.503

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

The results of the multivariate analysis are presented in Table 89.
When adjusting for the other prognostic factors, the risk of TB increased substantially with larger TST sizes. After adjustment, the risk of TB is much higher for every TST size category, starting with the 5-9 mm category. Compared with the results of the 1st TST, the hazard ratios for all TST size categories are significantly higher in the 1st 2nd TST, resulting in a better prediction of TB risk starting from the 5-9 mm TST size category.

Table 89. TB risk according to the TST size from the 1st or 2nd TST (multivariate analysis). Contacts with no LTBI treatment

TST size                       B       SE      P-value   Hazard ratio+  95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm*                        2.643   .509    < .001    14.054         5.180          38.131
10-14 mm*                      4.254   .296    < .001    70.390         39.373         125.841
≥ 15 mm*                       5.270   .264    < .001    194.469        115.912        326.266

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for age, gender, immunosuppression, type of contact and previous BCG vaccination.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 64.

Figure 64. TST size and risk of TB. Contacts with no LTBI Rx (1st 2nd TST).
*Adjusted for age, gender, immunosuppression, type of contact and previous BCG vaccination.

Type of contact

The results presented below correspond to the subgroup analysis for contacts that did not receive LTBI treatment, based on the closeness of the contacts to the TB case.

Household contacts

The TB rate for all the TST size categories ≥ 5 mm was high and increased progressively with larger TST sizes, as can be seen in Table 90. Compared to the TB rates in all the contacts that did not receive LTBI treatment, the household contacts had very high TB rates in the TST size categories ≥ 5 mm.
Compared with the results for the 1st TST, the TST size category of 0-4 mm for the 1st 2nd TST had a much lower TB rate (157 vs. 1,014), which is explained by a redistribution of contacts from the 0-4 mm category to the larger categories, especially to the ≥ 15 mm TST size category.

Table 90. Number of TB cases and the TB rates for the different TST sizes (1st or 2nd TST). Household contacts

TST size    Total at risk   TB cases*   TB rate/100,000
0-4 mm      1,906           3           157
5-9 mm      199             4           2,010
10-14 mm    423             17          4,019
≥ 15 mm     469             61          13,006
Total       2,997           85          2,836

*Number of TB cases developed during the 12-year follow-up period. p-value < 0.001.

In univariate analysis the risk of developing TB among these contacts increased progressively from a TST size of 5 mm, as shown in Table 91; the hazard ratios for all the TST sizes were much higher than those from the 1st TST, and they are all significantly different from the 0-4 mm category (whereas in the results from the 1st TST, the 5-9 mm category was not).

Table 91. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis). Household contacts

TST size                       B       SE      P-value   Hazard ratio   95% CI lower   95% CI upper
0-4 mm (reference category)
5-9 mm*                        2.555   .764    .001      12.870         2.880          57.503
10-14 mm*                      3.263   .626    < .001    26.140         7.661          89.198
≥ 15 mm*                       4.490   .591    < .001    89.160         27.975         284.168

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

When adjusting for the other prognostic factors in the multivariate analysis, the risk of TB with larger TST sizes increased substantially compared to the univariate analysis, as shown in Table 92; similarly, the adjusted hazard ratios for all the TST size categories are much higher than the corresponding results obtained for the 1st TST.

Table 92. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis).
Household contacts

TST size      B       SE     P-value   Hazard ratio+   95% CI lower   95% CI upper
0-4 mm        (reference category)
5-9 mm*       4.081   .771   < .001    59.185          13.048         268.463
10-14 mm*     4.906   .638   < .001    135.106         38.659         472.167
≥ 15 mm*      6.002   .600   < .001    404.376         124.675        1311.568

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for age, gender, immunosuppression and previous BCG vaccination.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 65.

Figure 65. TST size and risk of TB. Household contacts (1st or 2nd TST).
[Figure: survival curves by TST size category (1st or 2nd TST); x-axis: Time (years).]
*Adjusted for age, gender, immunosuppression and previous BCG vaccination.

Non-household contacts

The TB rate was much higher in the 10-14 mm and ≥ 15 mm TST size categories than in the 0-4 mm category, and especially than in the 5-9 mm category, which had no TB cases, as shown in Table 93. Compared to the household contacts, the non-household contacts had much lower TB rates overall and in every TST size category, with the exception of the 0-4 mm category, where the rate was similar. Compared to the results from the 1st TST, there was no clear redistribution of TB cases from the 0-4 mm category to the larger TST size categories, as was observed in the household contacts.

Table 93. Number of TB cases and the TB rates for the different TST sizes (1st or 2nd TST). Non-household contacts

TST size     Total at risk   TB cases*   TB rate/100,000
0-4 mm       8,296           11          133
5-9 mm       372             0           0
10-14 mm     719             13          1,808
≥ 15 mm      822             41          4,988
Total        10,209          65          634

*χ² = 297.3; df: 3; p-value < 0.001.
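As a rough check on Table 93, the category rates per 100,000 and the chi-square statistic can be recomputed from the counts alone. The sketch below assumes that the reported statistic is a Pearson chi-square on the 4 x 2 table of TST size category by TB outcome (the text does not name the test variant); the function and variable names are illustrative only.

```python
# Counts from Table 93 (non-household contacts, 1st or 2nd TST):
# category -> (total at risk, TB cases).
table_93 = {
    "0-4 mm": (8296, 11),
    "5-9 mm": (372, 0),
    "10-14 mm": (719, 13),
    ">=15 mm": (822, 41),
}


def rate_per_100k(at_risk, cases):
    """TB rate per 100,000 contacts."""
    return 100_000 * cases / at_risk


def pearson_chi_square(groups):
    """Pearson chi-square for a k x 2 table (TB / no TB per category).

    Assumption: this is the test behind the reported statistic.
    """
    total_at_risk = sum(n for n, _ in groups.values())
    total_cases = sum(c for _, c in groups.values())
    overall_risk = total_cases / total_at_risk
    chi2 = 0.0
    for n, cases in groups.values():
        expected_cases = n * overall_risk
        expected_non_cases = n * (1 - overall_risk)
        chi2 += (cases - expected_cases) ** 2 / expected_cases
        chi2 += ((n - cases) - expected_non_cases) ** 2 / expected_non_cases
    degrees_of_freedom = len(groups) - 1
    return chi2, degrees_of_freedom


rates = {size: rate_per_100k(n, c) for size, (n, c) in table_93.items()}
chi2, df = pearson_chi_square(table_93)

for size, rate in rates.items():
    print(f"{size:>9}: {rate:8.0f} per 100,000")
print(f"chi-square = {chi2:.1f}, df = {df}")
```

Under that assumption the recomputed statistic should land close to the value given in the table footnote.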
In univariate analysis the risk of developing TB among non-household contacts increased progressively after a TST size of 10 mm, as shown in Table 94. The 5-9 mm TST size category was not significantly different from the 0-4 mm category; in fact, since there were no TB cases in the 5-9 mm category, the estimates obtained from the statistical package for this category are not reliable.

Table 94. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis). Non-household contacts

TST size      B         SE        P-value   Hazard ratio    95% CI lower   95% CI upper
0-4 mm        (reference category)
5-9 mm        -10.282   245.677   .967      .000            .000
10-14 mm*     2.635     .410      .000      13.941          6.245          31.118
≥ 15 mm*      3.670     .340      .000      39.265          20.181         76.396

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

After adjusting for the other prognostic factors in the multivariate analysis, the risk of TB with TST sizes of 10 mm or larger increased substantially compared to the 0-4 mm TST size category, as shown in Table 95. Since there were no TB cases in the 5-9 mm TST size category, no meaningful estimates were obtained for it. Compared with the results obtained using the 1st TST, the adjusted hazard ratios are noticeably higher for the 1st or 2nd TST, but the conclusions remain the same.

Table 95. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). Non-household contacts

TST size      B         SE        P-value   Hazard ratio+   95% CI lower   95% CI upper
0-4 mm        (reference category)
5-9 mm        -10.529   488.277   .983      .000            .000
10-14 mm*     3.966     .430      < .001    52.785          22.736         122.551
≥ 15 mm*      5.061     .362      < .001    157.742         77.539         320.902

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for age, gender, immunosuppression and previous BCG vaccination.
*Statistically significant compared to the TST size category 0-4 mm.
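The relation between the columns of Table 95 can be illustrated with a short calculation. The sketch below assumes the usual Wald-type construction, hazard ratio = exp(B) and 95% CI = exp(B ± 1.96 × SE); the text does not state the formula, but the tabulated values are consistent with it up to rounding of B and SE. The helper name `hazard_ratio_with_ci` is hypothetical.

```python
import math


def hazard_ratio_with_ci(b, se, z=1.96):
    """Return (hazard ratio, lower, upper) from a coefficient and its SE.

    Assumption: Wald-type interval, exp(b +/- z * se).
    """
    def safe_exp(x):
        # math.exp raises OverflowError for very large arguments;
        # report such bounds as infinite instead.
        try:
            return math.exp(x)
        except OverflowError:
            return math.inf

    return safe_exp(b), safe_exp(b - z * se), safe_exp(b + z * se)


# 10-14 mm category, Table 95: close to the tabulated 52.785 (22.736 to 122.551).
hr_10_14 = hazard_ratio_with_ci(3.966, 0.430)

# 5-9 mm category, Table 95: no TB cases were observed, so the coefficient
# drifts towards minus infinity and the standard error becomes enormous.
hr_5_9 = hazard_ratio_with_ci(-10.529, 488.277)

print("10-14 mm:", tuple(round(v, 3) for v in hr_10_14))
print("5-9 mm:  ", hr_5_9)
```

For the 5-9 mm category the interval runs from effectively zero to infinity, which is why the tabulated estimates for that row carry no information.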
The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 66.

Figure 66. TST size and risk of TB. Non-household contacts (1st or 2nd TST).
[Figure: survival curves by TST size category; x-axis: Time (years).]
*Adjusted for age, gender, immunosuppression and previous BCG vaccination.

Casual contacts

The TB rate increased progressively with larger TST sizes across all the TST size categories, as shown in Table 96. Compared to the household and non-household contacts, the casual contacts had lower TB rates overall and in every TST size category (with the exception of the 5-9 mm category in the non-household contacts, which had no TB cases). Compared with the results using the 1st TST, the TB rate for the 1st or 2nd TST was lower in the 0-4 mm TST size category and slightly higher in the ≥ 15 mm category; thus, there was a redistribution of contacts from the 0-4 mm to the ≥ 15 mm TST size category.

Table 96. Number of TB cases and the TB rates for the different TST sizes (1st or 2nd TST). Casual contacts

TST size     Total at risk   TB cases*   TB rate/100,000
0-4 mm       10,536          4           38
5-9 mm       528             1           189
10-14 mm     1,062           9           847
≥ 15 mm      1,159           16          1,381
Total        13,285          30          226

*p-value < 0.001.

In univariate analysis the risk of developing TB among casual contacts increased progressively after a TST size of 10 mm, as shown in Table 97; the 5-9 mm TST size category was not significantly different from the 0-4 mm category. Compared to the results using the 1st TST, the hazard ratios are higher in every TST size category but the conclusions remain the same.

Table 97. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis).
Casual contacts

TST size      B       SE      P-value   Hazard ratio    95% CI lower   95% CI upper
0-4 mm        (reference category)
5-9 mm        1.615   1.118   .149      5.026           .562           44.970
10-14 mm*     3.126   .601    < .001    22.780          7.015          73.975
≥ 15 mm*      3.616   .559    < .001    37.203          12.437         111.287

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

After adjusting for the other prognostic factors in the multivariate analysis, the risk of TB for TST sizes ≥ 10 mm was significantly higher than the risk for the 0-4 mm TST size category, as shown in Table 98. The 5-9 mm TST size category almost reached statistical significance after adjustment (p = .052). The hazard ratios for all categories increased compared to the univariate analysis; they are also larger than the results obtained using the 1st TST.

Table 98. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). Casual contacts

TST size      B       SE      P-value   Hazard ratio+   95% CI lower   95% CI upper
0-4 mm        (reference category)
5-9 mm        2.178   1.122   .052      8.830           .978           79.676
10-14 mm*     4.088   .626    < .001    59.615          17.483         203.277
≥ 15 mm*      4.734   .598    < .001    113.746         35.216         367.389

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for age, gender, immunosuppression and previous BCG vaccination.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 67.

Figure 67. TST size and risk of TB. Casual contacts (1st or 2nd TST).
[Figure: survival curves by TST size category.]
*Adjusted for age, gender, immunosuppression and previous BCG vaccination.

Immunosuppression status

The results presented below correspond to the subgroup analysis for contacts that did not receive LTBI treatment, based on whether the contacts were immunosuppressed.
Non-immunosuppressed contacts

The TB rate increased progressively with larger TST sizes across all the TST size categories, as shown in Table 99. Compared to the results using the 1st TST, the TB rate for the 1st or 2nd TST was lower in the 0-4 mm and 5-9 mm TST size categories and higher in the ≥ 15 mm category; thus, there was a redistribution of contacts from the 0-4 mm to the larger categories, especially to the ≥ 15 mm TST size category.

Table 99. Number of TB cases and the TB rates for the different TST sizes (1st or 2nd TST). Non-immunosuppressed contacts

TST size     Total at risk   TB cases*   TB rate/100,000
0-4 mm       19,941          15          75
5-9 mm       1,045           4           383
10-14 mm     2,083           36          1,728
≥ 15 mm      2,277           108         4,743
Total        25,346          163         643

*χ² = 739.2; df: 3; p-value < 0.001.

In univariate analysis the risk of developing TB among non-immunosuppressed contacts increased significantly after a TST size of 5 mm, as shown in Table 100; compared to the results when using the 1st TST, the hazard ratios for all TST size categories were much higher for the 1st or 2nd TST.

Table 100. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis). Non-immunosuppressed contacts

TST size      B       SE     P-value   Hazard ratio    95% CI lower   95% CI upper
0-4 mm        (reference category)
5-9 mm*       1.638   .563   .004      5.145           1.708          15.503
10-14 mm*     3.152   .307   < .001    23.373          12.797         42.689
≥ 15 mm*      4.180   .276   < .001    65.377          38.095         112.198

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

After adjusting for the other prognostic factors in the multivariate analysis, the risk of TB increased substantially with larger TST sizes, as shown in Table 101; all the TST size categories were highly significantly associated with an increased risk of developing TB when compared to the 0-4 mm TST size category.
As observed in the univariate analysis, the hazard ratios for all the TST size categories are considerably higher when using the 1st or 2nd TST than when using the 1st TST, although the conclusions remain the same.

Table 101. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). Non-immunosuppressed contacts

TST size      B       SE     P-value   Hazard ratio+   95% CI lower   95% CI upper
0-4 mm        (reference category)
5-9 mm*       2.672   .566   < .001    14.475          4.770          43.925
10-14 mm*     4.414   .318   < .001    82.611          44.253         154.218
≥ 15 mm*      5.429   .286   < .001    227.874         130.193        398.845

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for age, gender, type of contact and previous BCG vaccination.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 68.

Figure 68. TST size and risk of TB. Non-immunosuppressed contacts (1st or 2nd TST).
[Figure: survival curves by TST size category.]
*Adjusted for age, gender, type of contact and previous BCG vaccination.

Immunosuppressed contacts

The TB rate for all the TST size categories was high and increased progressively with larger TST sizes, especially after a TST size of 5 mm, as shown in Table 102. Compared to the non-immunosuppressed contacts, the immunosuppressed contacts had higher TB rates overall and in every TST size category. The TB rates for the 0-4 mm and 5-9 mm TST size categories were approximately 4 times higher in the immunosuppressed contacts than in the non-immunosuppressed contacts.
However, the TB rate in the immunosuppressed contacts was significantly higher than in the non-immunosuppressed contacts only in the 0-4 mm TST size category (Fisher's exact test p-value = 0.033); for the 5-9 mm TST size category the difference in TB rates between immunosuppressed and non-immunosuppressed contacts was not statistically significant, owing to the small number of immunosuppressed contacts and TB cases in this category (Fisher's exact test p-value = 0.234). Compared to the results when using the 1st TST, the TB rate for the 1st or 2nd TST was clearly lower in the 0-4 mm category (and slightly lower in the 5-9 mm category), while it was higher in the 10-14 mm and ≥ 15 mm categories; thus, there was a redistribution of contacts from the 0-4 mm to the larger categories, especially to the ≥ 15 mm TST size category.

Table 102. Number of TB cases and the TB rates for the different TST sizes (1st or 2nd TST). Immunosuppressed contacts

TST size     Total at risk   TB cases*   TB rate/100,000
0-4 mm       832             3           361
5-9 mm       57              1           1,754
10-14 mm     128             3           2,344
≥ 15 mm      179             10          5,587
Total        1,196           17          1,421

In univariate analysis the risk of developing TB among immunosuppressed contacts increased progressively after a TST size of 10 mm, as shown in Table 103; the 5-9 mm TST size category was not significantly different from the 0-4 mm category. Compared to the results obtained when using the 1st TST, the hazard ratios are higher for all the TST size categories, but the conclusions remain the same.

Table 103. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis). Immunosuppressed contacts

TST size      B       SE      P-value   Hazard ratio    95% CI lower   95% CI upper
0-4 mm        (reference category)
5-9 mm        1.575   1.155   .173      4.830           .502           46.431
10-14 mm*     1.860   .817    .023      6.424           1.296          31.829
≥ 15 mm*      2.747   .658    .000      15.602          4.294          56.694

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.
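The Fisher's exact test p-values quoted above for the comparison of immunosuppressed and non-immunosuppressed contacts can be approximately reproduced from the counts in Tables 99 and 102. The sketch below is one possible reading, not necessarily the exact procedure used: it assumes a two-sided test on a 2 x 2 table of immunosuppression status by TB outcome within a single TST size category, and it uses only the standard library. The function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    With all margins fixed, sums the hypergeometric probabilities of every
    table that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Rows: immunosuppressed, non-immunosuppressed. Columns: TB, no TB.
# 0-4 mm category: 3 of 832 versus 15 of 19,941.
p_0_4 = fisher_exact_two_sided(3, 832 - 3, 15, 19941 - 15)
# 5-9 mm category: 1 of 57 versus 4 of 1,045.
p_5_9 = fisher_exact_two_sided(1, 57 - 1, 4, 1045 - 4)

print(f"0-4 mm: p = {p_0_4:.3f}")
print(f"5-9 mm: p = {p_5_9:.3f}")
```

With only one TB case among 57 immunosuppressed contacts in the 5-9 mm category, a fourfold difference in rates is still compatible with chance, which matches the interpretation given in the text.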
After adjusting for the other prognostic factors in the multivariate analysis, the risk of TB increased with larger TST sizes, as shown in Table 104. However, the increased risk of TB in the 5-9 mm TST size category was still not statistically significant compared to the 0-4 mm category. Compared to the univariate analysis, the hazard ratios for the 10-14 mm and ≥ 15 mm categories increased. Compared to the results when using the 1st TST, the hazard ratios for all the TST size categories are higher when using the 1st or 2nd TST, and the hazard ratio of the 10-14 mm category became statistically significant.

Table 104. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). Immunosuppressed contacts

TST size      B       SE      P-value   Hazard ratio+   95% CI lower   95% CI upper
0-4 mm        (reference category)
5-9 mm        1.600   1.174   .173      4.952           .496           49.475
10-14 mm*     2.557   .882    .004      12.899          2.290          72.652
≥ 15 mm*      3.508   .740    < .001    33.375          7.822          142.401

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for age, gender, type of contact and previous BCG vaccination.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 69.

Figure 69. TST size and risk of TB. Immunosuppressed contacts (1st or 2nd TST).
[Figure: survival curves by TST size category (legend: ≥ 15 mm, 10-14 mm, 5-9 mm, 0-4 mm); x-axis: Time (years).]
*Adjusted for age, gender, type of contact and previous BCG vaccination.

BCG status

The results presented below correspond to the subgroup analysis for contacts that did not receive LTBI treatment, based on whether the contacts had received BCG. Contacts with uncertain BCG status are included with the contacts with no previous BCG.
Contacts with no previous BCG

The TB rate increased progressively with larger TST sizes across all the TST size categories, as shown in Table 105. Compared with the results using the 1st TST, the 1st or 2nd TST gave a noticeably lower TB rate in the 0-4 mm and 5-9 mm TST size categories and a higher TB rate in the ≥ 15 mm category; this is explained by a redistribution of contacts from the 0-4 mm to the larger categories, especially to the ≥ 15 mm TST size category.

Table 105. Number of TB cases and the TB rates for the different TST sizes (1st or 2nd TST). No previous BCG

TST size     Total at risk   TB cases*   TB rate/100,000
0-4 mm       19,189          15          78
5-9 mm       573             3           524
10-14 mm     1,027           34          3,311
≥ 15 mm      1,294           97          7,496
Total        22,083          149         675

*χ² = 1,107.0; df: 3; p-value < 0.001.

In univariate analysis the risk of developing TB among contacts with no previous BCG increased progressively after a TST size of 5 mm, as shown in Table 106; all the TST size categories were significantly different from the 0-4 mm category. Compared to the results using the 1st TST, the hazard ratios for the 1st or 2nd TST were higher in all the TST size categories, but the conclusions were unchanged.

Table 106. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis). No previous BCG

TST size      B       SE     P-value   Hazard ratio    95% CI lower   95% CI upper
0-4 mm        (reference category)
5-9 mm*       1.913   .632   .002      6.776           1.962          23.407
10-14 mm*     3.773   .310   .000      43.521          23.705         79.901
≥ 15 mm*      4.618   .277   .000      101.294         58.803         174.492

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.
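As a plausibility check, the crude risk ratios implied by the counts in Table 105 can be set beside the univariate hazard ratios in Table 106. This is only a sketch: a crude ratio of cumulative risks ignores follow-up time and censoring, so it is not the Cox model estimate and is expected to agree with it only roughly. The function name and the choice of the 0-4 mm category as reference are taken from the tables; everything else is illustrative.

```python
# Counts from Table 105 (no previous BCG, 1st or 2nd TST):
# category -> (total at risk, TB cases).
table_105 = {
    "0-4 mm": (19189, 15),
    "5-9 mm": (573, 3),
    "10-14 mm": (1027, 34),
    ">=15 mm": (1294, 97),
}

# Univariate hazard ratios as reported in Table 106.
reported_hr = {"5-9 mm": 6.776, "10-14 mm": 43.521, ">=15 mm": 101.294}


def crude_risk_ratios(groups, reference="0-4 mm"):
    """Ratio of cumulative TB risk in each category to the reference category."""
    ref_at_risk, ref_cases = groups[reference]
    ref_risk = ref_cases / ref_at_risk
    return {
        size: (cases / at_risk) / ref_risk
        for size, (at_risk, cases) in groups.items()
        if size != reference
    }


ratios = crude_risk_ratios(table_105)
for size, ratio in ratios.items():
    print(f"{size:>9}: crude risk ratio {ratio:7.2f}, reported HR {reported_hr[size]:7.3f}")
```

The crude ratios come out slightly below the reported hazard ratios but of the same order, which is what one would expect when most TB cases occur early in follow-up.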
After adjusting for the other prognostic factors in the multivariate analysis, the risk of TB increased substantially with larger TST sizes, as shown in Table 107; all the TST size categories are highly significantly associated with an increased risk of developing TB when compared to the 0-4 mm TST size category. Compared to the results using the 1st TST, the hazard ratios for the 1st or 2nd TST were higher in all the TST size categories, but the conclusions were unchanged.

Table 107. TB risk according to the TST size of the 1st or 2nd TST (multivariate analysis). No previous BCG

TST size      B       SE     P-value   Hazard ratio+   95% CI lower   95% CI upper
0-4 mm        (reference category)
5-9 mm*       2.497   .635   < .001    12.143          3.499          42.142
10-14 mm*     4.598   .318   < .001    99.333          53.243         185.322
≥ 15 mm*      5.417   .287   < .001    225.148         128.198        395.418

B = Coefficient. SE = Standard error.
+Hazard ratio adjusted for age, gender, type of contact and immunosuppression.
*Statistically significant compared to the TST size category 0-4 mm.

The corresponding Kaplan-Meier adjusted survival curve* for the risk of TB according to the TST size is shown in Figure 70.

Figure 70. TST size and risk of TB. Contacts with no previous BCG (1st or 2nd TST).
[Figure: survival curves by TST size category (1st or 2nd TST); x-axis: Time (years).]
*Adjusted for age, gender, type of contact and immunosuppression.

Contacts with previous BCG

The TB rate increased progressively with larger TST sizes, as shown in Table 108. Compared with contacts with no (or unknown) previous BCG, the TB rate in contacts with previous BCG was much higher for the 0-4 mm TST size category (this was also observed when using the results of the 1st TST), while the TB rates in all the other categories were significantly lower than the corresponding ones for the contacts with no previous BCG; however, the overall TB rate was very similar in BCG-positive and BCG-negative contacts.
Compared with the results obtained using the 1st TST, the TB rates observed when using the 1st or 2nd TST showed a small but steady increase from the 0-4 mm to the 10-14 mm category; a large increase in the TB rate for TST sizes ≥ 15 mm was observed with both the 1st TST and the 1st or 2nd TST.

Table 108. Number of TB cases and the TB rates for the different TST sizes (1st or 2nd TST). Previous BCG

TST size     Total at risk   TB cases*   TB rate/100,000
0-4 mm       1,584           3           189
5-9 mm       529             2           378
10-14 mm     1,184           5           422
≥ 15 mm      1,162           21          1,807
Total        4,459           31          695

*χ² = 28.7; df: 3; p-value < 0.001.

In univariate analysis the risk of developing TB among contacts with previous BCG was significantly higher in the ≥ 15 mm TST category than in the 0-4 mm category, as shown in Table 109; the other TST size categories (5-9 mm and 10-14 mm) were not significantly different from the 0-4 mm category. Compared with the results from the 1st TST, the hazard ratios for the 1st or 2nd TST are higher, but the conclusions are the same.

Table 109. TB risk according to the TST size of the 1st or 2nd TST (univariate analysis). Previous BCG

TST size      B       SE     P-value   Hazard ratio    95% CI lower   95% CI upper
0-4 mm        (reference category)
5-9 mm        .702    .913   .442      2.019           .337           12.081
10-14 mm      .812    .730   .266      2.252           .538           9.422
≥ 15 mm*      2.274   .617   < .001    9.715           2.898          32.571

B = Coefficient. SE = Standard error.
*Statistically significant (p<0.05) compared to the 0-4 mm category.

When adjusting for the other prognostic factors in the multi