UBC Theses and Dissertations

Occupational exposure to polycyclic aromatic hydrocarbons, breast cancer risk, and interactions with… Lee, Derrick Guang-Yuh 2016

Occupational exposure to polycyclic aromatic hydrocarbons, breast cancer risk, and interactions with genetic variants

by

Derrick Guang-Yuh Lee

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

The Faculty of Graduate and Postdoctoral Studies
(Population and Public Health)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)

December 2016

© Derrick Guang-Yuh Lee, 2016

Abstract

Polycyclic aromatic hydrocarbons (PAHs) are created by the incomplete combustion of fossil fuels, and are established human carcinogens. However, the effect of occupational PAH exposure on breast cancer is not well established. In addition, it is not known if genes involved in metabolizing xenobiotic compounds modify the risk of breast cancer in women exposed to PAHs.

The objectives of this study were to (1) estimate the association between PAH exposure and breast cancer, (2) examine how variants of select xenobiotic metabolizing genes influence breast cancer risk, and (3) assess how these variants, several of which are involved in PAH metabolism, interact with PAH exposure to modify breast cancer risk.

The relationships between PAH exposure, genetic susceptibility, and breast cancer were examined in a population-based case-control study conducted in Vancouver, BC and Kingston, Ontario. A detailed questionnaire, including occupational history, and a biological sample were collected from participants. Chapter 2 details how I developed a statistical model that predicts the probability of exceeding the permissible exposure limit for PAH through industry and occupation using workplace compliance testing data collected in the United States. Chapter 3 describes the use of the model to develop a job exposure matrix and estimate the association between PAH exposure and breast cancer.
In Chapter 4, I assessed the associations between select gene variants and breast cancer, and evaluated whether there is evidence that those variants modify the effect of PAH exposure on breast cancer risk.

Long-term exposure to PAHs was identified as a risk factor for breast cancer, and risk was highest among premenopausal women and women with a first-degree family history of breast cancer. Six variants in xenobiotic metabolizing genes were observed to be related to breast cancer risk, three of which are directly involved in PAH metabolism. In addition, there is evidence to support the notion that three of these variants modify the effect of PAH exposure, implicating gene-environment interactions in modifying breast cancer risk. Evidence from this research points to the potential importance of monitoring and limiting occupational exposures to PAHs in order to reduce breast cancer risk in women.

Preface

Some of the research chapters of this thesis are constructed as scientific manuscripts and will be submitted, have been submitted for peer review, or are published in a peer-reviewed journal. This statement is to certify that the work presented in this thesis was conducted, analyzed, written, and disseminated by Derrick Guang-Yuh Lee. All research described was approved by the University of British Columbia – British Columbia Cancer Agency Research Ethics Board (UBC BCCA REB Certificate Number: R04-0142) and the Queen’s University Health Sciences Research Ethics Board (QU HSREB Certificate Number: EPID156-03) for the studies in Chapters 3 and 4.

A version of Chapter 2 has been published as “Lee DG, Lavoué J, Spinelli JJ, Burstyn I. Statistical modeling of occupational exposure to polycyclic aromatic hydrocarbons using OSHA data. Journal of Occupational and Environmental Hygiene. (2015) 12(10):729-42.” DG Lee cleaned and organized the data provided by co-authors (JL and IB), performed the statistical analysis, and wrote the manuscript.
Co-authors were involved in reviewing and revising the manuscript.

A version of Chapter 3 has been written with the intent to submit for peer-reviewed publication, “Lee DG, Burstyn I, Lai AS, Grundy A, Friesen MC, Lavoué J, Aronson KJ, Spinelli JJ. Women’s occupational exposure to polycyclic aromatic hydrocarbons and risk of breast cancer.” DG Lee and A Grundy cleaned and organized the data; DG Lee performed all statistical analyses and wrote the manuscript. Co-authors (HK, MF) provided access to and guidance in the use of the DOM and NCI-JEMs, respectively, as described in Chapter 3; co-authors (JL, JJS, IB) were involved in the PPM-JEM as described in Chapter 2 and were involved in devising its application to the analyses detailed in Section 3.2. All co-authors were involved in reviewing and revising the manuscript.

A version of Chapter 4 has been written with the intent to submit for peer-reviewed publication, “Lee DG, Schuetz JM, Lai AS, Burstyn I, Brooks-Wilson A, Aronson KJ, Spinelli JJ. Interaction between exposure to polycyclic aromatic hydrocarbons and xenobiotic metabolizing genes, and risk of breast cancer.” DG Lee performed all statistical analyses and wrote the manuscript. Co-author (JMS) performed quality control on all genotyping assayed by Illumina GoldenGate as described in Chapter 4, and co-authors (JL, JJS, IB) were involved in the PPM-JEM described in Chapter 2 and its application to the analyses detailed in Section 4.2. All co-authors were involved in reviewing and revising the manuscript.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication
1 Introduction
1.1 Overview of Polycyclic Aromatic Hydrocarbons
1.1.1 Exposure to PAHs
1.1.2 PAH Metabolism and Formation of Reactive Intermediates
1.2 Breast Cancer
1.2.1 Descriptive Epidemiology
1.2.2 Established Risk Factors for Breast Cancer
1.2.3 Environmental Factors
1.2.4 Hormonal Factors and Their Relationship with PAH Metabolism
1.2.5 Genetic and Familial Susceptibility
1.3 Other Genetic Variations and Metabolism
1.4 Gene-Environment Interactions
1.5 Research Purpose
1.6 Tables
1.7 Figures
2 Estimating Polycyclic Aromatic Hydrocarbon Exposure
2.1 Occupational Safety and Health Administration Databanks
2.1.1 Data Preparation and Definition of Exposure Limits
2.1.2 Regrouping Industry Codes
2.1.3 Coding Occupations Within Industries for IMIS Dataset
2.2 Statistical Modeling: Predictive Probability Model
2.3 Results
2.3.1 Descriptive Analysis
2.3.2 Comparison of IMIS and CEHD Datasets
2.3.3 Probability of Exposure Above PEL According to Industry SIC Code
2.3.4 Probability of Exposure Above PEL According to Industry NAICS Code
2.3.5 Comparison of Exposure Above PEL Between Industry Codes
2.3.6 Probability of Exposure Above PEL Incorporating Industry and Occupation: Subset Analysis using IMIS Dataset
2.4 Discussion
2.5 Conclusions
2.6 Tables
2.7 Figures
3 Association Between Occupational PAH Exposure and Breast Cancer
3.1 Methods
3.1.1 Study Population
3.1.2 Questionnaire
3.1.3 Assessment of Menopausal Status
3.1.4 Occupational Exposure Assessment
3.1.5 Statistical Analysis
3.2 Results
3.3 Discussion
3.4 Tables
4 Metabolizing-Genes and Breast Carcinogenesis
4.1 Methods
4.1.1 Gene and Variant Selection
4.1.2 Genotyping
4.1.3 Quality Control Procedures
4.1.4 Statistical Analysis
4.2 Results
4.2.1 Genetic Susceptibilities and Breast Cancer Risk
4.2.2 Modification of Gene-Related Breast Cancer Risk by PAH Exposure
4.3 Discussion
4.4 Tables
4.5 Figures
5 Conclusion
5.1 Summary of the Study Findings
5.1.1 Estimating Occupational PAH Exposure
5.1.2 The Association Between Occupational PAH Exposure and Breast Cancer
5.1.3 Genetic Susceptibility in Xenobiotic Metabolizing Genes and Their Effect on Breast Cancer Risk
5.2 Strengths and Limitations
5.3 Implications and Future Research
5.4 Concluding Remarks
Bibliography
Appendices

List of Tables

Table 1-1: Sixteen polycyclic aromatic hydrocarbons deemed to be priority pollutants by the US Environmental Protection Agency
Table 2-1: Characteristics of Measurements of Coal Tar Pitch Volatiles (mg·m⁻³) in OSHA databanks
Table 2-2: Descriptive Statistics of Concentrations of Coal Tar Pitch Volatiles (mg·m⁻³) in OSHA data by regrouped SIC Code with 20 or more measurements
Table 2-3: Descriptive Statistics of Concentrations of Coal Tar Pitch Volatiles (mg·m⁻³) in the IMIS databank by Occupation Category derived from Free Text
Table 2-4: Regrouped SIC coefficients and Predicted Probabilities of exceeding the PEL (0.2 mg·m⁻³) for PAHs assessed as Concentrations of Coal Tar Pitch Volatiles in 1980 from the OSHA databank for SIC code with 20 or more measurements
Table 2-5: Regrouped SIC coefficients and comparison of Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles in 1980 from the IMIS subset databank for Minimally Skilled Workers/Labourers and Administrative Personnel for SIC codes with 20 or more measurements
Table 2-6: Regrouped SIC coefficients and comparison between the OSHA databank and the IMIS subset databank incorporating Occupation groups Minimally Skilled Workers/Labourers and Administrative Personnel of Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles standardized to 1980 for SIC Codes with 20 or more measurements
Table 3-1: Descriptive Statistics of the Study Population
Table 3-2: PAH exposure and breast cancer risk based on variations of the job exposure matrices‡
Table 3-3: Polycyclic aromatic hydrocarbon exposure and breast cancer risk stratified by menopausal status‡
Table 3-4: Agreement of job exposure to polycyclic aromatic hydrocarbon within 3,514 jobs identified by NAICS and SOC codes between job exposure matrices – Cohen’s Kappa (κ) inter-rater reliability (95% Confidence Interval)
Table 3-5: Agreement of job exposure to polycyclic aromatic hydrocarbon within 3,514 jobs identified by NAICS and SOC codes between the PPM-JEM with a modified exposure definition as a function of the percentage of the permissible exposure limit to match the prevalence of the exposed jobs based on the DOM and NCI-JEM – Cohen’s Kappa (κ) inter-rater reliability (95% Confidence Interval)
Table 3-6: Agreement of “high” classification job exposure to polycyclic aromatic hydrocarbon within 3,514 jobs identified by NAICS and SOC codes between job exposure matrices – Cohen’s Kappa (κ) inter-rater reliability (95% Confidence Interval)
Table 4-1: Genetic analysis using gene-based permutations under inheritance-specific models
Table 4-2: Breast cancer odds ratios by genotype-exposure (duration at high PAH exposure) stratum based on co-dominant model for select SNPs with potential modifying effects‡
Table 4-3: Breast cancer odds ratios by genotype-exposure (ever-never for duration at high PAH exposure) stratum based on co-dominant model for select SNPs with potential modifying effects‡
Supplementary Table A1: Descriptive Statistics of Concentrations of Coal Tar Pitch Volatiles (mg·m⁻³) in OSHA data by regrouped SIC Code
Supplementary Table A2: Descriptive Statistics of Concentrations of Coal Tar Pitch Volatiles (mg·m⁻³) in OSHA data by regrouped NAICS Codes
Supplementary Table A3: Regrouped SIC coefficients and Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles in 1980 from IMIS databank
Supplementary Table A4: Regrouped NAICS coefficients and Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles in 1980 from IMIS databank
Supplementary Table A5: Regrouped SIC coefficients and comparison of Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles in 1980 for Minimally Skilled Workers and Administrative Personnel
Supplementary Table A6: Regrouped NAICS coefficients and Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles in 1980 for Minimally Skilled Workers and Administrative Personnel
Supplementary Table A7: Comparison between OSHA databank and IMIS databank incorporating Occupation groups Minimally Skilled Workers and Administrative Personnel of Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles standardized to 1980 for select SIC codes.
Supplementary Table A8: Comparison between OSHA and IMIS databank incorporating Occupation groups Minimally Skilled Workers and Administrative Personnel of Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles standardized to 1980 for select NAICS codes.
Supplementary Table A9: Exposure assessment to PAHs and breast cancer risk stratified by hormone receptor status‡
Supplementary Table A10: Sensitivity analysis of exposure assessment to polycyclic aromatic hydrocarbons and breast cancer risk excluding cases not registered with the BC Screening Mammography Program‡
Supplementary Table A11: Exposure assessment to polycyclic aromatic hydrocarbons and breast cancer risk stratified by smoking status‡
Supplementary Table A12: Exposure assessment to polycyclic aromatic hydrocarbons and breast cancer risk stratified by European and Asian ethnicities‡
Supplementary Table A13: Exposure assessment to polycyclic aromatic hydrocarbons and breast cancer risk stratified by socioeconomic status‡
Supplementary Table A14: Exposure assessment to polycyclic aromatic hydrocarbons and breast cancer risk stratified by body mass index‡
Supplementary Table A15: Exposure assessment to polycyclic aromatic hydrocarbons and breast cancer risk stratified by (first degree) family history‡
Supplementary Table A16: Top 10 most common at risk industries (NAICS 3-digit code) based on Yes-No JEM-specific exposure classification using the 3,514 unique occupations observed across all participants∆.
Supplementary Table A17: Breast cancer odds ratios by genotype-menopausal stratum under inheritance-specific models among European women
Supplementary Table A18: Genetic analysis for Asian women using gene-based permutations under inheritance-specific models
Supplementary Table A19: Breast cancer odds ratios by genotype-exposure (duration at high PAH exposure) stratum based on co-dominant model for remaining gene-representative SNPs among European women‡
Supplementary Table A20: Breast cancer odds ratios by genotype-exposure (average probability of PAH exposure) stratum based on co-dominant model for gene-representative SNPs among European women‡
Supplementary Table A21: Breast cancer odds ratios by genotype-exposure (weighted duration of PAH exposure) stratum based on co-dominant model for gene-representative SNPs among European women‡
Supplementary Table A22: Breast cancer odds ratios by genotype-exposure (ever-never for duration at high PAH exposure) stratum based on co-dominant model for remaining SNPs with potential modifying effects among European women‡
Supplementary Table A23: Breast cancer odds ratios by genotype-exposure (ever-never for average probability of PAH exposure∆) stratum based on co-dominant model for remaining SNPs with potential modifying effects among European women‡
Supplementary Table A24: Breast cancer odds ratios by genotype-exposure (ever-never for smoking) stratum based on co-dominant model for remaining SNPs with potential modifying effects among European women‡
Supplementary Table A25: Breast cancer odds ratios by genotype-exposure (ever-never for duration at high PAH exposure) stratum based on co-dominant model for remaining SNPs with potential modifying effects among European women stratified by smoking status‡
Supplementary Table A26: Breast cancer odds ratios by genotype-exposure (ever-never for average probability of PAH exposure∆) stratum based on co-dominant model for remaining SNPs with potential modifying effects among European women stratified by smoking status‡

List of Figures

Figure 1-1: PAH metabolism of benzo[a]pyrene (BaP) and the formation of carcinogenic intermediate
Figure 2-1: Quality control flow chart of PAH measurements used in the predictive probability models.
Figure 2-2: Cumulative distribution functions of detectable concentrations of Coal Tar Pitch Volatiles in All, CEHD, CEHD-only, IMIS, IMIS-only, and both-IMIS-CEHD data.
Figure 2-3: Bland-Altman plot comparing the difference in the predicted EF for a SIC and its corresponding NAICS code against the average between the predicted EFs of the SIC-NAICS combinations assessed in 1980 from the IMIS databank: Regrouped coding (Left graph) and original coding (Right graph) with marker size scaled for the number of companies in the code. Only codes that have one-to-one mapping between NAICS and SIC codes are plotted.
Figure 2-4: Bland-Altman plot comparing the difference in the predicted EF for a SIC and its corresponding NAICS code against the average between the predicted EFs of the SIC-NAICS combinations assessed in 1980 from the IMIS databank: Regrouped coding (Left graph) and original coding (Right graph) with marker size scaled for the number of companies in the code. Only codes that have multiple NAICS (SIC)-to-one SIC (NAICS) mapping are plotted.
Figure 2-5: Bland-Altman plot comparing the difference in the predicted EF for a SIC and its corresponding NAICS code against the average between the predicted EFs of the SIC-NAICS combinations assessed in 1980 from the IMIS databank: Regrouped coding (Left graph) and original coding (Right graph) with marker size scaled for the number of companies in the code. All NAICS to SIC mappings are plotted.
Figure 4-1: Quality control flow chart of metabolism-related SNPs
Figure 4-2: Quality control flow chart of assay samples

List of Abbreviations (Alphabetical order)

AIM      Ancestral Informative Marker
BaP      Benzo[a]pyrene
BC       British Columbia
BMI      Body Mass Index
CEHD     Chemical Exposure Health Data
CEU      Northern Europeans from Utah
CHB      Han Chinese in Beijing, China
CHD      Chinese in Metropolitan Denver, Colorado
CI       Confidence Interval
CTPV     Coal Tar Pitch Volatiles
CYP450   Cytochrome P450 enzyme
EF       Exceedance Fraction
ER       Estrogen Receptor: Positive (+) / Negative (–)
FDR      False Discovery Rate
GxE      Gene-Environment Interaction
HRT      Hormone Replacement Therapy
IARC     International Agency for Research on Cancer
IMIS     Integrated Management Information System
ISCO     International Standard Classification of Occupations
JEM      Job Exposure Matrix
JPT      Japanese in Tokyo, Japan
LD       Linkage Disequilibrium
LOD      Limit of Detection
LRT      Likelihood Ratio Test
MAF      Minor Allele Frequency
NAICS    North American Industry Classification System
NCI      National Cancer Institute
ND       Non-Detect
NSAID    Non-Steroidal Anti-Inflammatory Drug
OEDB     Occupational Exposure Databank
ON       Ontario
OR       Odds Ratio
OSHA     Occupational Safety & Health Administration
PAH      Polycyclic Aromatic Hydrocarbon
PEL      Permissible Exposure Limit
PPM      Predictive Probability Model
PR       Progesterone Receptor: Positive (+) / Negative (–)
SIC      Standard Industrial Classification
SNP      Single Nucleotide Polymorphism
SOC      Standard Occupational Classification
TSI      Toscans in Italy
TWA      Time-weighted Average
UTR      Untranslated Region
YRI      Yoruba in Ibadan, Nigeria

Acknowledgements

I would like to recognize the immense contribution of, and express my sincerest thanks to, my supervisory committee members, Drs. John J. Spinelli, Igor Burstyn, Kristan Aronson, and Angela Brooks-Wilson, who provided mentorship and guidance throughout my dissertation. I would also like to recognize the contributions of Drs. Jérôme Lavoué and Melissa Friesen, whose expertise in the area of occupational exposure assessment was vital in accomplishing the work I have done.
I would also like to acknowledge Agnes Lai, Zenaida Abanto, Magali Coustalin, and the staff at the BC Cancer Research Centre for everything they have done to help me in my research, and for their tireless work at the BC Cancer Agency.

Dedication

To my parents, without their support I would not be where I am now. To my friends, who have helped me through the good and bad and if it were not for them I would have finished this degree significantly quicker. To my wife, you are my best friend and my partner in crime. Thank you for being there for me and putting up with my foolishness; you literally are my better half.

1 Introduction

1.1 Overview of Polycyclic Aromatic Hydrocarbons

Polycyclic aromatic hydrocarbons (PAHs) are a group of chemical compounds that are characterized by their structure, and are composed of two or more fused aromatic (benzene) rings with no heteroatoms, i.e. elements that are not carbon or hydrogen.(1) The general structure of a PAH allows the formation of isomers, resulting in hundreds of different PAHs, but those with five or six rings are the most common.(2) PAHs form during the incomplete combustion of organic products (e.g. coal, oil, gas, and wood), so they typically occur as a complex mixture of hundreds to thousands of different PAHs rather than as a single compound.(3) PAHs are among the most widespread organic pollutants,(3) and exposure occurs through multiple sources, including (but not limited to) certain jobs, diet, cooking methods (e.g. cooking or heating appliances that use gas or other petroleum-based fuels), contaminated drinking water, air pollution, and smoking.(4-9)

The sixteen PAHs most commonly profiled by the Environmental Protection Agency(10, 11) are listed in Table 1-1; they were chosen because they were suspected to be more harmful than some of the other PAHs.
Acenaphthene, naphthalene, and fluoranthene were part of a list of 65 toxic pollutants from an EPA research report;(12) acenaphthylene, fluorene, and phenanthrene were part of a report on suspected carcinogens in water supplies;(13) and the remainder met previous priority selection criteria.(11) In general, these PAHs exhibit harmful effects that are representative of PAHs as a group, and individuals have a high likelihood of exposure to these particular PAHs.(3, 4, 14)

Table 1-1 presents physical and chemical properties of typical PAHs. The profiled PAHs, which have a molecular mass that ranges between 128–278 g·mol-1, are typically solid at room temperature and have high melting (range: 80–278 °C) and boiling points (range: 218–550 °C).(3, 15) Importantly, PAHs can vary between solid and gas phase depending on the number of rings, vapour pressure, and ambient temperature. For example, three to five-ring PAHs have high vapour pressure that gives them volatile properties,(16) which can affect the route and extent of exposure depending on whether they are in a solid or gas phase. Furthermore, the fat-solubility of PAHs increases with each additional fused ring in their structure; this property can influence the ability of PAHs to accumulate in lipid-rich tissues and organs, which can have direct impacts on the health-related effects of PAH exposure.

PAHs are generally airborne and can be measured through air samples by collecting particulate matter on filters (e.g. glass fiber filters). Skin sampling for dermal exposure (e.g. using wipes) is a complementary measure, given the importance of dermal uptake of PAHs in workplace exposure.(17) PAHs trapped by particle filters are treated in a solvent (e.g. benzene) that dissolves the filter, resulting in the particulate matter being suspended in the solvent, which is then extracted, dried, and weighed. In conjunction with collecting particles, sampling methods can also incorporate the use of sorbent tubes (e.g.
silica, charcoal, XAD2, Chromosorb 106) to collect gaseous emissions when sampling semi-volatiles, to ensure that both gas-phase and particle-phase PAHs are captured. The PAH components are analyzed by high-pressure liquid chromatography, after which a fluorescence or ultraviolet detector can be used to determine the presence of selected PAHs.(18) Coal tar pitch volatiles (CTPV), quantified as the benzene-soluble fraction of total dust, are a surrogate commonly used to characterize PAH exposure,(19) as CTPV are a mixture of multiple PAHs composed of fused hydrocarbons that volatilize from the distillation residues of coal, petroleum, wood, and other organic matter. Exposure levels can also be calculated from individual PAHs, or by summing several PAHs to estimate “total” PAH exposure during sampling; however, this can be time consuming. The proportions of the components in the mixture are complex and depend on several factors, including the temperature of the heating process. As such, the constituents of CTPV are not specific to any particular PAH, although heavier PAHs like phenanthrene, anthracene, pyrene, chrysene, and benzo[a]pyrene are common components.(18) Designating particles trapped by filters and extracted by a solvent as CTPV depends on the presumption that PAH contamination is present.

1.1.1 Exposure to PAHs
Exposure to PAHs can be categorized as either environmental (i.e. inside and outside the home) or occupational (i.e. encountered in the place of employment). Although environmental exposure to PAHs is ubiquitous, the exposure level and intensity from environmental sources are generally lower compared to occupational settings.(14, 19) Many factors, including environment, occupation, and our eating and cooking habits, contribute to the likelihood and total level of exposure.

Environmental PAH exposure
Exposure occurs through multiple routes, including air contaminated by natural (e.g.
forest fires) and man-made sources (e.g. vehicle emissions).(3) Exposure to PAHs through contaminated water occurs through several avenues, including atmospheric forms of PAHs that are removed from the air through precipitation, waste (e.g. discharge from waste water treatment plants), and improper disposal of petroleum-based products. Removal of PAHs from the atmosphere through dry deposition and precipitation can also contaminate soil and sediment with PAHs.(3) Although emission of PAHs can occur naturally, synthetic sources are currently the dominant source,(20-23) with residential burning of wood being the largest single source of PAH emission into the general environment.(3) Some other common synthetic sources of human exposure include inhalation of tobacco smoke,(24-26) as well as cooking or heating appliances that use gas and other petroleum-based fuels.(27, 28) Although the content of PAHs in food varies depending on the food product and on the method, temperature, and duration of cooking,(14, 29) cooked foods can be a major source of PAH exposure, as well as a source of exposure to aromatic amines and heterocyclic aromatic amines.(3, 30) Exposure to PAHs through cooked meat can even exceed exposure levels from smoking.(30) As such, the online database CHARRED (version 1.7) was developed by the National Cancer Institute to compute PAH values from meat intake based on the type of meat, method used, and frequency of consumption.(9)

Occupational PAH exposure
Occupational PAH exposure occurs during the manufacturing of various products (e.g. pharmaceuticals, carbon electrodes, and processed foods), either as part of the manufacturing process or from the equipment used. The route of exposure during manufacturing is usually through either inhalation or dermal contact.(3, 31) For example, employees in petroleum refineries can experience extremely high levels of PAH exposure from various steps involved in the refinement process (e.g.
distillation, bitumen processing and loading, coking, and wastewater treatment).(31, 32) In one study involving 9 refineries in the United States, air samples from coking units showed total PAH concentrations of 10 µg·m-3, and those from bitumen processing units ranged between 1–40 µg·m-3.(33) By comparison, the estimated average air pollution concentration for PAHs during the month of June in Detroit, MI, USA, which is heavily involved in the automobile industry, was orders of magnitude lower, around 0.035 µg·m-3 in 2009.(34) Similarly, the annual total PAH concentration for London (UK) was estimated at 0.166 µg·m-3 in 1991.(35)

Occupational PAH exposure occurs in many industries and jobs, with the level and intensity of exposure experienced by workers being heterogeneous. The types of industry prone to PAH exposure include aluminum smelting, coal gasification and coke production, as well as iron and steel foundries.(3, 17 265, 31) Industries where specific materials are handled (e.g. coal tar-related products) and tasks performed (e.g. working with diesel engines) also entail a risk of PAH exposure.

Aluminum Production
The PAHs in aluminum production arise from the evaporation of carbon electrode material when carbon (e.g. Söderberg or graphite/prebake) anodes are made of coke and coal tar pitch. PAH exposure in Söderberg plants has been measured as high as 1,000–10,000 µg·m-3 (i.e.
orders of magnitude higher than the levels seen in other settings).(21, 36-38) Even in prebake plants, where exposures are lower, exposure typically ranges between 100–1,000 µg·m-3.(21, 36-38)

Coal gasification and coke production
PAH exposure occurs through the coal gasification process, which is a destructive distillation process that produces naphtha (coal tar) and coke as byproducts.(21, 39) Another byproduct is the release of a PAH called benzo[a]pyrene (BaP), which is one of the most commonly measured and studied PAHs.(40) Typical levels of BaP in the air around the distillation ovens are below 10 µg·m-3,(20, 41, 42) but operations that include coke production have seen values as high as 32 µg·m-3.(43) One study in particular that examined a London retort house, which is a place where gas was manufactured by heating coal in the absence of air, measured BaP concentrations in excess of 200 µg·m-3,(42) about 10,000 times the average level of BaP in the city of London during that time period. However, modern processes of gasification have lowered the level of PAH exposure.(21)

Iron and Steel Foundries
The thermal decomposition of organic materials used in molds in the iron and steel industries acts as a source of PAH exposure. Similar to aluminum smelting, the range of BaP levels varies among foundries. One study from Ontario that investigated 10 foundries observed a maximum BaP concentration of 12.06 µg·m-3 and other variants (e.g. anthracene and fluoranthene) that ranged from non-detectable to 26.59 µg·m-3.(44) Although decomposition causes airborne emission of PAHs, major contributors to PAH exposure are the raw material, organic binders (e.g. coal powder), and other organic additives. Schimberg et al.
identified that foundries that used coal tar pitch as organic binders had substantially higher BaP levels, with measurements peaking at 57.5 µg·m-3.(45)

Occupational exposure to diesel and gasoline engine exhaust
Exposure to PAHs from diesel and gasoline engine exhaust in occupational and environmental settings can occur through the particulate phase of the exhaust, in which PAHs attach to respirable dust particles.(46) Diesel engine exhaust has its own uniquely identifiable PAHs that are characterized by nitro- and dinitro-pyrenes.(47) Similar to other sources, diesel and gasoline exhaust emissions also contain naphthalene, phenanthrene, and BaP.(48) BaP exposure levels for garage workers who worked on buses can range from 0.001–0.020 µg·m-3.(20, 21, 23) Similar sources of BaP exposure also occur in automobile repair shops,(20, 22) although BaP levels can be slightly higher (range: 0.004–0.090 µg·m-3) due to decomposing clutch and brake linings from overheating.(49) Although BaP is associated with diesel combustion, emissions from diesel engines are on the same order of magnitude as gasoline engines with catalytic converters.(50) Driving habits and conditions can also influence emission levels, with one study observing BaP concentrations from engine emissions varying between 0.03–3.23 µg·m-3 depending on quality of fuel.(51) Railroad workers are another group at risk of PAH exposure from diesel emission. In a U.S. study, BaP levels from respirable diesel exhaust exposure in the rail industry have been estimated (footnote 1) between 0.02–0.72 µg·m-3.(52) Although drivers (i.e.
transport and delivery) and rail workers are at the highest risk of exposure, exposure levels are much lower than in the previously mentioned industries.(53-55)

Footnote 1: In the work done by Speizer et al., the respirable diesel exhaust exposure in the rail industry was estimated between 4–103 µg·m-3, which would equate to a BaP level of 0.02–0.72 µg·m-3 based on the Environmental Protection Agency estimate that BaP makes up 0.5–0.7% of total diesel particulate mass.(48)

Exposure from materials handled
The use of coal tar, pitch, asphalt, creosote, soot, and anthracene oil is widespread, and these materials are found in the manufacturing of various fuels, dyes, plastics, paints, insulating and building materials, and even in inks and brushes.(19) Roofers, road pavers and asphalt workers, and those in similar trades are at high risk for PAH exposure from fumes during the heating of coal tar and bitumen,(56-58) the latter being another type of pitch that forms from petroleum,(59) with roofers being exposed to PAH levels as high as 1,000–100,000 µg·m-3.(60, 61) Among manual road pavers, BaP exposure (in tar-free applications) can be viewed as uniform,(62) with low levels of exposure being observed in the range of 5.8–8 µg·m-3.(58, 63) The concentration of BaP is expected to increase significantly when asphalt-tar mixtures are heated,(64) with studies observing levels between 5–350 µg·m-3.(62, 65) Workers involved in the forestry industry, particularly those involved in manufacturing and preparing/treating wood for commercial purposes, are at risk of PAH exposure.
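The conversion in the diesel-exhaust footnote earlier in this section (respirable diesel exhaust of 4–103 µg·m-3 and a BaP share of 0.5–0.7% of diesel particulate mass) can be checked with a few lines of Python. This is only a minimal sketch of the arithmetic; the function name `bap_range` is a hypothetical helper, and pairing the low exhaust level with the low BaP fraction (and high with high) is an assumption about how the bounds were combined.

```python
def bap_range(exhaust_low, exhaust_high, frac_low, frac_high):
    """Return (low, high) BaP concentration in ug/m3.

    Assumes the lower bound pairs the lowest exhaust level with the
    lowest BaP mass fraction, and the upper bound pairs the highest
    with the highest.
    """
    return exhaust_low * frac_low, exhaust_high * frac_high


# Values quoted in the footnote: 4-103 ug/m3 exhaust, 0.5-0.7% BaP by mass.
low, high = bap_range(4.0, 103.0, 0.005, 0.007)
print(f"BaP: {low:.2f}-{high:.2f} ug/m3")
```

Under that pairing assumption the result rounds to the 0.02–0.72 µg·m-3 range reported in the text.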
Creosote, which is one of the main distillate fractions of coal tars and coal tar pitches, is used as a wood preservative, and at least 75% of creosote’s composition is a combination of various PAHs.(66) Exposure to PAHs from the heating of creosote-related material is considerably lower than for other PAH-related materials, with one study observing particulate BaP concentrations around 0.5 µg·m-3.(67) However, similar to the other industries mentioned, PAH concentration and exposure within the same workplace and among workers is heterogeneous, especially when considering the different materials handled.

Exposure from tasks performed outside of heavily industrialized occupations
Workers in industries associated with the burning of fuels are at risk of exposure because incomplete combustion of fuels leads to the formation of PAHs. Exposure occurs regularly in individuals whose work involves furnaces and incinerators, which includes industries like waste management services. Furnaces can release BaP and other PAHs during the incineration process, with concentration levels ranging from 0.042–17.2 µg·m-3 depending on the material.(68-70) Burning garden waste also poses a risk of exposure to PAHs from the smoke aerosols emitted during the operation. In a Malaysian study, samples of smoke aerosol from burning garden waste showed BaP concentrations peaking at 2.7 µg·m-3.(71)

One of the largest groups exposed to PAHs in Canada comprises restaurant workers, cooks, and those involved in food service.(72) PAHs can be formed during the curing and processing of raw food prior to cooking, as well as during the cooking process (e.g. grilling, smoking, or oil fumes).(27, 73) Burning firewood as fuel, as in restaurants (e.g. pizzerias that have wood ovens) and other facilities, can also lead to PAH exposure (range: 1.3–110 µg·m-3) depending on the type of wood fuel (e.g.
dry birch, wood waste, etc.).(74, 75) Cooking oils have been reported to contain PAHs prior to heating, and various levels of PAH exposure can occur during and after heating from oil fumes, depending on the type of oil used (e.g. vegetable and soybean oil).(73) Similar heterogeneous results were found when assessing PAH exposure based on the cooking process, with the magnitude of exposure dependent on the method and fuel source.(28, 76-78) PAH exposure can also vary depending on the work environment, as levels can vary heavily when fume extractors are involved. One study examining fumes from cooking oils found that BaP levels can increase by as much as three-fold (range: 2.1–2.7 µg·m-3) in the absence of a fume extractor; however, this level typically drops by as much as 75% when extractors are used,(79) reducing the concentration levels to previously reported levels.

Gender differences in occupational exposure to PAH
The likelihood of exposure to PAH, as well as the level of exposure, may be different between male and female employees, as industries where women work can vary from those of men, depending on country and era. In the United States, current estimates(80) indicate that males represent 92% of all metal furnace operators, tenders, pourers, and casters, 85% of furnace, kiln, oven, drier, and kettle operators and tenders, and more than 75% of helpers (production workers). Similarly, in diesel and gasoline related industries, males represent 99% of bus and truck mechanics and diesel engine specialists; however, females represent 48% of bus drivers and 24% of automotive and watercraft service attendants. In the food industry, females represent 89% of hosts and hostesses, 73% of waiters and waitresses, 67% of combined food preparation and serving workers (i.e. fast food workers), and 59% of both first-line supervisors of food preparation and serving and food preparation workers.
Given the difference in distribution, there may be an effect of occupational PAH exposure on women’s health that has not been described, since the majority of studies have been in male-dominant industries.

1.1.2 PAH Metabolism and Formation of Reactive Intermediates
Multiple PAHs have been suggested as risk factors for several types of cancer.(3, 21, 37, 38, 47, 81-87) PAHs themselves are not carcinogenic; instead, it is their reactive metabolic intermediates that initiate carcinogenesis.(4, 14) PAHs are modified through several metabolic pathways and will eventually produce an excretable (non-toxic), water-soluble product.(88, 89) Although PAHs are initially inert, metabolic enzymes bioactivate them into intermediate metabolites that can react with a variety of sub-cellular components, including DNA.(88, 89) The rate at which these intermediates form will differ by tissue due to differences in enzyme activities.(88, 89)

There is evidence of three mechanistic pathways in the metabolic activation of carcinogenic polycyclic aromatic hydrocarbons that produce reactive intermediates: dihydrodiol-epoxide (diol-epoxide), radical cations, and o-quinones.(2, 90-94) All three intermediates can produce carcinogenic effects through their interaction with DNA to form PAH-DNA-adducts. The formation of these adducts is crucial in the initiation of carcinogenesis as they are precursors for replication errors that can lead to mutations and the development of cancer.(95-98) Figure 1-1, adapted from Palackal et al.,(99) displays three intermediate metabolites formed during BaP metabolism.

The diol-epoxide pathway is the prominent pathway in PAH metabolism, and each PAH undergoes similar processes during biotransformation because of structural similarities.(2, 90) BaP is transformed into the proximate carcinogen benzo[a]pyrene-7,8-dihydrodiol (BaP-7,8-Diol) by cytochrome P450 (CYP450) and epoxide hydrolase (EH).
These diols can then undergo a second biotransformation to produce benzo[a]pyrene-7,8-diol-9,10-epoxide (BaPDE), the ultimate carcinogen, which is capable of forming covalent adducts with DNA.(4, 100) The DNA adducts are formed because the carbons in the epoxide group of diol-epoxides are reactive electrophiles, due in large part to the fact that substantial ring strain is relieved when the ring opens upon interaction with nucleophilic DNA.(2, 92, 101)

The secondary pathway that produces a proximate carcinogen involves a different CYP450 enzyme, CYP-peroxidase, which catalyzes the loss of an electron during BaP oxidation to produce a BaP radical cation.(91, 92) Similar to the diol-epoxide pathway, PAH radical cations react with nucleophiles to form depurinating adducts.(2, 92, 101)

The third pathway, which involves dihydrodiol dehydrogenase (DD), competes for the diol intermediates produced by CYP450 in the first stages of the diol-epoxide pathway. The enzymes DD and CYP1B1 compete for the diol intermediate BaP-7,8-Diol to produce a catechol, which is an unstable metabolite that autoxidizes to o-quinones.(93, 94) The o-quinones are believed to react with DNA to form adducts, as well as reactive oxygen species that can induce oxidative damage leading to DNA replication errors. For example, 8-oxo-dGuo is an oxidized deoxynucleoside formed by reactive oxygen species and can create mismatches in base pairs that lead to point mutations.(102)

These three mechanisms are not mutually exclusive, as the enzymes involved in PAH metabolism clearly have multiple roles, and all three intermediates may play a role in the initiation of carcinogenesis.
Although there are several factors that influence the likelihood of developing cancer, the established theory is that these adducts continuously interfere with DNA to distort its structure and cause errors during replication that eventually lead to cancer.(2)

1.2 Breast Cancer
Given the biological role the intermediates of PAH metabolism play in carcinogenesis, there may exist a mechanism whereby PAH exposure is also a risk factor for breast cancer. Due to the difference in distribution of female workers in occupations and industries previously studied for health-related effects of PAH exposure, we can expand on the literature of disease-related occupational risk in relation to breast cancer, especially in industries and occupations where women are the dominant workforce.

1.2.1 Descriptive Epidemiology
Breast cancer continues to be the most common cancer in Canadian women, with an estimated 25,000 women diagnosed in 2015, which represents more than a quarter of all cancer diagnoses, excluding non-melanoma skin cancer. An estimated 5,000 women died from breast cancer in 2015, making it the second leading cause of mortality from cancer among women in Canada.(103) Originating in the cells of the breast, breast cancer is usually categorized as being either in situ or invasive. Breast cancer classified as in situ refers to breast cancer that has not spread beyond the original site in the ducts or the lobules of the breast. Invasive, or infiltrating, cancers begin in the ducts or lobules and eventually invade the surrounding fatty tissue of the breast. The most common form of breast cancer in the U.S.
and Canada is invasive ductal carcinoma, also called infiltrating ductal carcinoma or ductal adenocarcinoma, and it arises in glandular cells.(103, 104) It accounts for almost 80% of all invasive breast cancers.(103) Both in situ and invasive breast cancers, which share similar etiologic factors,(105, 106) are morphologically similar to one another.(107) Most in situ breast cancers are identified through screening mammography and are treated by surgery.(108, 109) For invasive breast cancers, treatment and prognosis are affected by the tumor size, histologic type, nuclear grade, proliferation rate, hormone receptor status, and whether the cancer has metastasized.(110) Hormone receptor status is related to survival and response to treatment and therefore plays a large role in the treatment of breast cancer. The current evaluation of invasive breast cancers includes assessing estrogen receptor (ER) and progesterone receptor (PR) status, with positive subtypes generally having better prognoses and negative subtypes having poorer outcomes due to poor response to anti-hormone therapy.(107)

1.2.2 Established Risk Factors for Breast Cancer
The causes of breast cancer are not well understood, but there are established risk factors and it is agreed that the etiology is multifactorial (i.e. there is no single factor that is the sole contributor for breast cancer development) and can differ by subtype.(111) Reproductive factors that modify risk of breast cancer include menarche (i.e. the first menstrual period) and menopause. Early menarche has been consistently associated with increased breast cancer risk,(112) and later age at menopause (e.g. after 55 years of age) is related to an increased risk of breast cancer.(113) Women who are nulliparous also have a greater risk of breast cancer, but those who have their first child before the age of 35(114) and who breastfeed longer (e.g.
more than 12 months) have a decreased risk for breast cancer.(107, 115) Additional risk factors include diet, obesity, and sedentary lifestyle,(116-118) as well as alcohol consumption.(116, 119, 120) These lifestyle factors are usually classified as environmental factors, a group that also includes chemicals and complex mixtures, as well as physical and biological agents, with PAHs generally falling in the first category. The other class of risk factors comprises those that are genetic in nature, with two of the most well-known genetic risk factors being the BRCA1/2 mutations.(121-125) Also, women with a first-degree family member (e.g. parent, sibling, or child) who has had breast cancer are at an increased risk for breast cancer, which can be seen as another genetic factor.

1.2.3 Environmental Factors
The literature is inconsistent with respect to the effect of smoking on breast cancer,(126, 127) despite smoking being a potent source of PAH. More than 70 PAH compounds have been identified in cigarette smoke, and BaP concentrations in a room extremely polluted with cigarette smoke have been measured at 0.022 µg·m-3.(40) However, some studies suggest that long duration of smoking can result in an increased risk of breast cancer among women.(128, 129) Overall, the effect of PAH exposure on breast cancer risk is not clearly defined,(30, 130-132) but these carcinogens do have properties that could (in theory) impact breast cancer risk, including being fat-soluble.(3, 14) In vitro studies have shown that human breast epithelial tissue metabolizes PAHs into carcinogenic compounds (as shown in Figure 1-1).(133-136) Mammary tissues bioaccumulate PAHs due to the solubility of PAHs in lipid-rich tissues;(137) this allows the formation of adducts and other oxidative intermediates, which can react with DNA and contribute to the development of breast cancer.(4, 14)

1.2.4 Hormonal Factors and Their Relationship with PAH metabolism
Hormonal factors associated with increased breast cancer
risk include the reproductive factors mentioned above, and the prolonged use of some types of oral contraceptives and hormone replacement therapy.(116, 117) The mechanistic considerations behind these associations are as follows. Endogenous and exogenous forms of estrogen increase cellular proliferation in the breast, which increases the chances of random errors and mutations during cell division and, in turn, increases the risk of breast carcinogenesis.(138) Oral contraceptives, HRT, early menarche and late menopause imply a greater lifetime exposure to estrogen, and parous women, especially those who breastfed, undergo changes in endogenous estrogen levels during pregnancy.(107) Those considered sedentary are at risk of being obese, which is associated with increased adipose tissue, resulting in increased levels of estrogen.(139) Physical activity has the inverse effect, and has been observed to lower the amount of estrogen circulating in the body.(118) On the other hand, alcohol consumption has the opposite effect, increasing the amount of estrogen in the blood.(140) Surprisingly, PAH exposure also influences estrogen metabolism, and therefore may alter breast cancer risk. The introduction of PAH metabolites is theorized to trigger estrogenic(141) and antiestrogenic responses(142-144) through increased metabolism of estradiol.
The crosstalk between the two metabolic pathways arises because the AhR–ARNT protein, which is a heterodimer that regulates enzymes involved in PAH metabolism, affects the estrogen receptors ER-α and ER-β(141) and results in increased formation of quinones.(145, 146)

1.2.5 Genetic and Familial Susceptibility
Genetic and familial susceptibility also play a role in determining risk for breast cancer, as women with affected first-degree relatives have 1.5–3 times the risk of women without a family history,(107) although estimates are based on familial aggregation that can be attributed to both shared genes and environments. Genetic factors account for 5–10% of breast cancer cases,(147) and studying these factors plays an important role in understanding the etiology of breast cancer. Specific genes involved in hereditary breast carcinogenesis include BRCA1/2, ATM, PTEN, and TP53.(147) Approximately 2–5% of all breast cancer incidence is estimated to be attributable to BRCA1/2 mutations,(122, 123) and carriers of either mutation have an estimated risk of breast cancer greater than 50% by age 70.(124) Similar to the BRCA1/2 genes, women with a heterozygous mutation in the ATM gene, which occurs in approximately 1% of all women, have a 2–5 fold increase in breast cancer risk compared to the general public.(148, 149) Not surprisingly, mutations associated with PTEN and TP53, which are both involved in tumor suppression, also increase a woman’s risk for breast cancer.(147, 150, 151)

1.3 Other Genetic Variations and Metabolism
Genetic variations are deviations in the DNA sequence within the population brought about by random mutations in the DNA sequences. These genetic variations may modify the risk of developing a disease or may be functionally neutral. The simplest variation is a single nucleotide polymorphism (SNP) and involves the alteration of a single nucleotide (i.e. A, T, C, or G) in the DNA sequence.
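To make the idea of a single-nucleotide change concrete, the following minimal Python sketch translates a codon before and after a one-base substitution using the standard genetic code, and reports whether the encoded amino acid stays the same (a synonymous change) or differs (a nonsynonymous change). The function name `classify_snp` and the example codons are illustrative assumptions, not variants analyzed in this thesis.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon, alt_codon):
    """Classify a one-base substitution within a codon (hypothetical helper)."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    differences = sum(a != b for a, b in zip(ref_codon, alt_codon))
    if len(ref_codon) != 3 or len(alt_codon) != 3 or differences != 1:
        raise ValueError("codons must be 3 bases long and differ at exactly one position")
    same_amino_acid = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same_amino_acid else "nonsynonymous"


# Illustrative codons only: GAA and GAG both encode glutamate (E),
# whereas GTG encodes valine (V).
print(classify_snp("GAA", "GAG"))
print(classify_snp("GAG", "GTG"))
```

The sketch covers coding-region substitutions only; it says nothing about variants in non-coding regions, which can still matter through effects on transcription.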
The impact of this change can vary depending on whether it is in a non-coding or coding region, although variants in non-coding regions can still influence protein production by affecting transcription. If the SNP is in the coding region, then its influence depends on whether it is synonymous or nonsynonymous; in particular, if the variant is nonsynonymous it will change the amino acid encoded, which in turn could affect the enzymatic function of the protein.

Although many polymorphisms have low penetrance, and therefore only occasionally produce symptoms or traits of the disease, nonsynonymous SNPs can have substantial impacts. CYP1B1 is involved in metabolizing xenobiotic compounds, including PAHs, as well as endogenous compounds such as estradiol. Two polymorphisms for CYP1B1 (rs1056827 and rs1800440) have been observed to have higher enzymatic activity than the wild-type during 4-hydroxylation of estradiol (estrogen),(152, 153) which can produce free radicals during estradiol metabolism that can cause cellular damage.(154, 155) As part of the CYP450 superfamily, CYP1B1 also has a role in PAH metabolism and is directly involved in the initiation of PAH metabolism and the formation of reactive epoxide intermediates(156) mentioned in Section 1.1.2. Animal studies involving knockout mouse models observed that several CYP-enzymes, including CYP1A1 and 1B1, have different responses to PAH exposure in comparison to wild-type CYP+/+ mice.(157, 158) The observed differential expression of CYP1-genes, and the greater BaP-DNA adduct formation in comparison to wild-type mice, imply potential variations in (human) enzymatic activity during the various steps of PAH-metabolism (see Figure 1-1) that may result in differences in the bioaccumulation of the reactive intermediates.
The impact of these variants, especially in two interlinked metabolic pathways, may have significant effects on breast cancer risk through their influence on estrogen levels or formation of DNA-adducts.

1.4 Gene-Environment Interactions
It is accepted that the etiology of most common diseases, including cancer, involves not only genetic and environmental causes, but also the interaction between the two.(159) The concept of gene-environment interaction is not novel: it has been argued for decades that environmental factors influence cancer risk through “the genetic mechanism of mutation”.(160) That is, variations in certain genotypes can modify the effect of exposures, which render some individuals more or less likely to develop a disease after exposure(161) and account for the excess number of new cases and increased risk for certain individuals.

Recent publications on genetic associations, as captured by the HuGE Navigator,(162) indicate that of the 5,334 articles relating to breast cancer, slightly more than 1% examined gene-environment interactions (GxEs) (footnote 2).

Footnote 2: The HuGE Literature Finder, sponsored by the human genome epidemiology network, is an online curated and searchable knowledge base.(163) The database includes publications since 2001 and, at the time of access, contained 121,553 articles. Search queries were done using the terms [gene environment interaction], [gene environment interactions], and [GxE], which yielded a total of 1,406 publications (after removing duplicates). Query for [breast cancer] yielded 5,334 articles, of which 65 overlapped with the interaction query.

It is clear that GxEs have still not been thoroughly investigated in the area of breast cancer,(164) in part perhaps because prior research on genetic susceptibility in breast cancer has focused on high penetrance genes (e.g. BRCA1/2) that do not have a clear environmental component.
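As a purely illustrative sketch of what it means for a genotype to modify the effect of an exposure, the Python snippet below computes the exposure odds ratio separately within two genotype strata of a hypothetical case-control table, then takes the ratio of the two stratum-specific odds ratios as a simple multiplicative interaction measure. All counts, stratum labels, and the helper `odds_ratio` are invented for illustration and are not data or methods from this study.

```python
def odds_ratio(exp_cases, exp_controls, unexp_cases, unexp_controls):
    """Odds ratio for exposure from a 2x2 case-control table."""
    return (exp_cases * unexp_controls) / (exp_controls * unexp_cases)


# HYPOTHETICAL counts, chosen only to make the arithmetic easy to follow.
counts = {
    "variant": {"exp_cases": 40, "exp_controls": 20,
                "unexp_cases": 60, "unexp_controls": 90},
    "wild_type": {"exp_cases": 30, "exp_controls": 30,
                  "unexp_cases": 100, "unexp_controls": 100},
}

# Exposure odds ratio within each genotype stratum.
stratum_ors = {genotype: odds_ratio(**table) for genotype, table in counts.items()}

# Ratio of stratum-specific odds ratios: a value of 1 would indicate
# no interaction on the multiplicative scale.
interaction_ratio = stratum_ors["variant"] / stratum_ors["wild_type"]

for genotype, value in stratum_ors.items():
    print(f"{genotype}: OR = {value:.2f}")
print(f"ratio of ORs = {interaction_ratio:.2f}")
```

In these made-up numbers the exposure shows no association among wild-type carriers but a three-fold odds ratio among variant carriers, so an analysis that pooled genotypes would report a diluted exposure effect.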
Although these types of genes can confer as high as an 80% lifetime risk of developing a disease,(165, 166) they are also rare and account for less than 5% of breast cancer incidence.(167) Common low penetrance mutations may not have a substantial impact on disease risk per se; however, through joint action with environmental exposures certain genotypes may have larger effects on disease risk (i.e. a main gene effect only in the presence of exposure).

Interactions can obscure both environmental effects, which may be evident only in genetically susceptible people, and genetic effects, which may be evident only in those who are exposed. Furthermore, the consequences of ignoring an interaction are exacerbated when estimating the disease-exposure association. Statistical models often assume that interaction effects do not exist, and failure to account for potential interactions may bias the estimates of both exposure-risk and genetic-risk because the interaction effects are absorbed into the estimated effects for the environmental and heritable components.(168) Understanding interactions between environmental risk factors and genes that may modify a person’s breast cancer risk is important for understanding the biological mechanisms, pathways, and overall etiology of breast cancer.

1.5 Research Purpose

We hypothesize that occupational PAH exposure increases the risk of breast cancer; genes involved in xenobiotic-metabolism, which are also involved in metabolizing PAHs, are associated with breast cancer; and variants of metabolizing genes involved in the bioactivation of PAH into their toxic (or detoxified) intermediates can modify PAH-mediated breast cancer risk.

The study will pursue the following objectives needed to test our hypotheses:
1. To develop a method of assessing occupational PAH exposure in (a sample drawn from) the general population,
2. To investigate the association between occupational PAH exposure and breast cancer risk,
3. To estimate the degree of association between genes involved with PAH metabolism and breast cancer risk, and
4. To determine whether associations (if any) between occupational PAH exposure and breast cancer risk are modified by genetic susceptibilities.

1.6 Tables

Table 1-1: Sixteen polycyclic aromatic hydrocarbons deemed to be priority pollutants by the US Environmental Protection Agency

| Compound | No. of rings | Molecular mass (g·mol-1) | Melting point (°C) | Boiling point (°C) | Water solubility (mg·L-1) |
|---|---|---|---|---|---|
| Acenaphthene | 3 | 154.2 | 95 | 279 | 0.40 |
| Acenaphthylene | 3 | 152.2 | 92–93 | 265–275 | 3.93 |
| Anthracene | 3 | 178.2 | 218 | 342 | 0.08 |
| Benz[a]anthracene | 4 | 228.3 | 162 | 435 | 0.01 |
| Benzo[a]pyrene | 5 | 252.3 | 178 | 310–312 | <0.01 |
| Benzo[g,h,i]perylene | 6 | 276.3 | 278 | 550 | <0.01 |
| Benzo[b]fluoranthene | 5 | 252.3 | 168 | 481 | <0.01 |
| Benzo[k]fluoranthene | 5 | 252.3 | 216 | 480 | <0.01 |
| Chrysene | 4 | 228.3 | 255–256 | 448 | <0.01 |
| Dibenz[a,h]anthracene | 5 | 278.3 | 267 | 524 | <0.01 |
| Fluoranthene | 4 | 202.3 | 111 | 384 | 0.20 |
| Fluorene | 3 | 166.2 | 116–117 | 295 | 1.68–1.98 |
| Indeno[1,2,3-c,d]pyrene | 6 | 276.3 | 164 | 530 | 0.06 |
| Naphthalene | 2 | 128.2 | 80 | 218 | 31.6 |
| Phenanthrene | 3 | 178.2 | 100 | 340 | 1.20 |
| Pyrene | 4 | 202.3 | 150 | 404 | 0.08 |

Chemical and physical properties are extracted from the Toxicological Profile for Polycyclic Aromatic Hydrocarbons by the U.S.
Department of Health and Human Services, Public Health Service(3) with minor corrections based on the NCBI PubChem Open Chemistry Database.(15)

1.7 Figures

Figure 1-1: PAH metabolism of benzo[a]pyrene (BaP) and the formation of carcinogenic intermediate

2 Estimating Polycyclic Aromatic Hydrocarbon Exposure

Exposure to PAHs can occur through several sources, including diet, contaminated drinking water, and ambient air pollution.(4) Several PAH compounds are classified by the International Agency for Research on Cancer as carcinogenic to humans (Group 1).(4) The highest levels of exposure occur occupationally in hundreds of industries, including highly specialized work such as aluminum smelting or extracting and processing crude oil or coal. Lower, but more frequent, exposure can occur in more common industries such as food-service and automobile repair.(19) In Canada, an estimated 307,000 workers are occupationally exposed to PAHs, with the largest group of exposed workers being chefs and cooks.(72)

Currently there is no accessible Canadian source for information on PAH exposure; however, in the United States there are several sources that are maintained by the Occupational Safety and Health Administration (OSHA), including the Integrated Management Information System (IMIS). Occupational exposure databanks (OEDBs) have been used as sources of data for exposure surveillance or occupational epidemiology.(169-171) Several authors have used IMIS data, which are described elsewhere,(172) to study a variety of occupational exposures, including beryllium(173, 174) and formaldehyde.(175) In this study we obtained access to two OSHA databanks. The first databank, referred to as the IMIS databank, contains approximately 2 million records collected from OSHA’s Salt Lake City Technical Center between 1979 and 2010.
The second databank, referred to as the Chemical Exposure Health Data (CEHD), was first described in detail by Lavoue et al.(172) and contains analytical results from samples sent by OSHA inspectors to the Salt Lake City Technical Center between 1984 and 2009. The IMIS and CEHD databanks are related OEDBs that contain partially overlapping data on PAH exposure across the United States. Most studies utilizing OEDBs have underlined their limitations and issues relating to the external validity of non-random selection of industries representing the general working population.(176) However, the databanks constitute invaluable sources of exposure data, provided that adequate efforts are put into better characterizing their limitations.(171, 172, 176) Furthermore, both the IMIS and CEHD databanks represent the largest single sources of occupational exposure data in the US and therefore their utility in assessing occupational exposures must be tested, including their applications to epidemiology. Similar prevalence of PAH exposure(72) across industries within the US and Canada suggests that OSHA data can be relevant for exposure assessment in Canada.

The objective of this study was to develop a predictive probability model to identify industries and occupations that have a high risk of exceeding OSHA’s permissible exposure level for PAHs (PEL = 0.2 mg·m-3),(19) which is the same value as the Threshold Limit Value set by the American Conference of Governmental Industrial Hygienists.(177) The proposed model predicts the exceedance fraction (EF), i.e. the probability of exceeding OSHA’s PEL, to determine an individual’s likelihood of being exposed to excess levels of PAHs through his or her workplace history based on industry and occupation. Each job estimate, along with duration, can be used to approximate the cumulative exposure to PAHs.
Predictive models can be used in epidemiological studies to help assess levels of exposure, and similar models have been developed previously in studies of beryllium, formaldehyde, and wood-dust.(173, 175, 178) This model was developed in the context of a population-based case-control study of over 2,000 breast cancer patients in Canada that, as one of its objectives, aims to assess breast cancer risk due to occupational exposure to PAHs.

2.1 Occupational Safety and Health Administration Databanks

OSHA provided two workplace compliance testing databanks that collected data across the US between 1979 and 2010. The IMIS databank was accessed through a Freedom of Information Act request. Each record included information about the company inspected (e.g. name, address, and number of employees), and information on the job title monitored was provided in free-text format. Date, sample number, chemical exposure, sample type (e.g. air particulate sample), exposure type (e.g. time-weighted average), and sampled exposure level were also recorded. The CEHD databank was accessed online.(179) The CEHD contained analytical results from samples sent to OSHA’s Salt Lake City Technical Center while the IMIS databank contained the inspectors’ interpretation of the lab results; hence the exposure data for CEHD were recorded as the raw duration of exposure monitored while IMIS samples were recorded as time-weighted averages (TWAs). The CEHD databank did not have any information regarding job title or occupation. Industries in both databanks were identified by 1987 Standard Industrial Classification codes.(180)

2.1.1 Data Preparation and Definition of Exposure Limits

Data for the analyses were restricted to personal measurements, which represented approximately 90% and 77% of the IMIS and CEHD databanks, respectively.
Exposure to PAHs was measured by collecting multiple compounds, including benzo[a]pyrene, anthracene, naphthalene, and coal tar pitch volatiles.(19) To perform the analyses, we restricted the outcome measure to air samples of coal tar pitch volatiles (CTPV). Samples were collected by drawing known amounts of air through cassettes containing glass fiber filters; OSHA’s recommended air volume and sampling rate are 960 L at 2.0 L·min-1.(18) The filters were analyzed by extraction with benzene and gravimetrically determining the benzene-soluble fraction. In cases where the benzene-soluble fraction exceeded the PEL, the sample was analyzed by high performance liquid chromatography with a fluorescence or ultraviolet detector to determine the presence of CTPV. CTPV quantified as the benzene-soluble fraction of total dust is a surrogate commonly used to characterize PAH exposure.(19) Exposure type was restricted to measurements identified as either TWA or non-detects; there were no short-term CTPV measurements. Variables that could identify companies were not utilized and the only identifier included in the analyses was the compliance inspection number. The inspection number is a unique 9-digit identifier tied to the particular inspection being done by a particular inspector. Although an inspector may perform multiple inspections over the course of his/her employment, the inspection number is unique to the particular site on that particular day. Only one inspector would typically visit a site and perform an inspection; there were no indicators of whether multiple inspectors were involved in any inspection. The analytical method for measuring PAHs was only available in the CEHD databank and therefore was not utilized in the analyses.

The IMIS databank contained 1,579 TWA personal measurements. The CEHD databank contained multiple personal measurements for each individual, which were used to calculate a total of 1,341 TWA measurements.
Aggregating the databanks yielded 2,920 measurements, with 822 overlapping measurements, i.e. 411 pairs, between the two databanks. Despite both sources being OSHA databanks, 137 of the 411 pairs were discordant in their reported concentrations. The 274 concordant pairs used equation (2.1):

\[ \mathrm{TWA} = \frac{\sum_i C_i t_i}{\sum_i t_i} \qquad (2.1) \]

where C_i represents the measured exposure level at time i and t_i represents the length of time sampled at time i. Half of the 137 discordant pairs used equation (2.1) while the other half used equation (2.2):

\[ \mathrm{TWA} = \frac{\sum_i C_i t_i}{480} \qquad (2.2) \]

where the denominator represents the number of minutes in a standard work day. Equation (2.1) assumes the exposure levels at periods when sampling did not occur were the same levels as during the time of sampling, i.e. time-invariant exposure, and equation (2.2) assumes that the exposure levels at periods when sampling did not occur were zero, i.e. time-variant exposure. In cases where measurements were discordant we defaulted to the IMIS measurement calculated by the inspector since there was no reason to question his or her choice of the equation. A total of 2,509 TWA personal measurements representing 756 companies across 45 states were used for the analyses. Figure 2-1 summarizes the reasons for exclusion of PAH measurements from the two databanks. Empirical cumulative distribution functions were calculated to describe the distribution of detectable CTPV concentrations in the two databanks and the associated subgroups, which can be seen in Figure 2-2.

The TWA measurements were dichotomized into a binary variable indicating whether or not the measurements exceeded the PEL. Although dichotomizing the data can result in loss of power,(181) more than 25% of the sample measurements were censored, and by dichotomizing the sample measurements the censored values, otherwise known as non-detects (NDs), were classified as below the PEL.
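The two TWA conventions and the dichotomization step can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, a non-detect is represented here by `None`, and the 480-minute work day and the PEL of 0.2 mg·m-3 are taken from the text; it is not the code used for the analyses.

```python
from typing import Optional, Sequence

PEL_MG_M3 = 0.2        # OSHA permissible exposure limit for CTPV (from the text)
WORKDAY_MINUTES = 480  # minutes in a standard work day, denominator of (2.2)


def twa_sampled_time(conc: Sequence[float], minutes: Sequence[float]) -> float:
    """Equation (2.1): divide by the total time actually sampled."""
    return sum(c * t for c, t in zip(conc, minutes)) / sum(minutes)


def twa_full_shift(conc: Sequence[float], minutes: Sequence[float]) -> float:
    """Equation (2.2): divide by a full work day, so unsampled time counts as zero."""
    return sum(c * t for c, t in zip(conc, minutes)) / WORKDAY_MINUTES


def exceeds_pel(twa: Optional[float], pel: float = PEL_MG_M3) -> int:
    """Binary outcome; a non-detect (None, an assumed encoding) is below the PEL."""
    return 0 if twa is None else int(twa > pel)


def exceedance_fraction(twas: Sequence[Optional[float]]) -> float:
    """Proportion of measurements (non-detects included) that exceed the PEL."""
    flags = [exceeds_pel(v) for v in twas]
    return sum(flags) / len(flags)


# Hypothetical sample: two 120-minute filters from one worker.
conc, minutes = [0.30, 0.50], [120, 120]
print(twa_sampled_time(conc, minutes))  # larger value: unsampled time ignored
print(twa_full_shift(conc, minutes))    # smaller value: unsampled time set to zero
```

When less than a full shift is sampled, equation (2.2) always gives the smaller value, which is one way the same laboratory results can yield discordant TWA values in the two databanks.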
This negated the need for a model to impute NDs or for a complete case analysis using only detectable results, both of which pose a risk of inducing bias(182) and require sensitivity analyses that are heavily dependent on certain assumptions about the censored exposure levels.(175) A binary variable may also be better than a single summary value, e.g. group mean, to represent individuals in similarly exposed groups because measurement errors bias the sample values.(183) Using the PEL as our exposure threshold simplified the analyses, allowed us to model the probability of over-exposure to PAH, and provided an unbiased estimate of the proportion of samples above a threshold that exceeded the limit of detection (LOD) in the absence of other sources of bias.(173) Quantified by benzene-soluble fraction, the current procedures and methods used by OSHA have an LOD of 0.006 mg·m-3 for CTPV.(18) Characteristics of the databanks can be found in Table 2-1, including the overlap between IMIS and CEHD (column 3), IMIS excluding the overlap (column 1), and CEHD excluding the overlap (column 2).

2.1.2 Regrouping Industry Codes

Industries within the amalgamated databanks were represented by 225 Standard Industrial Classification (SIC) codes or 179 North American Industry Classification System (NAICS) codes. Industries were represented by SIC coding in the OSHA databanks; not all observations had a corresponding NAICS code because this industrial coding scheme was only introduced in 1997. To fill in the missing NAICS codes, an algorithm was developed that used a crosswalk between the two coding systems and assigned a NAICS code if none was present.(184) Despite the discontinued use of the SIC coding system in many countries, including the US and Canada, incorporating both coding systems allows analysis of historical data for epidemiological studies using SIC coding that can be transcribed into equivalent NAICS codes for modern applications.
A minimum requirement of 5 measurements per SIC or NAICS code was set to ensure stable estimates of the EF, and in cases where a code had an insufficient size the code was collapsed into the upper level of the hierarchical system. For example, SIC code 1791 (Structural Steel Erection) had only 1 observation, and similarly, SIC 1731 (Electrical Work) and 1794 (Excavation Work) had 2 and 3 observations, respectively. Therefore, all three codes were moved up the hierarchy to the 2-digit code 17 for a total of 6 observations; the 3-digit codes 173 and 179 also had insufficient sample sizes. In cases where other codes within the hierarchy of code 17 had sufficient sample sizes, the codes were excluded from the collapsed code 17 to maintain as much distinction as possible between industries with similar economic activities. For example, SIC 1761 (Roofing, Siding, and Sheet Metal Work) had 245 measurements and therefore was not included in SIC 17; thus the estimates for SIC 17 were actually for SIC 17 excluding SIC 1761, which had its own estimates. This algorithm was continued until the target sample size was achieved or collapsing the code(s) resulted in the single-digit division (SIC) or 2-digit sector (NAICS) code. Regrouping of industry codes resulted in a reduction of both SIC and NAICS codes from 223 and 179 to 125 and 110, respectively. Differences in the number of groups for each coding system occurred because a single SIC code could correspond to multiple NAICS codes and vice-versa.

2.1.3 Coding Occupations Within Industries for IMIS Dataset

Job titles and descriptions in the original IMIS dataset were incorporated in the statistical modeling. A total of 862 unique free-text job titles and descriptions were present in the IMIS dataset; however, there were several instances where a single job was described in multiple forms. For example, free-text job descriptions firefighter and fireman were considered the same occupation.
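Returning briefly to the industry-code regrouping rule of Section 2.1.2, the collapsing procedure can be sketched in Python as follows. This is a simplified illustration, not the algorithm as implemented for the thesis: the function name `regroup_codes` is hypothetical, codes are assumed to be equal-length digit strings, and collapsing stops at the 2-digit level for both systems (the text stops SIC at the division level). The counts are the SIC 17 example given in the text.

```python
from collections import defaultdict


def regroup_codes(counts, min_n=5, floor_len=2):
    """Map each code to itself or to the shortest-needed parent prefix.

    Codes with at least `min_n` measurements keep their own group. The rest
    are pooled one hierarchy level at a time (dropping the last digit) until
    the pooled count reaches `min_n` or the floor level is reached.
    """
    assigned = {code: code for code, n in counts.items() if n >= min_n}
    pending = {code for code, n in counts.items() if n < min_n}
    length = max((len(code) for code in pending), default=0) - 1
    while pending and length >= floor_len:
        pooled = defaultdict(int)
        for code in pending:
            pooled[code[:length]] += counts[code]
        for code in sorted(pending):
            parent = code[:length]
            if pooled[parent] >= min_n or length == floor_len:
                assigned[code] = parent
        pending -= set(assigned)
        length -= 1
    for code in pending:  # codes already at or below the floor level
        assigned[code] = code
    return assigned


# SIC counts quoted in the text for Major Group 17.
sic_counts = {"1791": 1, "1731": 2, "1794": 3, "1761": 245}
print(regroup_codes(sic_counts))
```

In this example SIC 1761 keeps its own group, while 1731, 1791, and 1794 are pooled into code 17 because neither 173 nor 179 reaches five measurements on its own.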
Occupations were initially coded to the 2010 Standard Occupation Classification (SOC) system,(185) which resulted in more than 140 SOC codes. For simplicity, and to allow like occupations to be grouped together, occupations were regrouped into 4 broader categories for the modeling phase: 1) minimally skilled workers and labourers (e.g. janitor, packers), 2) skilled workers and operators (e.g. roofers, crane operator), 3) supervisors and foremen (e.g. foreman, engineer), and 4) administrative personnel (e.g. secretary, manager, owners).

2.2 Statistical Modeling: Predictive Probability Model

The probability of exceeding the PEL for CTPV was modeled using mixed-effect logistic regression. Given that the IMIS and CEHD databanks partially overlapped (see Table 2-1), a binary indicator was created to separate IMIS data (IMIS-only and both-IMIS-CEHD) and the CEHD data (CEHD-only, which excludes the overlapping data from IMIS). The model is described in equation (2.3):

\[ \operatorname{logit}(p_{ij}) = \beta_0 + \beta_1\,\mathrm{Year}_{ij} + \beta_2\,\mathrm{Industry}_{ij} + \beta_3\,\mathrm{IMIS}_{ij} + b_i + \varepsilon_{ij}, \qquad j = 1, \ldots, n_i \qquad (2.3) \]

where n_i is the number of measurements taken during inspection number i, p_ij represents the probability of the sample exceeding the PEL value for CTPV, i.e. p_ij = P(Y_ij > 0.2 mg·m-3), and IMIS_ij = 1 if the data source is IMIS-only or both-IMIS-CEHD and 0 elsewhere. Equation (2.3) will henceforth be referred to as the OSHA model.

Model assumptions were (1) the random-effect b_i for the i-th inspection is normally distributed with mean μ = 0, (2) random-effect b_i and random variation within-inspection ε_ij are statistically independent, and (3) within-inspection effect ε_ij is multi-normally distributed with mean μ = 0 and some variance-covariance Σ. Time trends were tested to explore the potential effects of changes in technology on PAH exposure. Inspection number was used as a random effect to explore potential correlation between measurements made in a small time window within the same facility.
Ignoring the possible correlation in the model structure when present can result in under-estimating the standard errors for the fixed-effects.(186) An unstructured variance-covariance matrix was used for the random-effects because it employs no constraints on the variance and covariance parameters. Moreover, there is no theoretical justification to infer any particular constraint (e.g. compound symmetry). Although this method reduces our degrees of freedom, the sample size is sufficiently large to estimate the additional parameters.

A second model using the same general equation was developed for a subset analysis of the IMIS databank that incorporated jobs performed within industries, which were represented by the 4 broad occupation categories. This was done because exposure across occupations within an industry may not be homogeneous. The model will henceforth be referred to as the IMIS model and is described by equation (2.4):

\[ \operatorname{logit}(p_{ij}) = \beta_0 + \beta_1\,\mathrm{Year}_{ij} + \beta_2\,\mathrm{Industry}_{ij} + \beta_3\,\mathrm{Occup}_{ij} + b_i + \varepsilon_{ij}, \qquad j = 1, \ldots, n_i \qquad (2.4) \]

Effect coding was used in the industry variable to allow comparisons of the probability of exceedance in a specific category to the overall mean probability of exceedance.(187) For both models, SIC code 2759 and NAICS code 32311, which represent Commercial Printing (Not Elsewhere Classified), were selected as reference groups because, as ill-defined Not Elsewhere Classified (NEC) groups, their results were not of interest to interpret, and because they had sufficient sample sizes compared to other NEC groups. All data analyses were conducted with the statistical software R (version 2.14.2, R Foundation for Statistical Computing, Vienna, Austria) with mixed-effects modeling performed using the lme4 package.(188)

2.3 Results

2.3.1 Descriptive Analysis

The amalgamated databanks contained 2,509 CTPV exposure measurements (mg·m-3), of which 25.3% were below the LOD.
The median number of measurements per regrouped SIC and NAICS code was 8 and 9, respectively. Table 2-2 shows descriptive statistics of CTPV concentrations and EF stratified by regrouped SIC codes with at least 20 measurements. The descriptive statistics were calculated based on the time-weighted averages (TWAs) for each industry, excluding censored observations, and the EF values for each industry used all data for each respective industry. The 3 most surveyed industries by SIC code were: SIC 3321 (Gray and Ductile Iron Foundries, n = 301), SIC 1761 (Roofing, Siding, and Sheet Metal Work, n = 245), and SIC 3334 (Primary Production of Aluminum, n = 229), and by NAICS code were: NAICS 33151 (Ferrous Metal Foundries, n = 348), NAICS 33131 (Alumina and Aluminum Production and Processing, n = 304), and NAICS 23816 (Roofing Contractors, n = 245). A total of 2 SIC and 8 NAICS regrouped codes had fewer than 5 measurements. The average CTPV concentration and EF per year had a slight inverse temporal trend. The median number of measurements per year was 67 (interquartile interval of 27-116), with more than 80% of the tests made prior to 1995. Full descriptive statistics of SIC and NAICS codes can be found in the Appendix (see Supplementary Table A1 and Supplementary Table A2, respectively). The IMIS databank contained information on 1,579 observations relating to occupation in free-text description. These job titles and descriptions were grouped into 4 broader occupational categories, where the majority of occupations were classified as minimally skilled workers/labourers. Similar to Table 2-2, Table 2-3 shows descriptive statistics of the 4-category occupation group using the non-censored data while the summary of the EF values used all the data. Not surprisingly, minimally skilled workers/labourers had the highest average exposure and EF while administrative personnel had the lowest.
2.3.2 Comparison of IMIS and CEHD Datasets

Details on the overall differences in the two databanks can be found in work by Lavoue et al.(172) Overall, the two datasets and both-IMIS-CEHD (the overlap) were fairly comparable, although the CEHD-only samples did have a larger variability compared to the other subsets.

Figure 2-2 shows the empirical cumulative distribution functions of the detectable levels of CTPV concentrations in each subset. As the exposure level increases towards the PEL, the empirical cumulative distribution function of CEHD is lower than that of IMIS-only, potentially indicating higher CTPV exposure levels in CEHD-measured sites. Correspondingly, the 50th (median), 75th and 90th percentiles were higher among CEHD-only and both-IMIS-CEHD sites than IMIS-only sites (median: 0.20, 0.19, and 0.15; 75th percentile: 0.46, 0.54, and 0.37; 90th percentile: 1.10, 1.33, and 0.97, respectively). However, this difference in exposure level may in fact be an artefact of the data due to usage of equation (2.2) when calculating TWA measurements.

2.3.3 Probability of Exposure Above PEL According to Industry SIC Code

Industries were represented by 125 SIC codes in the predictive probability model. Twenty-eight SIC codes had no measurements exceeding the PEL, 94 had an EF between 0 and 1, and 3 had all measurements exceeding the PEL. For computational stability of the model we restricted the dataset to SIC codes that did not have all measurements either 0 or 1. This reduced our dataset to 2,292 personal TWA measurements. The variance for the random-effects, σ_b^2 = 2.21, was tested for significance (H0: σ_b^2 = 0) using the likelihood ratio test (LRT) to compare the model with and without the random effects, LRT = 164.33 (p-value < 0.001). This was confirmed using a more accurate p-value obtained by the parametric bootstrap approach(189) (simulation size, s = 1000) that indicated a non-zero variance for the random-effects distribution.
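As a rough check on the order of magnitude of the LRT p-value, the statistic can be referred to a chi-square distribution with one degree of freedom. The Python sketch below is an assumption-laden illustration rather than the procedure used here: the function names are hypothetical, and the optional halving reflects the common 50:50 mixture approximation for testing a variance on the boundary of its parameter space, which the text does not state was used (it reports a parametric bootstrap instead).

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def lrt_pvalue_variance(lrt: float, boundary_mixture: bool = True) -> float:
    """Approximate p-value for H0: random-intercept variance = 0.

    With boundary_mixture=True the 1-df tail is halved (assumed mixture of a
    point mass at zero and a 1-df chi-square); otherwise the plain 1-df tail
    is returned.
    """
    if lrt <= 0:
        return 1.0
    p = chi2_sf_1df(lrt)
    return 0.5 * p if boundary_mixture else p


# LRT statistic reported for the SIC version of the OSHA model.
print(lrt_pvalue_variance(164.33) < 0.001)
```

Under either reference distribution a statistic of this size is far beyond conventional significance thresholds, which is consistent with the reported p-value < 0.001.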
Table 2-4 shows the coefficients of the logistic mixed-effect model using the restricted dataset for regrouped SIC codes that had at least 20 measurements and their respective predicted probabilities of exceeding the PEL, ŷ, when setting year to 1980 and the databank indicator to IMIS (see Appendix: Supplementary Table A3 for full results). For calculating the predicted probabilities, year was set to 1980 because it was one of the most monitored years and accounted for almost 10% of the data. There was an inverse time trend with decreasing likelihood of exposure above the PEL in more recent years, albeit not statistically significant (β1 = -0.008, p-value = 0.5) and the databank indicator was significantly associated with a decreasing likelihood of exposure above the PEL (β1 = -0.442, p-value = 0.02). Industries where the SIC code measurements were either all 0 or 1 had their estimates βi, standard error of the estimate SE(βi), predicted value ŷ for 1980, and standard error SE(ŷ) left blank. Seven SIC codes had predicted EF > 0.80: Major Group 13 (Oil and Gas Extraction), Major Group 30 (Rubber and Miscellaneous Plastic Products), SIC 2824 (Manmade Organic Fibres), which all had an observed EF = 1; and SIC 1623 (Water, Sewer, Pipeline, and Communications and Powerline Construction), 1711 (Plumbing, Heating, and Air-Conditioning), 5812 (Eating Places), and 3496 (Miscellaneous Fabricated Wire Products). Fifteen SIC codes had predicted EF values between 0.50-0.80, 45 SIC codes had predicted EF values between 0.15-0.50, and 58 SIC codes had predicted EF < 0.15, 28 of which had an observed EF = 0. Overall, 103 out of 125 SIC codes had predicted EF < 0.50 and 34 had a predicted EF < 0.05.

2.3.4 Probability of Exposure Above PEL According to Industry NAICS Code

Industries were represented by 110 NAICS codes in the predictive probability model.
Twenty-four NAICS codes had no measurements exceeding the PEL, 84 had an EF between 0 and 1, and 2 NAICS codes had all measurements exceeding the PEL. Similar to the SIC version of the OSHA model, data were restricted to include only NAICS codes that did not have all measurements either 0 or 1, which reduced our dataset to 2,333 personal TWA measurements. There was a non-significant inverse time trend (β1 = -0.003, p-value = 0.8) and a significant databank indicator (β1 = -0.453, p-value = 0.03). The variance for the random-effects, σ_b^2 = 2.80, was tested for significance (H0: σ_b^2 = 0) using the LRT, LRT = 209.56 (p-value < 0.001), and the parametric bootstrap approach confirmed a non-zero variance for the random-effects distribution.

Appendix: Supplementary Table A4 shows the estimated coefficients and predicted probabilities for the NAICS model when setting year to 1980 and databank indicator to IMIS. Seven NAICS codes had a predicted EF > 0.80: Major Group 21 (Mining, Quarrying, and Oil and Gas Extraction), NAICS 32522 (Artificial and Synthetic Fibres and Filaments Manufacturing), both of which had an observed EF = 1, and NAICS 23712 (Oil and Gas Pipeline and Related Structures Construction), 42383 (Industrial Machinery and Equipment Merchant Wholesalers), 48412 (General Freight Trucking, Long-Distance), 23822 (Plumbing, Heating, and Air-Conditioning Contractors), and 33329 (Other Industrial Machinery Manufacturing). Ten NAICS codes had predicted EF values between 0.50-0.80, 40 NAICS codes were between 0.15-0.50, and 53 NAICS codes had predicted EF < 0.15, of which 24 had an observed EF = 0. Overall, 93 out of 110 NAICS codes had predicted EF < 0.50 and 33 had a predicted EF < 0.05.
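To make the link between fitted coefficients and a predicted EF concrete, the following Python sketch applies the inverse logit to a linear predictor. Several things in it are assumptions for illustration only: the intercept and industry effect are hypothetical placeholder values (the actual estimates are in the supplementary tables), the random effect is set to zero, year is assumed to enter as an offset from 1980, and the handling of band boundaries is a guess. Only the year and databank coefficients are taken from the NAICS results above.

```python
import math


def inv_logit(eta: float) -> float:
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def predicted_ef(intercept, beta_year, year_offset, beta_imis, imis, industry_effect):
    """Predicted probability of exceeding the PEL with the random effect at 0."""
    eta = intercept + beta_year * year_offset + beta_imis * imis + industry_effect
    return inv_logit(eta)


def ef_band(ef: float) -> str:
    """Assign a predicted EF to the bands used in the text (boundaries assumed)."""
    if ef > 0.80:
        return "> 0.80"
    if ef >= 0.50:
        return "0.50-0.80"
    if ef >= 0.15:
        return "0.15-0.50"
    return "< 0.15"


BETA_YEAR = -0.003  # time trend reported for the NAICS version
BETA_IMIS = -0.453  # databank indicator reported for the NAICS version
INTERCEPT = -1.0    # hypothetical placeholder, not a reported estimate
INDUSTRY = 0.8      # hypothetical effect-coded industry coefficient

ef = predicted_ef(INTERCEPT, BETA_YEAR, 0, BETA_IMIS, 1, INDUSTRY)
print(round(ef, 3), ef_band(ef))
```

Because the databank coefficient is negative, setting the indicator to IMIS lowers the predicted EF relative to CEHD-only for any given industry and year.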
2.3.5 Comparison of Exposure Above PEL Between Industry Codes

Figure 2-3 to Figure 2-5 contain Bland-Altman plots(190) comparing the difference in the predicted EF for a SIC and its corresponding NAICS code against the average between the predicted EFs for the SIC-NAICS combinations assessed in 1980. Instances where the difference and average between predicted EFs were the same were scaled, i.e. the size of the point for a given x-y plot was scaled for the number of occurrences. Each observation from the 2,509 measurements was assigned a predicted probability, but to make the figures interpretable only unique SIC-NAICS combinations were plotted; this resulted in 262 combinations being plotted instead of 2,509 combinations. The plots on the left side of Figure 2-3 to Figure 2-5 contain data from the models with regrouped industry codes while the plots on the right side contain the data from the models with the original industry code. Figure 2-3 contains Bland-Altman plots where there was a direct one-to-one mapping for each SIC and NAICS pairing, i.e. each SIC code corresponded to a single NAICS code and vice-versa. Figure 2-4 contains the Bland-Altman plots where one SIC code corresponded to more than one NAICS code and vice-versa, while Figure 2-5 contains the plots for all the data.

The Bland-Altman plots describe varying degrees of agreement between the results of the two versions of the OSHA model depending on how well the SIC and NAICS codes matched across the two classification systems. In Figure 2-3, where there is direct one-to-one mapping between coding systems, regrouping industry codes may not be necessary since, as seen in the left plot of Figure 2-3, there is more bias being created by grouping multiple codes together. However, in Figure 2-4, where there is multiple-to-one mapping for various SIC-NAICS and NAICS-SIC combinations, regrouping industry codes may be required to ensure a stable estimate.
2.3.6 Probability of Exposure Above PEL Incorporating Industry and Occupation: Subset Analysis using IMIS Dataset

A secondary analysis, which we will refer to as the IMIS model, was performed on the IMIS dataset using similar steps as the full OSHA dataset. Industries in this subset analysis were represented by 99 SIC or 84 NAICS codes. The variance of the random-effects for the SIC and NAICS versions of the IMIS model when incorporating occupation were σ_b^2 = 2.11 and σ_b^2 = 2.84, with LRT values of 102.32 (p-value < 0.001) and 133.23 (p-value < 0.001), respectively. Similar to the OSHA models, the parametric bootstrap approach confirmed a non-zero variance for the random-effects distribution.

Table 2-5 shows the results of the IMIS model, along with the coefficients for the occupation categories, using the IMIS dataset and regrouped SIC codes that had at least 20 measurements stratified for minimally skilled workers and administrative personnel. Full results for SIC and NAICS models incorporating occupation can be seen in the Appendix (Supplementary Table A5 and Supplementary Table A6, respectively). In both versions of the IMIS model there were non-significant inverse time trends (SIC: β1 = -0.006, p-value = 0.7; NAICS: β1 = -0.0004, p-value = 0.9). Setting year to 1980 and occupation to minimally skilled workers/labourers there were 88 SIC codes and 70 NAICS codes (out of the original 99 and 84 regrouped codes, respectively) that produced estimates, as not all occupations were represented in each industry, e.g. half of the occupations monitored in SIC 1761 were classified at the minimally skilled level while none were classified at the administrative level. Three SIC (5 NAICS) codes had a predicted EF > 0.8, 17 SIC (10 NAICS) codes had values between 0.50-0.80, and 34 SIC (29 NAICS) codes were between 0.15-0.50. There were 34 SIC (26 NAICS) codes with predicted EF < 0.15, of which 17 SIC (10 NAICS) had an observed EF = 0.
Overall, more than 77% of industries had a predicted EF less than 0.5, although less than 22% had an EF under 0.05 regardless of industry code used. Similar trends were found when setting occupation to administrative personnel, although not surprisingly the predicted EFs were lower and no SIC or NAICS code had a predicted EF > 0.8. Regardless of industry code used, administrative personnel in more than 97% of industries had a predicted EF less than 0.5 and almost 60% had a predicted EF under 0.05.

Table 2-6 compares the results from Table 2-4 and Table 2-5 for a subset of SIC codes that were estimated using the OSHA and IMIS models that had at least 20 measurements. Appendix Supplementary Table A7 and Supplementary Table A8 contain the full results of the comparison between the SIC and NAICS versions of the two models, respectively. As expected, in comparison to our OSHA model, which was based on industry-only, the estimates from the IMIS model for the majority of industries had differing risks of exceeding the PEL when controlling for occupation. For example, the OSHA estimate for SIC 1761 (Roofing, Siding, and Sheet Metal Work) and the IMIS estimate for minimally skilled worker/labourer in the same SIC were between EF = 0.40 – 0.50, while the estimate for administrative personnel was substantially lower at EF = 0.08.

2.4 Discussion

Using US OSHA compliance monitoring data, we developed a set of predictive probability models that use industry classification, either through SIC or NAICS coding; occupation, in the form of our broad 4-category classification system; and year that can be used to predict the probability of exceeding OSHA's PEL for PAHs. Regardless of choice of the industry coding, both the OSHA and IMIS models provided similar predictions of exceeding the PEL, although there appeared to be some deviation depending on how the coding schemes were grouped.
For example, of the 7 industry codes that had EF > 0.8, only four of the codes were concordant between coding systems: SIC Major Group 13 corresponded to NAICS 21, and SIC 1623, 1711, and 2824 directly translated to NAICS 23712, 23822, and 32522.

As observed in the Bland-Altman plots, there were some differences in predicted EF values from the OSHA model depending on how the industry codes matched across the two classification systems. In Figure 2-3, where there was direct one-to-one mapping between industry codes, differences between the coding systems were minimal, and regrouping codes may not be necessary provided there is a sufficient sample size in each code to produce stable estimates. If an industry code has multiple counterparts in the other coding system, regrouping may be necessary, especially when we need to increase the stability of estimates due to insufficient sample size in one of the counterpart codes. In Figure 2-4, differences greater than 0.5 between SIC-NAICS pairings occurred more frequently when the smaller industry code groups were not collapsed than when they were. Interestingly, Figure 2-4 and Figure 2-5 displayed a distinctive quadrilateral shape. This can be explained by the fact that, given the range of values for each probability is (0,1), the difference between the predicted probabilities of the two corresponding codes is dependent on their average. The difference ε is constrained by the formula:

$$ |\varepsilon| \le 2 \min(x,\ 1 - x) \qquad (2.5) $$

where x is the average of the two probabilities from each SIC-NAICS combination. Therefore, if the average of the two probabilities were at either extreme of 0 or 1 then the difference must be 0, while if the average was 0.5 then the difference is between (-1, 1).
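The quadrilateral envelope described above follows directly from both predicted probabilities lying between 0 and 1. The short sketch below computes the largest difference that is possible for a given average and checks whether a pair of predictions falls inside that envelope. The function names are hypothetical and are used only for illustration.

```python
def max_abs_difference(mean_prob: float) -> float:
    """Largest possible |p1 - p2| when (p1 + p2) / 2 == mean_prob and 0 <= p1, p2 <= 1."""
    if not 0.0 <= mean_prob <= 1.0:
        raise ValueError("the average of two probabilities must lie in [0, 1]")
    return 2.0 * min(mean_prob, 1.0 - mean_prob)


def inside_envelope(p_sic: float, p_naics: float) -> bool:
    """Check that a (SIC, NAICS) pair of predicted EFs respects the bound."""
    mean_prob = (p_sic + p_naics) / 2.0
    difference = p_sic - p_naics
    # Small tolerance guards against floating-point rounding at the boundary.
    return abs(difference) <= max_abs_difference(mean_prob) + 1e-12


if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(x, max_abs_difference(x))
```

Plotting `max_abs_difference` and its negative against the average traces the diamond-shaped boundary within which every point of a Bland-Altman plot of probabilities must fall.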
The existence of such differences between the SIC and NAICS estimates is a limitation because it forces us to be cautious when predicting the likelihood of exceedance depending on the industry code. That is, in some cases, determining the risk associated with an industry using a SIC code in the OSHA model may over- or underestimate the risk of exceeding the PEL when compared to the corresponding NAICS version. However, this is an artefact of the greater detail in NAICS coding, i.e. a single SIC code corresponds to multiple NAICS codes. Nevertheless, the SIC and NAICS versions of the OSHA model show a degree of agreement in their predictive abilities whether or not we collapse either coding scheme, with Spearman rank correlations (using all measurements) ranging between 0.68-0.99 (all p-values < 0.01). The non-zero variance between coding systems was expected (and assumed) since we do not expect industry codes to yield homogeneous exposure groups; these subtle differences between the predicted probabilities illustrate the impact that the choice of coding scheme can have on a result. These differences can give investigators a sense of how much variability in exposure estimates can be due to coding and can therefore be used in sensitivity analyses. Nonetheless, the results are fairly consistent with the literature, and multiple industries that we identified as having a high risk of exposure to PAHs were in the upper ranges of the estimated EF values, including the trucking and motor industry,(22, 49, 191) production of synthetics,(61, 192, 193) mining and related industries,(194, 195) pipelines and related structures construction,(193) and food-service industries.(78, 196-198)

Remodelling the data to incorporate occupation using the 4-category variable allowed us to infer differences in likelihood of exceedance within industries.
Although there were more variables available in the IMIS databank that would allow more complex modeling of the IMIS data, we opted for a parsimonious model to minimize the differences between the OSHA model and the IMIS model that incorporated occupation. Appendix Supplementary Table A7 and Supplementary Table A8 compare the original OSHA models against the IMIS models when predicting for minimally skilled workers/labourers and administrative personnel where applicable. For some industries there were no differences between any of the three predicted probabilities, e.g. the predicted probabilities for SIC 1611 (Highway and Street Construction, Except Elevated Highways) based on the OSHA model, the IMIS model for minimally skilled workers, and the IMIS model for administrative personnel ranged between 0.01-0.05. However, there were several instances where controlling for occupation in the IMIS model produced different estimates than the OSHA model, e.g. for SIC 3297 (Nonclay Refractories) the OSHA model estimated EF = 0.22, but after controlling for occupation the IMIS model estimated EF = 0.80 for minimally skilled workers and EF = 0.27 for administrative personnel. Similarly, SIC 5812 (Eating Places) had an estimated EF = 0.82 based on the OSHA model, but after adjustment for occupation in the IMIS model the estimated EF for servers was EF = 0.77 while management had an estimated EF = 0.23. These results indicate the difference in the level of exceedance that can occur when modeling through industry only, thereby risking bias due to misclassification when ignoring occupation.

Limitations of this study relate to the categorization of jobs performed, industry coding, and the outcome measure. Identification of jobs was limited to tasks that could be inferred from the recorded job description.
This may result in errors that would bias the associations in the IMIS models, and not necessarily towards the null.(199) The bias in our models would certainly be away from the null if more accurate information were recorded for workers who were suspected to be more highly exposed than for lower exposed workers. This would result in differential misclassification of the determinants of exposure with respect to the probability of exceedance for PAHs, i.e. if the inspector correctly believed that some workers were highly exposed and recorded a more precise description of job and industry, although the broader job classification would hopefully mitigate any such bias. Furthermore, although the broader job category can increase error, using the more accurate SOC codes to represent occupation would create a categorical variable with over 140 levels, thereby producing uninformative and unstable estimates. There is also the issue of assigning similar probabilities to multiple jobs that could have different probabilities of exceedance. For example, aluminum workers who work in Söderberg pot-rooms and prebake pot-rooms have very different levels of exposure, with the former having significantly higher levels of PAH exposure than the latter,(38) but are often classified as having similar jobs. Although this presents other sources of error, these results do concur with previous findings that indicate predicting exposure by industry alone may not be sufficient and that supplementing with occupation provides better, more robust estimates.(200)

Another limitation in this study is the inability to test for interactions with time and interactions between industry and occupation. Regrouping occupations into much broader categories would have allowed us to look at interactions since not all occupations existed in all industries.
Unfortunately, there was still an insufficient sample size for each stratum due to the limited number of occupations measured for each company during testing. Similar issues relating to sample size were previously seen with the industry codes, as we had to collapse SIC and NAICS codes higher up in their hierarchy in order to generate reliable estimates. Although collapsing industries created heterogeneity within the same industry code, this was an artefact of the data that could have occurred regardless of regrouping because companies tend to have multiple classifications, e.g. General Electric has 5 NAICS codes, including 33522 (Major Kitchen Appliance Manufacturing), 33511 (Electric Lamp Bulb and Parts Manufacturing), and 33361 (Turbine and Turbine Generator Set Unit Manufacturing).

Our models used a dichotomized form of the outcome measure instead of modeling it as a continuous measure, which can be problematic for applying the outcome as a group-based measurement in epidemiologic research. For example, using the group mean is an established technique to reduce the impact of measurement error,(201) and dichotomization via a threshold, in this case the PEL, can create differential misclassification.(202) However, it is plausible that differential misclassification exists when using group means due to exposure levels being a mixture of lower and higher levels of exposure, resulting in some individuals potentially receiving different intensities of exposure than estimated.(203) Consequently, even if our estimates of exceedance were perfect, the application of the model would still risk misclassification of exposure levels since we cannot model continuous exposure. It is also possible that the exposure estimates are contaminated by errors from the various codes, i.e. heterogeneously exposed industries with the same SIC or NAICS code, the broad job category, and errors from the job titles of the observations obtained by OSHA.
Use of an exceedance probability helps limit such bias, which is especially relevant given that traditional methods, e.g. averaging OSHA measurements, are inappropriate when the samples have a non-normal distribution due to a high percentage of non-detects (NDs). Furthermore, although our exposure measure is non-continuous, the methods OSHA uses when investigating work sites are designed to determine whether samples exceed the PEL rather than to measure exposure precisely, and therefore a probability of exceedance may be more reliable. Inspectors may not know the mean exposure precisely, but their methods allow them to determine whether the PEL was exceeded, as ND issues do not arise near the PEL.

Although not a limitation, it should be noted that with respect to declining trends of exposure, comparability between continuous and dichotomized outcomes could be problematic. It is possible for time trends to differ between exposure levels and EF, even when most exposures are under control. A decrease in exposure levels can occur among industries that are in compliance, but due to our choice of cut-point when defining a limit for dichotomization this decrease may not be recognized in the EF. For example, the model would not pick up exposure levels just below the dichotomization cut-point, and any potential decreases would not be identified. Unfortunately, we are unaware of efforts to model time trends in EF that would allow us to benchmark our results. The mode and type of exposure were also a limiting factor, as we only considered airborne concentrations that may give rise to both inhalational and dermal exposure.(204) Complications may also arise from the use of CTPV as a surrogate for PAH exposure, which may exacerbate the issue of misclassification with respect to categorizing levels of PAH exposure.
However, since the 1970s, many studies have used CTPV as an indicator for airborne PAH, and many of the surrogates for PAH, as recommended by the United States Environmental Protection Agency, constitute 40-90% of CTPV fractions.(205) Although not a limitation, another issue that should be acknowledged is the difference in measurements between the databanks. Equations used to calculate the TWA varied depending on the databank, which could bring into question the validity of the measurements; however, this issue is potentially alleviated when dichotomizing the outcome measure. Furthermore, there is still the unexplained conundrum of why there are measurements in the IMIS databank that are not in the CEHD. Despite these potential limitations, we can still draw meaningful inferences from our models and identify industries that are at risk of exposure to PAHs. Subsequently, these results can be applied to epidemiological studies by using an individual’s work history to determine his or her likelihood of exceedance for PAHs. In future applications, these models can be used to develop job exposure matrices in which the probabilities determine levels of intensity of exposure that can be combined with duration (e.g. duration with no exposure greater than the PEL, duration with exposure that is likely to exceed the PEL, etc.) to approximate cumulative exposure to PAHs.

2.5 Conclusions

OSHA monitoring data are an important element in both occupational and epidemiological research. Although evaluation through sampling and data collection is prone to error, these potential issues should not detract from the usefulness of the monitoring data. Through these databanks we have developed several predictive probability models that allow the estimation of the individual likelihood of exceeding the PEL for PAHs based on industry and occupation.
These models have direct applications in epidemiological research and are a useful method to determine exposure to PAHs through the development of job-exposure matrices. Furthermore, these metrics are expected to yield reasonable results that can provide acceptable ranges for classification of occupational PAH exposure that can be used to explore the PAH exposure-outcome relationship. In particular, we plan to apply these models to a job-exposure matrix for a population-based case-control study of breast cancer.

2.6 Tables

Table 2-1: Characteristics of Measurements of Coal Tar Pitch Volatiles (mg·m⁻³) in OSHA databanks

Data sources
Characteristic | IMIS-only | CEHD-only | both-IMIS-CEHD | IMIS | CEHD | OSHA³
Time Period | 1979-2010 | 1984-2009 | 1984-2009 | 1979-2010 | 1984-2009 | 1979-2010
Sample Size | 1,168 | 930 | 411 | 1,579 | 1,341 | 2,509
SIC codes | 132 | 130 | 79 | 169 | 164 | 225
NAICS codes | 110 | 110 | 66 | 134 | 132 | 179
Mean¹ | 0.43 | 0.65 | 0.56 | 0.47 | 0.63 | 0.54
Standard Deviation¹ | 0.93 | 2.85 | 1.05 | 0.97 | 2.46 | 1.92
Median¹ | 0.15 | 0.20 | 0.19 | 0.17 | 0.20 | 0.17
IQR¹ | 0.30 | 0.37 | 0.44 | 0.36 | 0.39 | 0.36
Geometric Mean¹ | 0.16 | 0.21 | 0.23 | 0.18 | 0.22 | 0.19
Geometric Standard Deviation¹ | 4.00 | 3.70 | 3.77 | 3.98 | 3.73 | 3.89
Non-Detects (%)² | 308 (26.4) | 219 (23.5) | 109 (26.5) | 417 (26.4) | 328 (24.5) | 636 (25.3)
Exceedance Fraction² | 0.30 | 0.38 | 0.36 | 0.32 | 0.38 | 0.34

¹ TWA = 8-hour time weighted average with Non-detects excluded
² EF = Percentage of values above the OSHA PEL (0.200 mg·m⁻³)
³ The OSHA databank is the combination of IMIS and CEHD, as well as the combination of IMIS-only, CEHD-only, and both-IMIS-CEHD.
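As a minimal sketch of the exceedance fraction used throughout these tables, the snippet below counts the share of all samples, non-detects included, that lie above the PEL. It also illustrates the point made in the Discussion that a decline in exposures that are already below the PEL lowers the mean of detected values without changing the EF. The sample values are hypothetical, and representing non-detects as `None` is an assumption of this sketch rather than a feature of the OSHA databanks.

```python
from typing import List, Optional

PEL = 0.2  # OSHA PEL for coal tar pitch volatiles, mg·m^-3


def exceedance_fraction(measurements: List[Optional[float]], pel: float = PEL) -> float:
    """Fraction of all samples above the PEL; non-detects (None) count as not exceeding."""
    if not measurements:
        raise ValueError("at least one measurement is required")
    above = sum(1 for m in measurements if m is not None and m > pel)
    return above / len(measurements)


def mean_detected(measurements: List[Optional[float]]) -> float:
    """Arithmetic mean of detectable values only (non-detects excluded)."""
    detected = [m for m in measurements if m is not None]
    if not detected:
        raise ValueError("no detectable measurements")
    return sum(detected) / len(detected)


# Hypothetical TWA samples (mg·m^-3) for one industry in two periods.
# Only the values already below the PEL decline between periods.
early_period = [None, 0.18, 0.15, 0.35, 0.60]
late_period = [None, 0.09, 0.05, 0.35, 0.60]

if __name__ == "__main__":
    for label, data in (("early", early_period), ("late", late_period)):
        print(label, exceedance_fraction(data), round(mean_detected(data), 3))
```

In this hypothetical example the mean of detected values falls between the two periods while the exceedance fraction is identical, so a time trend estimated on EF would not register the decline.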
Table 2-2: Descriptive Statistics of Concentrations of Coal Tar Pitch Volatiles (mg·m⁻³) in OSHA data by regrouped SIC Code with 20 or more measurements

SIC | Label | TWA Measurements: N¹ | Mean (SD)¹ | Median (IQR)¹ | Geo Mean (Geo SD)¹ | All Data: N² | EF²
1611 | Highway and Street Construction, Except Elevated Highways | 21 | 0.11 (0.10) | 0.06 (0.05) | 0.09 (2.08) | 32 | 12.5
1623 | Water, Sewer, Pipeline, and Communications and Power Line Construction | 33 | 1.84 (2.40) | 1.38 (1.51) | 0.90 (3.78) | 33 | 87.9
1761 | Roofing, Siding, and Sheet Metal Work | 192 | 0.77 (1.64) | 0.26 (0.53) | 0.26 (4.61) | 245 | 44.5
2491 | Wood Preserving | 16 | 0.08 (0.05) | 0.06 (0.06) | 0.06 (1.96) | 34 | 2.9
2759 | Commercial Printing, Not Elsewhere Classified | 22 | 0.30 (0.48) | 0.12 (0.28) | 0.12 (3.82) | 23 | 26.1
2819 | Industrial Inorganic Chemicals, Not Elsewhere Classified | 8 | 0.31 (0.32) | 0.16 (0.29) | 0.20 (2.53) | 20 | 15.0
2865 | Cyclic Organic Crudes and Intermediates, and Organic Dyes and Pigments | 13 | 0.47 (0.81) | 0.16 (0.24) | 0.18 (3.54) | 22 | 18.2
2911 | Petroleum Refining | 17 | 0.09 (0.11) | 0.06 (0.06) | 0.06 (2.74) | 21 | 9.5
2951 | Asphalt Paving Mixtures and Blocks | 18 | 0.20 (0.21) | 0.07 (0.33) | 0.07 (7.73) | 21 | 33.3
2952 | Asphalt Felts and Coatings | 35 | 0.40 (0.58) | 0.24 (0.42) | 0.21 (3.19) | 42 | 42.9
2999 | Products of Petroleum and Coal, Not Elsewhere Classified | 14 | 0.42 (0.62) | 0.21 (0.42) | 0.22 (2.91) | 21 | 38.1
3011 | Tires and Inner Tubes | 60 | 0.55 (0.69) | 0.28 (0.55) | 0.27 (3.67) | 67 | 49.3
3069 | Fabricated Rubber Products, Not Elsewhere Classified | 48 | 0.71 (0.79) | 0.40 (0.63) | 0.41 (3.00) | 56 | 66.1
3297 | Nonclay Refractories | 23 | 0.28 (0.26) | 0.18 (0.31) | 0.19 (2.51) | 26 | 42.3
3312 | Steel Works, Blast Furnaces (Including Coke Ovens), and Rolling Mills | 65 | 0.25 (0.35) | 0.13 (0.17) | 0.14 (3.00) | 78 | 29.5
3321 | Gray and Ductile Iron Foundries | 254 | 0.42 (0.68) | 0.19 (0.31) | 0.20 (3.30) | 301 | 39.5
3325 | Steel Foundries, Not Elsewhere Classified | 21 | 0.13 (0.10) | 0.09 (0.11) | 0.09 (2.32) | 28 | 14.3
3334 | Primary Production of Aluminum | 193 | 0.27 (0.44) | 0.13 (0.19) | 0.13 (3.03) | 229 | 26.6
3353 | Aluminum Sheet, Plate, and Foil | 22 | 0.23 (0.12) | 0.22 (0.20) | 0.19 (2.03) | 23 | 52.2
3355 | Aluminum Rolling and Drawing, Not Elsewhere Classified | 25 | 0.35 (0.41) | 0.18 (0.26) | 0.18 (3.25) | 42 | 26.2
3365 | Aluminum Foundries | 42 | 0.25 (0.31) | 0.14 (0.30) | 0.13 (3.84) | 53 | 32.1
3366 | Copper Foundries | 25 | 0.29 (0.29) | 0.20 (0.20) | 0.18 (2.63) | 37 | 35.1
3462 | Iron and Steel Forgings | 24 | 0.69 (1.51) | 0.28 (0.44) | 0.23 (4.55) | 39 | 38.5
3479 | Coating, Engraving, and Allied Services, Not Elsewhere Classified | 46 | 1.41 (6.50) | 0.14 (0.31) | 0.17 (5.28) | 61 | 23.0
3498 | Fabricated Pipe and Pipe Fittings | 24 | 1.44 (2.24) | 0.42 (1.59) | 0.43 (5.70) | 26 | 53.8
3624 | Carbon and Graphite Products | 78 | 0.36 (0.51) | 0.19 (0.32) | 0.19 (3.00) | 89 | 41.6
Division D | Manufacturing (Major Groups: 20-39) | 19 | 0.55 (0.95) | 0.15 (0.56) | 0.21 (3.85) | 22 | 36.4

¹ TWA = 8-hour time weighted average with Non-detects excluded
² EF = Percentage of values above the OSHA PEL (0.200 mg·m⁻³)

Table 2-3: Descriptive Statistics of Concentrations of Coal Tar Pitch Volatiles (mg·m⁻³) in the IMIS databank by Occupation Category derived from Free Text

Label | TWA Measurements: N¹ | Mean (SD)¹ | Median (IQR)¹ | Geo Mean (Geo SD)¹ | All Data: N² | EF²
Minimally Skilled Worker/Labourers | 342 | 0.51 (1.03) | 0.16 (0.40) | 0.17 (4.47) | 441 | 34.2
Skilled Worker/Operator | 763 | 0.46 (0.96) | 0.17 (0.35) | 0.18 (3.85) | 1038 | 32.0
Supervisor/Foremen | 23 | 0.36 (0.53) | 0.13 (0.30) | 0.17 (3.12) | 31 | 22.6
Administrative Personnel | 16 | 0.12 (0.07) | 0.11 (0.08) | 0.10 (1.88) | 43 | 7.0
Unknown | 18 | 0.32 (0.42) | 0.15 (0.26) | 0.17 (3.17) | 26 | 30.8

¹ TWA = 8-hour time weighted average with Non-detects excluded
² EF = Percentage of values above the OSHA PEL (0.200 mg·m⁻³)

Table 2-4: Regrouped SIC coefficients and Predicted Probabilities of exceeding the PEL (0.2 mg·m⁻³) for PAHs assessed as Concentrations of Coal Tar Pitch Volatiles in 1980 from the OSHA databank for SIC codes with 20 or more measurements

SIC | Label | b | SE(b) | Predicted probability | SE(predicted probability)
1611 | Highway and Street Construction, Except Elevated Highways | -1.91 | 0.89 | 0.05 | 0.04
1623 | Water, Sewer, Pipeline, and Communications and Power Line Construction | 3.83 | 1.02 | 0.94 | 0.05
1761 | Roofing, Siding, and Sheet Metal Work | 0.71 | 0.27 | 0.42 | 0.06
2491 | Wood Preserving | -3.40 | 1.45 | 0.01 | 0.02
2759 | Commercial Printing, Not Elsewhere Classified | 0.00 | 0.00 | 0.15 | 0.17
2819 | Industrial Inorganic Chemicals, Not Elsewhere Classified | -1.24 | 1.04 | 0.09 | 0.09
2865 | Cyclic Organic Crudes and Intermediates, and Organic Dyes and Pigments | -0.80 | 0.85 | 0.14 | 0.10
2911 | Petroleum Refining | -2.02 | 1.18 | 0.05 | 0.05
2951 | Asphalt Paving Mixtures and Blocks | -0.82 | 0.88 | 0.14 | 0.10
2952 | Asphalt Felts and Coatings | 0.10 | 0.55 | 0.29 | 0.11
2999 | Products of Petroleum and Coal, Not Elsewhere Classified | 0.76 | 0.94 | 0.44 | 0.23
3011 | Tires and Inner Tubes | 0.87 | 0.49 | 0.46 | 0.12
3069 | Fabricated Rubber Products, Not Elsewhere Classified | 1.62 | 0.50 | 0.65 | 0.12
3297 | Nonclay Refractories | -0.28 | 0.83 | 0.22 | 0.14
3312 | Steel Works, Blast Furnaces (Including Coke Ovens), and Rolling Mills | -0.75 | 0.54 | 0.15 | 0.07
3321 | Gray and Ductile Iron Foundries | 0.02 | 0.25 | 0.27 | 0.05
3325 | Steel Foundries, Not Elsewhere Classified | -1.77 | 0.95 | 0.06 | 0.05
3334 | Primary Production of Aluminum | -0.56 | 0.40 | 0.17 | 0.06
3353 | Aluminum Sheet, Plate, and Foil | 0.78 | 1.00 | 0.44 | 0.25
3355 | Aluminum Rolling and Drawing, Not Elsewhere Classified | 0.21 | 1.07 | 0.31 | 0.23
3365 | Aluminum Foundries | -0.02 | 0.60 | 0.26 | 0.12
3366 | Copper Foundries | -0.03 | 0.56 | 0.26 | 0.11
3462 | Iron and Steel Forgings | 0.36 | 0.67 | 0.34 | 0.15
3479 | Coating, Engraving, and Allied Services, NEC | -0.72 | 0.51 | 0.15 | 0.06
3498 | Fabricated Pipe and Pipe Fittings | 1.04 | 0.72 | 0.51 | 0.18
3624 | Carbon and Graphite Products | 0.47 | 0.47 | 0.37 | 0.11
Div D | Manufacturing | 0.07 | 0.66 | 0.28 | 0.13

Table 2-5: Regrouped SIC coefficients and comparison of Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles in 1980 from the IMIS subset databank for Minimally Skilled Workers/Labourers and Administrative Personnel for SIC codes with 20 or more measurements

Covariate | Description | b | SE(b) | Minimally Skilled Workers: predicted probability | SE | Administrative Personnel: predicted probability | SE
CAT 0 | Minimally Skilled Workers/Labourers | 0.73 | 0.27 | NA | NA | NA | NA
CAT 1 | Skilled Workers/Operators | 0.58 | 0.25 | NA | NA | NA | NA
CAT 2 | Supervisor/Foremen | -0.23 | 0.53 | NA | NA | NA | NA
CAT 3 | Administrative Personnel | -1.65 | 0.68 | NA | NA | NA | NA
1611 | Highway and Street Construction, Except Elevated Highways | -2.20 | 1.05 | 0.05 | 0.05 | 0.01 | 0.01
1623 | Water, Sewer, Pipeline, and Comm. and Power Line Construction | 3.47 | 1.03 | 0.94 | 0.06 | 0.59 | 0.32
1761 | Roofing, Siding, and Sheet Metal Work | 0.65 | 0.32 | 0.49 | 0.08 | 0.08 | 0.07
2491 | Wood Preserving | -3.11 | 1.45 | 0.02 | 0.03 | 0.00 | 0.00
2759 | Commercial Printing, Not Elsewhere Classified | 0.00 | 0.00 | 0.30 | 0.32 | 0.04 | 0.06
2999 | Products of Petroleum and Coal, Not Elsewhere Classified | 0.61 | 0.92 | 0.48 | 0.23 | 0.08 | 0.09
3011 | Tires and Inner Tubes | 1.03 | 0.55 | 0.58 | 0.14 | 0.11 | 0.10
3312 | Steel Works, Blast Furnaces (Including Coke Ovens), and Rolling Mills | -1.33 | 0.71 | 0.12 | 0.07 | 0.01 | 0.01
3321 | Gray and Ductile Iron Foundries | 0.65 | 0.39 | 0.49 | 0.10 | 0.08 | 0.07
3334 | Primary Production of Aluminum | -0.73 | 0.48 | 0.19 | 0.08 | 0.02 | 0.02
3355 | Aluminum Rolling and Drawing, Not Elsewhere Classified | 0.03 | 1.05 | 0.34 | 0.24 | 0.04 | 0.06
3366 | Copper Foundries | 0.27 | 0.68 | 0.39 | 0.17 | 0.06 | 0.06
3462 | Iron and Steel Forgings | -0.44 | 0.75 | 0.24 | 0.14 | 0.03 | 0.03
3479 | Coating, Engraving, and Allied Services, NEC | -0.72 | 0.64 | 0.19 | 0.10 | 0.02 | 0.02
3624 | Carbon and Graphite Products | 0.31 | 0.54 | 0.40 | 0.14 | 0.06 | 0.06

Table 2-6: Regrouped SIC coefficients and comparison between the OSHA databank and the IMIS subset databank incorporating Occupation groups Minimally Skilled Workers/Labourers and Administrative Personnel of Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles standardized to 1980 for SIC Codes with 20 or more measurements

SIC | Description | OSHA, Ignoring Occupation: predicted probability | SE | IMIS subset, Minimally Skilled Workers: predicted probability | SE | IMIS subset, Admin Personnel: predicted probability | SE
1611 | Highway and Street Construction, Except Elevated Highways | 0.05 | 0.04 | 0.05 | 0.05 | 0.01 | 0.01
1623 | Water, Sewer, Pipeline, and Comm. and Power Line Construction | 0.94 | 0.05 | 0.94 | 0.06 | 0.59 | 0.32
1761 | Roofing, Siding, and Sheet Metal Work | 0.42 | 0.06 | 0.49 | 0.08 | 0.08 | 0.07
2491 | Wood Preserving | 0.01 | 0.02 | 0.02 | 0.03 | 0.00 | 0.00
2759 | Commercial Printing, Not Elsewhere Classified | 0.15 | 0.17 | 0.30 | 0.32 | 0.04 | 0.06
2999 | Products of Petroleum and Coal, Not Elsewhere Classified | 0.44 | 0.23 | 0.48 | 0.23 | 0.08 | 0.09
3011 | Tires and Inner Tubes | 0.46 | 0.12 | 0.58 | 0.14 | 0.11 | 0.10
3312 | Steel Works, Blast Furnaces (Including Coke Ovens), and Rolling Mills | 0.15 | 0.07 | 0.12 | 0.07 | 0.01 | 0.01
3321 | Gray and Ductile Iron Foundries | 0.27 | 0.05 | 0.49 | 0.10 | 0.08 | 0.07
3334 | Primary Production of Aluminum | 0.17 | 0.06 | 0.19 | 0.08 | 0.02 | 0.02
3355 | Aluminum Rolling and Drawing, Not Elsewhere Classified | 0.31 | 0.23 | 0.34 | 0.24 | 0.04 | 0.06
3366 | Copper Foundries | 0.26 | 0.11 | 0.39 | 0.17 | 0.06 | 0.06
3462 | Iron and Steel Forgings | 0.34 | 0.15 | 0.24 | 0.14 | 0.03 | 0.03
3479 | Coating, Engraving, and Allied Services, NEC | 0.15 | 0.06 | 0.19 | 0.10 | 0.02 | 0.02
3624 | Carbon and Graphite Products | 0.37 | 0.11 | 0.40 | 0.14 | 0.06 | 0.06

2.7 Figures

Figure 2-1: Quality control flow chart of PAH measurements used in the predictive probability models.
[Figure 2-1 flow chart. IMIS: 2,735 observations; restricted to coal tar pitch volatiles, personal measurements, and TWA or non-detect results; duplicates removed; 1,579 observations. Merged: 411 duplicate/overlapping measurements removed; 2,509 measurements.]

Figure 2-2: Cumulative distribution functions of detectable concentrations of Coal Tar Pitch Volatiles in All, CEHD, CEHD-only, IMIS, IMIS-only, and both-IMIS-CEHD data.

Figure 2-3: Bland-Altman plot comparing the difference in the predicted EF for a SIC and its corresponding NAICS code against the average between the predicted EFs of the SIC-NAICS combinations assessed in 1980 from the IMIS databank: Regrouped coding (Left graph) and original coding (Right graph) with marker size scaled for the number of companies in the code. Only codes that have one-to-one mapping between NAICS and SIC codes are plotted.
Figure 2-4: Bland-Altman plot comparing the difference in the predicted EF for a SIC and its corresponding NAICS code against the average between the predicted EFs of the SIC-NAICS combinations assessed in 1980 from the IMIS databank: Regrouped coding (Left graph) and original coding (Right graph) with marker size scaled for the number of companies in the code. Only codes that have multiple NAICS (SIC)-to-one SIC (NAICS) mapping are plotted.

Figure 2-5: Bland-Altman plot comparing the difference in the predicted EF for a SIC and its corresponding NAICS code against the average between the predicted EFs of the SIC-NAICS combinations assessed in 1980 from the IMIS databank: Regrouped coding (Left graph) and original coding (Right graph) with marker size scaled for the number of companies in the code. All NAICS to SIC mappings are plotted.

3 Association Between Occupational PAH Exposure and Breast Cancer

Polycyclic aromatic hydrocarbons (PAHs) are a large group of chemical compounds and a common class of environmental pollutants because they are by-products of the combustion of organic matter. Exposure to PAHs occurs through several sources including diet, contaminated drinking water, air pollution, and smoking.(4, 6, 7) Although environmental exposure to PAHs is ubiquitous, the intensity of exposure in workplaces is usually much higher than in the general environment.(14, 19) Experimental studies show that PAH carcinogenicity occurs through metabolic activation by xenobiotic enzymes to form DNA-binding metabolites, which include diol-epoxides and quinones.(89, 156, 206-209) These enzymes exist in all tissues(136, 209-211) and, given that mammary tissues bioaccumulate PAHs,(137) this creates an environment in the breast that allows the formation of adducts and other oxidative intermediates that can react with DNA.
Thus, it is biologically plausible that PAH exposure contributes to the development of breast cancer.(4, 14)

Several epidemiologic studies have assessed the effects of PAH exposure using PAH-DNA adducts, or some other surrogate for exposure. Those investigating lung, bladder, urinary tract system, and skin cancers have reported increased risks.(131, 212-219) The effects of occupational PAH exposure on the risk of female breast cancer remain under-studied, as the majority of sources of exposure studied were ambient air pollution, active and passive cigarette smoking, grilled and smoked foods, or other sources where the intensity of exposure is low compared to PAH-contaminated workplaces.(131, 212-214) Only three studies reported on the association of occupational exposure to PAHs and risk of breast cancer.(217-219) They reported increased breast cancer risk with PAH exposure using diverse approaches to exposure assessment. The objective of our research was to evaluate the association between breast cancer and employment in industries and occupations with PAH exposure.

3.1 Methods

A population-based case-control study was conducted in the greater metropolitan area of Vancouver, British Columbia (BC) and Kingston, Ontario, between 2005 and 2010. Results for shift work,(220) physical activity,(221) and genetic variants(222-224) were published previously. Ethics approval for this study was granted by the University of British Columbia / BC Cancer Agency Research Ethics Board and the Queen’s University Health Sciences Research Ethics Board.

3.1.1 Study Population

Greater Vancouver

We recruited breast cancer cases from the BC Cancer Registry. Eligible cases were women between 40–80 years of age, with a diagnosis of either in situ or invasive breast cancer and no previous cancer history, except for non-melanoma skin cancer, who were living in Vancouver, New Westminster, Richmond, or Burnaby at the time of diagnosis.
Controls were women recruited from the Screening Mammography Program of BC who consented to participate during routine screening mammography and were living in the same geographic areas. Controls (n = 1,014) were frequency-matched to cases (n = 1,001) within 5-year age groups; response rates were 57% and 54%, respectively.

Kingston

We recruited cases and controls from the Hotel Dieu Breast Assessment Program in Kingston, Ontario. Inclusion criteria included women under the age of 80 years with no previous cancer history, except non-melanoma skin cancer, and not currently receiving cancer preventative drugs. Cases were women with a subsequent diagnosis of either in situ or invasive breast cancer, and controls were women with either normal mammogram results or a diagnosis of benign breast disease, frequency-matched to cases by age in 5-year groups. Response was 59% and 49% among cases (n = 131) and controls (n = 164), respectively.

Because the minimum age for screening mammography in BC is 40 years, all participants under 40 years of age from the Ontario cohort were excluded, reducing the number of eligible participants to 129 cases and 155 controls. Following these exclusions, we included 1,130 cases and 1,169 controls from Vancouver and Kingston in the analysis.

3.1.2 Questionnaire

Participants completed a questionnaire that was either self-administered and mailed (n = 726 cases, n = 825 controls), or administered by telephone interview in English, Cantonese, Mandarin, or Punjabi (n = 404 cases, n = 344 controls). The questionnaire included inquiries about education, ethnicity, reproductive and contraceptive history, family history of cancer, lifestyle characteristics (e.g. alcohol consumption and smoking), and lifetime work history. Lifetime work history recorded information for all jobs held for at least 6 months, including industry, job title, length of employment, work hours (e.g. part-time or full-time), as well as tasks performed and materials handled.

3.1.3 Assessment of Menopausal Status

Using guidelines similar to the work by Friedenreich et al.,(225) women were considered postmenopausal if (1) menstruation had stopped for more than 1 year, (2) menstruation had stopped naturally and they were over 50 years of age (if time since last menstruation was unavailable), (3) they had a bilateral oophorectomy, or (4) they were over 55 years of age and menstruation stopped due to other reasons (e.g. chemotherapy).

3.1.4 Occupational Exposure Assessment

We recorded lifetime work history: start and end dates, industry, employer, job title, and tasks performed for any job held for at least six months. This information was used to infer exposure to PAHs using three job-exposure matrices (JEMs): two JEMs based on industrial hygienists’ judgements,(226, 227) and a third JEM based on modeling of compliance measurements of coal tar pitch volatiles, a common measure of total PAH exposure (as described in Chapter 2). Industries were classified into categories based on the Canadian version of the North American Industry Classification System (NAICS) 2007 and occupations were classified into categories based on the United States version of the Standard Occupation Classification (SOC) 2010. Industrial classification was done through manual review, whereas the occupational classification was initially done using an automated approach for clustering job descriptions(228) that assigned free-text job descriptions their corresponding SOC 2010 code, and then underwent a manual review to ensure accuracy.
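The four postmenopausal criteria in Section 3.1.3 can be sketched as a small rule-based classifier. This is an illustrative reading of the stated rules, not the study's own code: the `MenstrualHistory` record, its field names, and the handling of cases the text leaves open (for example, the order in which the rules are checked) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MenstrualHistory:
    """Hypothetical record of the questionnaire items needed by the rules."""
    age: float
    menstruation_stopped: bool
    years_since_last_period: Optional[float] = None  # None when unavailable
    stopped_naturally: bool = False
    bilateral_oophorectomy: bool = False


def is_postmenopausal(h: MenstrualHistory) -> bool:
    # Rule 3: bilateral oophorectomy
    if h.bilateral_oophorectomy:
        return True
    if not h.menstruation_stopped:
        return False
    # Rule 1: menstruation stopped for more than 1 year
    if h.years_since_last_period is not None and h.years_since_last_period > 1:
        return True
    # Rule 2: stopped naturally and over 50, when time since last period is unavailable
    if h.years_since_last_period is None and h.stopped_naturally and h.age > 50:
        return True
    # Rule 4: over 55 and menstruation stopped for other reasons (e.g. chemotherapy)
    if not h.stopped_naturally and h.age > 55:
        return True
    return False


if __name__ == "__main__":
    example = MenstrualHistory(age=52, menstruation_stopped=True, stopped_naturally=True)
    print(is_postmenopausal(example))
```

Encoding the rules this way makes the fallback order explicit: age is used only when the time since last menstruation is unavailable or when menstruation stopped for reasons other than natural menopause.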
DOM-JEM
The first JEM, referred to as the DOM-JEM, was based on industrial hygienists' judgement and designed to estimate levels of occupational PAH exposure on a semi-quantitative scale.(227, 229) The DOM-JEM assigned an ordinal level of exposure to PAH using a scale of none, low, and high (values: 0, 1, 4, respectively) to each occupation that was coded to the International Standard Classification of Occupations 1968 (ISCO68) system. To summarize the lifetime exposure, we calculated the cumulative exposure(200) defined for participant i as:

$$\mathrm{CE}_i = \sum_{k=1}^{K_i} D_{i,k}\, I_{i,k} \tag{3.1}$$

where $D_{i,k}$ is the duration and $I_{i,k}$ is the ordinal level of exposure from the DOM-JEM for participant i during job number k, and $K_i$ is the total number of jobs reported by participant i in this study. To apply the DOM-JEM, a set of crosswalks was employed to translate the SOC 2010 codes to their ISCO68 equivalents (see Appendix: Supplementary Material B1 for more details).

NCI-JEM
The second JEM, referred to as the NCI-JEM, was based on judgments of industrial hygienists and designed by the National Cancer Institute to estimate the job intensity and probability of PAH exposure.(217, 226) The NCI-JEM comprises an occupation-based matrix and an industry-based matrix that were both coded to the 1980 US Bureau of Census Industry and Occupation classification (OCC1980) scheme. Each industry and occupation code was assigned separate ordinal levels of intensity and probability of PAH exposure using a scale of none, low, medium, and high (values: 0, 1, 2, 3, respectively for each); that is, each industry code received an intensity level and probability score and each occupation code independently received an intensity level and probability score.
By design, the NCI-JEM also included a variable that indicated whether the calculation of exposure for a job is based solely on the occupation estimate (occupation group 1) or on a function of the industry and occupation estimates (occupation group 2). The industry and occupation estimates were then combined for each job using equation (3.2) to estimate the job's exposure intensity and equation (3.3) to estimate the job's exposure probability, each of which varies depending on the occupation group (1 or 2).(226) The intensity score for job number k for participant i was defined as:

[Equation (3.2) is not preserved in this text.]

and the probability score for job k for participant i was:

[Equation (3.3) is not preserved in this text.]

The final scores for intensity and probability were each divided into 4 levels: none, low, medium, and high (values: 0, 1–2, 3–4, 6–9, respectively). We calculated the cumulative exposure estimate for each participant by using a modified version of equation (3.1) that integrated the probability score defined by equation (3.3):

$$\mathrm{CE}_i = \sum_{k=1}^{K_i} D_{i,k}\, I_{i,k}\, \mathbb{1}\!\left(P_{i,k} > \text{low}\right) \tag{3.4}$$

where $I_{i,k}$ is the intensity score from equation (3.2) and $P_{i,k}$ is the probability score from equation (3.3) for job k of participant i. Equation (3.4) is a variant of equation (3.1) that excludes jobs with a low probability of exposure from contributing to the cumulative exposure in order to maximize specificity. Jobs classified as low probability were grouped with unexposed jobs, so the unexposed group is a mixture of truly unexposed jobs and jobs with a low probability of PAH exposure.
Although both the NCI-JEM and DOM-JEM use an ordinal intensity level for exposure, the scales for these JEMs were developed by independent teams of experts who likely used different exposure definitions; these two JEMs have not previously been compared. Similar to the DOM-JEM, to apply the NCI-JEM, a set of crosswalks was employed to translate the SOC 2010 codes to their OCC1980 equivalents (see Appendix: Supplementary Material B1 for more details).

PPM-JEM
The final JEM, referred to as the PPM-JEM, was based on work designed by coauthors (DL, JL, JJS, IB) and was derived from modeling compliance measurements from U.S. Occupational Safety & Health Administration workplace safety inspections collected between 1979 and 2010 (as described in Chapter 2). The PPM-JEM produced the probability of job-specific exposure (τ) exceeding the permissible exposure level (PEL = 0.2 mg·m⁻³) for PAH (θ = Pr(τ > PEL)). Jobs with "high" exposure were defined as those with a probability of exceeding the PEL of at least 9%; this corresponded to the 50th percentile of non-zero probabilities assigned to all occupations among controls. Jobs with zero probability of exceeding the PEL according to the PPM-JEM were treated as unexposed. Two intermediate exposure groups were also created and labeled as "low" (θ = 0.1 – 2.9%) and "medium" (θ = 3.0 – 8.9%), which corresponded to less than the 25th percentile and between the 25th and 50th percentiles, respectively. For each participant, we then calculated the duration of exposure at each of the four levels, so that each woman could have spent time in jobs at any of the four levels. The values of duration at each non-zero level of exposure were segregated into categories based on the tertiles of non-zero duration among the controls.
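The four PPM-JEM exposure levels and the per-level durations described above can be sketched as follows. This is an illustrative sketch only: the function names and the `(duration, theta)` job representation are assumptions, θ is expressed as a proportion rather than a percentage, and any non-zero θ below 3% is treated as "low" (the text does not say how values between 0 and 0.1% were handled).

```python
from typing import Dict, List, Tuple

LEVELS = ("none", "low", "medium", "high")


def ppm_level(theta: float) -> str:
    """Map theta = Pr(exposure > PEL), as a proportion, to a PPM-JEM level."""
    if theta <= 0:
        return "none"      # zero probability: treated as unexposed
    if theta < 0.03:
        return "low"       # 0.1 - 2.9%
    if theta < 0.09:
        return "medium"    # 3.0 - 8.9%
    return "high"          # at least 9%


def duration_by_level(jobs: List[Tuple[float, float]]) -> Dict[str, float]:
    """Sum full-time-equivalent years at each level for one participant.

    jobs: list of (duration_in_years, theta) pairs, one per job.
    """
    totals = {level: 0.0 for level in LEVELS}
    for duration, theta in jobs:
        totals[ppm_level(theta)] += duration
    return totals


def max_level(jobs: List[Tuple[float, float]]) -> str:
    """Highest level reached across all jobs, regardless of duration."""
    return max((ppm_level(theta) for _, theta in jobs),
               key=LEVELS.index, default="none")
```

For a hypothetical work history `[(5.0, 0.0), (3.0, 0.02), (2.0, 0.05), (10.0, 0.85)]`, this gives 5 years unexposed, 3 years at low, 2 years at medium, and 10 years at high, with a maximum level of "high".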
Furthermore, to attenuate any effects of the definitions of "low", "medium", and "high" risk of exceeding the PEL, we calculated the average probability of exposure weighted by duration: the sum over jobs of the product of the probability and duration for job k, divided by the total duration of all jobs for participant i, as described by equation (3.5):

$$\mathrm{AP}_i = \frac{\sum_{k=1}^{K_i} D_{i,k}\, PP_{i,k}}{\sum_{k=1}^{K_i} D_{i,k}} \tag{3.5}$$

where $D_{i,k}$ is the duration and $PP_{i,k}$ is the predictive probability from the PPM-JEM for participant i during job number k, and $K_i$ is the total number of jobs reported by participant i in this study. A third exposure assessment using the PPM-JEM is the duration weighted by probability, defined by equation (3.6),

$$\mathrm{WD}_i = \sum_{k=1}^{K_i} D_{i,k}\, PP_{i,k} \tag{3.6}$$

which is a variant of equation (3.5) that involves only the numerator of the quotient described above. Weighted duration is analogous to the cumulative exposure defined in equations (3.1) and (3.4) but uses a probability instead of an ordinal value for intensity level. We use the term weighted duration rather than cumulative exposure because weighted duration lacks intensity scores, so the two metrics are not comparable. As with the ordinal scores of the NCI-JEM and DOM-JEM, the probabilities used by the NCI-JEM and PPM-JEM are based on different scales and were developed by independent teams. Moreover, the PPM-JEM estimates the probability of exceeding the PEL for PAH exposure as opposed to the probability of exposure; these two JEMs have not previously been compared.

Modification of Exposures Assigned by JEMs
In some situations, described here, the JEM-based estimates were modified based on the participants' reported tasks and any materials they handled during a particular job.
Specifically, when a JEM assigned no exposure but the participant reported working with materials that are known sources of PAH, or performing tasks that are likely to involve PAH exposure, the level of exposure for that job was modified based on a priori criteria (see Appendix: Supplementary Material B2). These modifications were applied to all three JEMs used.

3.1.5 Statistical Analysis
Multivariable unconditional logistic regression was used to calculate adjusted odds ratios (ORs) and 95% confidence intervals (CIs) to examine the relationship between occupational PAH exposure and breast cancer risk. We calculated the number of years spent employed at each occupation as a full-time equivalent duration for each woman: exposure in occupations that were part-time, or that exceeded a standard 40-hour work week, was scaled by the reported number of hours per week to weight the duration appropriately when calculating lifetime exposure. Age (continuous), centre (Kingston vs. Vancouver), and education were included in all models, and all other variables were selected using an all-possible-models backwards selection procedure for confounder assessment.(230) We retained a set of confounders if they altered the OR for the highest level of exposure or longest duration from any of the JEMs by greater than 10%. The candidate variables were: ethnicity, body mass index (BMI), use of fertility drugs, oral contraceptives, non-steroidal anti-inflammatory drugs (NSAIDs), anti-depressants, and/or hormone replacement treatment; menopausal status, age of menarche, number of pregnancies and births, age at first birth, duration of breastfeeding (in years), age at first mammogram, first-degree family history of breast cancer, and smoking status and pack-years of cigarettes. We classified all exposure assessments (i.e. cumulative exposure, duration of exposure at a given 'level', etc.)
into the tertiles of non-zero scores of the controls for each JEM and assessed the reliability of exposure assessment by the three JEMs using Cohen's kappa statistics based on an exposed-unexposed exposure assessment for all unique jobs observed in the study. To examine the sensitivity of our exposure definition for the PPM-JEM, we compared the PPM-JEM's agreement with the expert-based JEMs in identifying exposed jobs while varying the definition of "exposed"; this was done by adjusting the threshold for exposed, which is based on the exceedance fraction, i.e. the probability of exceeding the PEL.

To test for trends with respect to the exposure variables, we treated levels of exposure as a continuous ordinal variable, with the comparison group (None) receiving a value of 0 and each ordinal category receiving a sequential numerical value, i.e. Low = 1, Medium = 2, and High = 3. We assessed interactions with menopausal status, smoking (pack-years), and ethnicity through stratified analysis and by assessing the cross-product interaction terms in their respective models. For the stratified ethnicity analyses, strata were restricted to European and Asian cohorts, where Asians were defined as Chinese, Japanese, or Korean. We also conducted additional analyses to explore interactions with socioeconomic status, body mass index, and first-degree family history of breast cancer; the latter was defined as having at least one immediate family member (e.g. mother, sister, or daughter) diagnosed with breast cancer. A case-only multivariable logistic regression was used to evaluate whether breast cancer risk associated with PAH exposure differed between hormone receptor positive (ER+/PR–, ER–/PR+, or ER+/PR+) and negative (ER–/PR–) subtypes. We assessed the impact of differences between sources of cases and controls in Vancouver by excluding cases (n = 227) who did not participate in the breast cancer screening program.
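The tertile classification and the ordinal scoring used for trend tests can be illustrated with a short sketch. It is not the study's R code: the function names are hypothetical, and the cut points rely on the default quantile method of Python's `statistics.quantiles`, which may differ slightly from the quantile definition used in the original analysis.

```python
import statistics
from typing import Sequence, Tuple


def tertile_cutpoints(control_scores: Sequence[float]) -> Tuple[float, float]:
    """Tertile cut points of the non-zero exposure scores among controls."""
    nonzero = [s for s in control_scores if s > 0]
    lower, upper = statistics.quantiles(nonzero, n=3)
    return lower, upper


def trend_score(score: float, cutpoints: Tuple[float, float]) -> int:
    """Ordinal value used for the trend test: None = 0, then tertiles 1 to 3."""
    if score <= 0:
        return 0            # unexposed comparison group
    lower, upper = cutpoints
    if score <= lower:
        return 1            # lowest tertile
    if score <= upper:
        return 2            # middle tertile
    return 3                # highest tertile
```

The cut points are derived from controls only and then applied to every participant, so cases and controls are categorized on the same scale; the resulting 0 to 3 value would enter the logistic model as a single continuous term.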
All analyses were conducted using the statistical software R (version 2.14.2, R Foundation for Statistical Computing, Vienna, Austria).

3.2 Results
Controls were more likely to be of European descent and less likely to be East Asian compared to cases (Table 3-1). Controls were also more likely to have a family income greater than $80,000 and a graduate or professional school degree, less likely to be overweight or obese, and more likely to have used NSAIDs, anti-depressants, or oral contraceptives. Cases were more likely to have ever been pregnant and, among parous women, they were older at their first pregnancy, had fewer additional pregnancies (i.e. only had one child), did not breastfeed as long, and were older when they had their first mammogram; a higher proportion of cases also had a first-degree family history of breast cancer.

Approximately one quarter of participants worked in an occupation that had some PAH exposure based on the DOM-JEM, while the NCI-JEM indicated a lower prevalence of exposure of about 20%. The PPM-JEM indicated that 64% were ever employed in an occupation with some chance of exposure to PAH above the PEL (θ > 0), with approximately 40% ever being employed in at least one occupation that was classified as at "high" risk of exposure to PAH above the PEL (θ ≥ 9%). Table 3-2 shows the results of the multivariable logistic regression models for each of the JEMs. For all exposure assessments based on the three JEMs, age, centre, education, ethnicity, and smoking were included in the models.

Analyses of exposure assessed by the two expert-based JEMs suggested a minor elevation in breast cancer risk, with both 95% CIs including the null value (DOM-JEM: OR = 1.11, 95% CI: 0.91–1.36; NCI-JEM: OR = 1.13, 95% CI: 0.91–1.40). Neither the pattern of effect estimates nor the tests for trend for cumulative exposure support the presence of an exposure-response relationship (p-trend > 0.2).
Analyses of any level of exposure assessed by the PPM-JEM suggested elevated breast cancer risk (OR = 1.32, 95% CI: 1.10–1.59). We also observed elevated risk for ever-never exposure based on a maximum level of "high" (OR = 1.43, 95% CI: 1.17–1.76), where the maximum level classification, regardless of duration, is the highest exposure level to which the participant was exposed across all occupations. Although we observed no association between breast cancer risk and ever-never exposure at a maximum level of "low" or "medium", tests for trend support the presence of an exposure-response relationship and a positive trend (p-trend < 0.01). We examined a dose-response relationship by assessing duration at any level, at medium and/or high levels, and at high levels of PAH exposure. To ensure that the referent group was truly unexposed when examining the latter two variants of the PPM-JEM metric, a nuisance variable defined as {1 = if maximum level of exposure was low, 0 = other} or {1 = if maximum level of exposure was low or medium, 0 = other}, respectively, was used to adjust for participants who experienced low and/or medium exposure. We observed evidence of significantly elevated risk for duration at any level, although the estimates did not increase monotonically with duration. Similar evidence of a positive association with duration was noted for those in the "medium or high" exposure level (longest duration: OR = 1.41, 95% CI: 1.10–1.81) and in the "high" exposure level (longest duration: OR = 1.45, 95% CI: 1.10–1.91); however, the estimates did not increase consistently with duration in either variant despite p-trend < 0.01.
Using the estimated probabilities from the PPM-JEM, analyses of the average probability of exceeding the PEL and the weighted duration of exposure, represented by equation (3.5) and equation (3.6), respectively, found evidence of an exposure-response relationship (p-trend < 0.01), with women in the highest tertiles exhibiting increased risk of breast cancer; similar to the analyses of duration, the estimates did not increase consistently across tertiles.

No measurable differences in PAH-mediated breast cancer risk between pre- and postmenopausal women were evident based on tests of interaction. Among postmenopausal women, results across the JEMs were similar to the original analyses with all women (Table 3-3). However, among women with prolonged exposure there were indications of differences in PAH-mediated breast cancer risk by menopausal status across all three JEMs, with some indication of positive trends, especially among premenopausal women. Analyses of exposure by the DOM-JEM and NCI-JEM suggested elevated breast cancer risk in the highest cumulative exposure tertile, but both 95% CIs included the null. We observed associations between breast cancer and long duration according to the PPM-JEM for medium/high and high exposure levels (medium/high: OR = 1.68, 95% CI: 1.12–2.52; high: OR = 1.74, 95% CI: 1.10–2.74; p-trend = 0.01). Similar results were also observed among premenopausal women for average probability and weighted duration, with the highest and longest tertiles also showing elevated risks and potential positive trends (average probability, highest tertile: OR = 1.50, 95% CI: 1.04–2.17, p-trend = 0.02; weighted duration, longest tertile: OR = 1.58, 95% CI: 1.08–2.29, p-trend < 0.01).

A total of 844 cases were classified as hormone receptor positive and 166 were classified as hormone receptor negative.
There were no measurable differences in PAH-mediated breast cancer risk by hormone receptor status (all p-values > 0.5); however, risks associated with PAHs were slightly lower for those with ER/PR– tumours (Appendix: Supplementary Table A9). Sensitivity analysis excluding non-screened cases yielded results similar to those from the full study population (Appendix: Supplementary Table A10), although risk estimates were somewhat lower; we observed no differences after stratifying by smoking status (Appendix: Supplementary Table A11). Analyses stratified by ethnicity gave results similar to the overall analysis for all JEMs, but with slightly elevated point estimates among Europeans across exposure levels and duration; however, no significant interactions were observed (Appendix: Supplementary Table A12). Similarly, when stratifying by SES and BMI, there were minor elevations in risk for those considered to be of low socioeconomic status (Appendix: Supplementary Table A13) and those with a BMI less than 25 (Appendix: Supplementary Table A14); however, no significant interactions were observed. Conversely, there were indications of effect modification in PAH-mediated breast cancer risk by family history, as those with at least one first-degree family member with breast cancer showed consistently higher risk estimates than those without (Appendix: Supplementary Table A15). In this group, we observed elevated risks in the longest and highest categories of the PPM-JEM variants for duration at high exposure (longest duration: OR = 2.79, 95% CI: 1.25–6.24), average probability (highest tertile: OR = 2.55, 95% CI: 1.34–4.84), and weighted duration (longest duration: OR = 2.26, 95% CI: 1.19–4.28), all of which showed positive trends (all p-trend < 0.01).

Comparing Exposure Assessments
The prevalence of exposed jobs and industries differed between the three exposure assessment methods. We recorded over 10,000 jobs, with a median of four jobs per participant throughout their lifetime.
The participants were most commonly employed in healthcare & social assistance (NAICS: 62; 19.3% of all observed occupations), educational service (NAICS: 61; 12.2%), retail trade (NAICS: 44–45; 10.3%), manufacturing (NAICS: 31–33; 7.6%), and accommodation & food service (NAICS: 72; 7.5%).

We observed 3,514 unique occupations based on an identifier that combined the industry (NAICS) and occupation (SOC) codes. The DOM-JEM classified 2.5% of these unique occupations as exposed, which resulted in approximately 25% of participants being classified as "ever exposed". The NCI-JEM had a prevalence of exposed jobs more than double that of the DOM-JEM, at 5.8%; however, it classified only 20.6% of participants as "ever exposed". The PPM-JEM classified 40.4% of the observed jobs in the study as having a non-zero probability of exposure, although only about half of these (20.4% of observed jobs) were considered to be at a high chance of PAH exposure above the PEL. Among the industries that were identified as being at risk of PAH exposure based on the PPM-JEM, the most commonly observed included food service (NAICS: 722), food manufacturing (NAICS: 311), and a variety of other manufacturing-related industries. For the DOM-JEM, similar industries were identified; however, for the NCI-JEM, although food manufacturing and other manufacturing-related industries were identified, food service was not considered at risk for PAH exposure. To determine the most commonly observed industries at risk of PAH exposure based on the classifications of each JEM, we retained only the unique occupations that had non-zero scores in the respective JEM and then truncated the industry code of each retained occupation to the 3-digit NAICS code.
For example, 72211-353031 is the combined code for Restaurant (Industry) and Waitress (Occupation), and 72211-352012 is the combined code for Restaurant (Industry) and Cook (Occupation); the PPM-JEM identifies both as exposed, and the truncated industry code for both is 722. Using the truncated codes for exposed industries, the marginal percentage (i.e. the proportion of jobs with a particular truncated industry code) within each JEM was calculated (Appendix: Supplementary Table A16).

Using the 3,514 unique combinations of NAICS and SOC codes, representing all occupations observed in the study, we evaluated the degree of inter-JEM agreement beyond chance(231) for Yes-No exposed for the DOM-JEM and NCI-JEM and Yes-No (High) exposed for the PPM-JEM. Overall, there was poor agreement, with the highest kappa statistic (κ) observed between the PPM-JEM and NCI-JEM (κPN0 = 0.13, 95% CI: 0.10–0.16) (Table 3-4). To explore the effect of the definition of high-exposed within the PPM-JEM, the definition of "exposed" was calibrated so that the prevalence of occupations considered exposed by the PPM-JEM matched the prevalence of exposed occupations classified by the DOM-JEM and NCI-JEM. This was accomplished by increasing the threshold for the probability of exceeding the PEL (%PEL) used to classify an occupation as being at risk for PAH exposure. Initially, the PPM-JEM defined any occupation with θ ≥ 0.09 as having a 'high' risk of PAH exposure. This definition was adjusted to θ ≥ 0.71 and θ ≥ 0. 5 to match the exposure prevalence of jobs based on the DOM-JEM and NCI-JEM, respectively. The change in threshold resulted in a large increase in the agreement between the PPM-JEM and DOM-JEM (κPD1 = 0.60, 95% CI: 0.51–0.68); the agreement between the PPM-JEM and NCI-JEM also increased, but not substantially, after matching prevalence (Table 3-5).
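The agreement analysis above combines two steps: computing Cohen's kappa for two yes/no classifications of the same jobs, and raising the PPM-JEM threshold until its prevalence of exposed jobs matches that of an expert-based JEM. The sketch below illustrates both steps on made-up data; the function names, the toy values, and the rule of picking the smallest observed θ whose prevalence does not exceed the target are assumptions, not the procedure documented in the thesis.

```python
from typing import Optional, Sequence


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two binary (0/1) ratings of the same items."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    # Chance agreement: both say exposed, or both say unexposed.
    expected = pa * pb + (1 - pa) * (1 - pb)
    return (observed - expected) / (1 - expected)


def calibrate_threshold(thetas: Sequence[float],
                        target_prevalence: float) -> Optional[float]:
    """Smallest observed theta t such that the share of jobs with
    theta >= t does not exceed the target prevalence."""
    n = len(thetas)
    for t in sorted(set(thetas)):
        if sum(th >= t for th in thetas) / n <= target_prevalence:
            return t
    return None


# Toy data (hypothetical): PPM-JEM probabilities and an expert yes/no rating.
thetas = [0.0, 0.01, 0.05, 0.10, 0.30, 0.50, 0.75, 0.80, 0.90, 0.95]
expert = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]

initial = [int(th >= 0.09) for th in thetas]          # original "high" cut-off
t = calibrate_threshold(thetas, sum(expert) / len(expert))
calibrated = [int(th >= t) for th in thetas]

kappa_initial = cohen_kappa(initial, expert)
kappa_calibrated = cohen_kappa(calibrated, expert)
```

In this toy example the calibrated threshold yields a higher kappa than the original cut-off, mirroring the direction of the change reported for the DOM-JEM comparison; the magnitudes carry no meaning beyond the illustration.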
Comparing the "high" definitions of all three JEMs, the agreement between the PPM-JEM and the expert-based JEMs did not improve, although agreement between the DOM-JEM and NCI-JEM on occupations at high risk (κDN0 = 0.27, 95% CI: −0.03 to 0.56) was substantially higher (Table 3-6). As the DOM-JEM and NCI-JEM are much more specific and the PPM-JEM is more sensitive, these results are not surprising.

3.3 Discussion
We observed evidence of an increased risk of breast cancer associated with exposure to PAHs among working women. Expert-based JEMs, which were more specific when classifying an occupation as exposed, yielded evidence of positive but weaker associations compared to the measurement-based PPM-JEM. There were no apparent associations at lower PAH-exposure levels (i.e. low risk of excess PAH exposure), no differences based on hormone receptor status, and no interaction with ethnicity, SES, or BMI.

Associations between PAH exposure and breast cancer risk were stronger in premenopausal women, and there was evidence of a dose-response relationship in premenopausal women based on the PPM-JEM; however, the data do not yet support measurable heterogeneity of effect. Petralia et al.(217) used the NCI-JEM(226) and found elevated risks among premenopausal women with medium-to-high (average) probability of exposure to PAH (OR = 2.40, 95% CI: 0.91–6.01), which was calculated in a manner similar to equation (3.5). Although we did not merge the medium-to-high probability categories, we observed elevated risks in the highest tertile among all women (OR = 1.31, 95% CI: 1.03–1.66; p-trend = 0.01) and among premenopausal women (OR = 1.50, 95% CI: 1.04–2.17; p-trend = 0.02). Petralia et al.
found no evidence of trend with cumulative exposure or with duration; however, they calculated cumulative exposure in a manner similar to equation (3.1), whereas we calculated cumulative exposure using the more specific equation (3.4) and the analogous weighted duration based on equation (3.6), and therefore their results are not directly comparable to our own. Nonetheless, their results support the notion of an association between PAH exposure and breast cancer risk.

Family history, especially among first-degree family members, is a known risk factor for breast cancer, and estimates have indicated a doubling of risk.(232) Associations between PAH exposure and breast cancer risk based on the PPM-JEM variants were much stronger in women with a first-degree family history, with risk in the longest or highest categories being more than doubled. Furthermore, given the role genetics play in the etiology of breast cancer, the heterogeneous effects support the notion of potential interactions between PAH exposure and some genetic susceptibility.

We hypothesized that there would be weaker effects of PAH exposure among ER/PR– cases because the influence of PAH exposure on estrogen levels would have no impact on the growth of ER/PR– tumours, but would influence ER/PR+ tumours. This hypothesis is based on the observation that PAH can trigger estrogenic(141) and antiestrogenic responses(142-144) through increased metabolism of estradiol, which could increase the formation of quinones.(145, 146) The effect of PAH exposure among ER/PR+ cases that we observed is consistent with this idea, but the data are equally likely to be due to chance. The manner in which prolonged exposure to PAHs modifies estradiol metabolism could also explain the differences in the effects observed among premenopausal women compared to postmenopausal women.
Socioeconomic status is a known risk factor for breast cancer,(233) in particular for those of high SES; however, the observed elevated risks in this study occurred within the lower SES strata. The hypothesized relationship between breast cancer risk and high SES is based on factors associated with high SES, including nulliparity, rather than on status itself. Similarly, the observed risks in the lower SES strata could be explained by differences in types of occupations, i.e. manual labour jobs being associated with lower SES. We also hypothesized that the effects of PAH exposure would be stronger among those with higher BMI due to the bioaccumulation of PAHs in mammary tissue(137) and PAHs being fat-soluble;(3, 14) however, as with SES, we found no data to support measurable heterogeneous effects.

Age, centre, education, smoking, and ethnicity were included in all models. We examined other differences in characteristics between cases and controls as potential confounders using the change-in-estimate criterion to identify a set of confounders that was adjusted for in all models. Varying the selection method (e.g. significance test vs. change-in-estimate), tolerance (e.g. α = 0.05 vs. α = 0.10), or scale of measurement used (e.g. continuous vs. categorical) did not influence which confounders were included in the regression models. Differences in the recruitment of cases in Vancouver from the population-based BC Cancer Registry and controls from the BC Screening Mammography Program may have introduced selection bias; however, sensitivity analysis showed similar results to the overall analyses, suggesting that any such bias is negligible.

Since participants reported lifetime work history retrospectively, there is potential for recall bias if case status influenced reported work circumstances.
However, during data collection the type of exposure that was of interest was never revealed and participants were only asked to provide information about the establishments of their employment. Therefore, given that PAH exposure is not widely recognized as a breast cancer risk factor, differential recall bias in the self-reported work histories is likely to be negligible. Additional questions about materials handled and tasks performed were also asked, which could also induce recall bias, but these questions were only used to supplement exposure assessment when the JEMs indicated no exposure. The number of participants affected varied by JEM but was similar in cases and controls. For the DOM-JEM, a total of 101 cases and 100 controls had the exposure level of at least one occupation changed from None to Low. The overall effects were negligible because none of the unexposed women were re-classified as exposed: although a particular job's exposure may have changed from None to Low, all women affected by the update from the supplemental questions already had at least one job that would have classified them as being exposed at some time in their work history. More shifts due to the supplemental questions occurred with the NCI-JEM, as 229 cases and 218 controls had their probabilities changed from none to high; however, changes to cumulative exposure were negligible. The effects of the supplemental exposure questions were minimal in the PPM-JEM, as only 19 cases and 20 controls had an occupation's exposure status shifted from None to Low, and any effects on the analyses involving the high PAH exposure level group were negligible. To update average probability and weighted duration with the supplemental exposure questions, the same occupations for the 19 cases and 20 controls were shifted from 0 to 2.9%, which represents the 25th percentile of non-zero probabilities among controls and is the upper limit of the "low" exposure group.
Similar to the previous JEMs, any effects on the analyses for average probability or weighted duration were negligible (data not shown).

Implementing three independently developed JEMs is among the strengths of the study. Although other studies have used "industry worked" to assess PAH exposure,(215, 216) the expert-based JEMs identified risks associated with specific occupations, which is a better tool for assessing exposure than industry alone.(200) We are concerned about the poor agreement (κ range: 0.05 – 0.13) among JEMs, as the κ values are proportional to the validity of the measures; however, the actual values of sensitivity and specificity cannot be calculated without some assertions about the prevalence of exposure.(234) This issue is evident in that the κ values increased between the PPM-JEM and DOM-JEM, as well as between the PPM-JEM and NCI-JEM, when the definition of an exposed job was modified. There is misclassification in exposure assignment due to differences in opinion in the development of the expert-based JEMs, and due to the probability associated with "exposed" by the PPM-JEM. For example, the food-service industry, which employed more than 20% of participants at some point during their respective careers, had varying risk of exposure depending on the choice of JEM. The DOM-JEM indicated that the majority of workers in the food-service industry had a low risk of exposure, while the NCI-JEM indicated the majority had no risk of exposure. Conversely, the PPM-JEM indicated that the majority of these types of workers were high-risk; the likelihood of exposure above the PEL was greater than 80%. The anchoring of the PPM-JEM in measurements and in the chance of exceeding workplace exposure limits is among the strengths of the innovative approach to exposure assessment that we adopted.

Differential misclassification is a potential limitation of using JEMs for classifying exposure status, as differential misclassification can either attenuate or accentuate the estimates.
This differential misclassification can arise from dichotomization in the presence of non-differential measurement error in the construct that is segregated into categories.(202) All JEMs suffer from assigning exposure at the group level, such that a group of persons classified as exposed will consist of a mixture of truly exposed and truly unexposed individuals.(235) This can produce complex biases, as was demonstrated in the case of one measurement-based JEM.(236) Although there are differences in exposure classification among the JEMs, all three metrics agree that there may be an association between occupational PAH exposure and breast cancer, or at least an association present within premenopausal women.

Tobacco smoke is a known source of PAH exposure, with some studies suggesting that long duration of smoking can result in an increased risk of breast cancer among women,(128, 129) and so it was not surprising that smoking was included in the regression models. We observed similar effects of PAH exposure among women who never smoked as in the overall analyses, and no evidence of interactions between PAH exposure and smoking across any of the exposure metrics (p-trend-interaction > 0.3); the median pack-years among smokers was only 7.5. However, epidemiologic data are inadequate to evaluate possible interactions between occupational PAHs and other sources (e.g. diet), and because exposure to PAHs is ubiquitous, the comparison of occupationally exposed populations and 'unexposed' reference populations is biased to some unknown extent.(215) Although we were unable to identify differences in effect by smoking status, there was a noticeable increase in risk among smokers with the longest weighted duration based on the PPM-JEM, implying that there is some evidence of additional risk from occupational exposures to PAH in our sample.
In summary, we assessed exposure to PAH through three independently developed JEMs and analyzed exposure levels in association with breast cancer risk on over 2,000 participants that provided in-depth, complete work history, and other information on confounders that may be associated with breast cancer. Results support the notion that prolonged occupational exposure to PAHs in jobs with a measurable chance of exceeding occupational exposure limits is associated with increased breast cancer risk, especially among premenopausal women. Furthermore, the appearance of an effect modification by family history supports the notion of a genetic factor playing a role in PAH-mediated breast cancer and gives evidence of potential gene-environment interactions.

3.4 Tables

Table 3-1: Descriptive Statistics of the Study Population

Variable | Cases (%) | Controls (%) |
Age
  Mean, Standard deviation (SD) | 56.82, SD=10.29 | 56.39, SD=9.90 |
Education
  High School or less | 389 (34.5) | 300 (25.7) |
  College/Trade certificate | 339 (30.0) | 347 (29.7) |
  University degree | 271 (24.0) | 299 (25.6) |
  Graduate or professional school degree | 130 (11.5) | 223 (19.1) |
Household income
  Less than $15,000 | 70 (6.2) | 32 (2.7) |
  $15,000 to $29,999 | 140 (12.4) | 89 (7.6) |
  $30,000 to $59,999 | 281 (24.9) | 268 (22.9) |
  $60,000 to $79,999 | 139 (12.3) | 158 (13.5) |
  $80,000 or more | 350 (31.0) | 463 (39.6) | ptrend < 0.01
  Not stated | 150 (13.3) | 159 (13.6) |
Ethnicity₸
  European | 703 (62.2) | 912 (78.0) |
  Chinese | 239 (21.2) | 115 (9.8) |
  South Asian | 32 (2.8) | 34 (2.9) |
  Filipino | 60 (5.3) | 38 (3.3) |
  Japanese | 24 (2.1) | 14 (1.2) |
  Other | 50 (4.4) | 42 (3.6) |
  Mixed | 22 (1.9) | 14 (1.2) | p < 0.01
BMI
  Mean, SD | 25.61, SD=5.27 | 25.15, SD=5.00 | p = 0.05
  Underweight (< 18.5) | 27 (2.4) | 27 (2.3) |
  Normal (18.5 - 25) | 585 (52.1) | 665 (57.3) |
  Overweight (25 - 30) | 336 (29.9) | 309 (26.6) |
  Obese (30+) | 174 (15.5) | 159 (13.7) | ptrend = 0.03
Reproductive History
Menopausal status
  Premenopausal | 434 (38.4) | 474 (40.5) |
  Postmenopausal | 695 (61.6) | 695 (59.5) | p = 0.30
Ever Pregnant
  Never | 191 (16.9) | 240 (20.5) |
  Ever | 937 (83.1) | 928 (79.5) | p = 0.03
Family History of Breast Cancer
  Never | 906 (80.2) | 1002 (85.7) |
  Ever | 224 (19.8) | 167 (14.3) | p < 0.01
Lifestyle
Age at first mammogram
  Years: Mean, SD | 44.69, SD=8.99 | 42.72, SD=7.70 | p < 0.01
Smoking
Current Smoker
  No | 1057 (93.7) | 1096 (93.8) |
  Yes | 71 (6.3) | 72 (6.2) | p = 0.90
Pack-years
  Years: Mean, SD | 5.63, SD=11.97 | 5.33, SD=11.33 | p = 0.72

Table 3-2: PAH exposure and breast cancer risk based on variations of the job exposure matricesⱡ

JEM / Metric | Cases (%) | Controls (%) | OR | 95% CI

DOM
Ever-Never: Any level
  Never | 812 (74.4) | 866 (75.9) | ----- |
  Ever | 279 (25.6) | 275 (24.1) | 1.11 | 0.91–1.36
Cumulative Exposure – Eq. (3.1)
  None | 812 (74.4) | 866 (75.9) | ----- |
  Low (0.1–1.8) | 79 (7.2) | 91 (8.0) | 1.05 | 0.76–1.46
  Medium (1.9–6.8) | 95 (8.7) | 91 (8.0) | 1.20 | 0.87–1.64
  High (6.9–90.0) | 105 (9.6) | 93 (8.2) | 1.09 | 0.80–1.48
  ptrend > 0.3

NCI ø
Ever-Never: Any level
  Never | 858 (78.6) | 914 (80.1) | ----- |
  Ever | 233 (21.4) | 227 (19.9) | 1.13 | 0.91–1.40
Cumulative Exposure – Eq. (3.4)
  None | 858 (78.6) | 914 (80.1) | ----- |
  Low (0.1–1.8) | 60 (5.5) | 77 (6.7) | 0.94 | 0.66–1.36
  Medium (1.9–7.0) | 90 (8.3) | 75 (6.6) | 1.38 | 0.99–1.92
  High (7.1–79.0) | 83 (7.6) | 75 (6.6) | 1.06 | 0.76–1.49
  ptrend > 0.2

PPM
Ever-Never: Any level
  Never | 342 (31.3) | 454 (39.8) | ----- |
  Ever | 749 (68.7) | 687 (60.2) | 1.32 | 1.10–1.59
Ever-Never: At maximum level†
  Never | 342 (31.3) | 454 (39.8) | ----- |
  Maximum level at low* | 90 (08.2) | 107 (09.4) | 1.02 | 0.74–1.42
  Maximum level at medium¶ | 175 (16.1) | 178 (15.6) | 1.26 | 0.97–1.64
  Maximum level at high | 484 (44.4) | 402 (35.2) | 1.43 | 1.17–1.76
  ptrend < 0.01
Duration (years) of exposure at any level
  None (0) | 342 (31.3) | 454 (39.8) | ----- |
  Short (0.1–4.2) | 235 (21.5) | 229 (20.1) | 1.42 | 1.12–1.80
  Moderate (4.3–13.0) | 256 (23.6) | 230 (20.1) | 1.34 | 1.06–1.71
  Long (13.1–82.2) | 258 (23.6) | 228 (20.0) | 1.20 | 0.94–1.53
  ptrend = 0.09
Duration (years) of exposure at medium¶ or high levels
  None (0) | 342 (31.3) | 454 (39.8) | ----- |
  Ever: Maximum at low level∆ | 90 (08.2) | 107 (09.4) | 1.02 | 0.74–1.41
  Short (0.1–2.7) | 203 (18.6) | 194 (17.0) | 1.41 | 1.10–1.81
  Moderate (2.8–9.0) | 203 (18.6) | 196 (17.2) | 1.32 | 1.02–1.70
  Long (9.1–80.8) | 253 (23.2) | 190 (16.7) | 1.41 | 1.10–1.81
  ptrend < 0.01
Duration (years) of exposure at high level
  None (0) | 342 (31.3) | 454 (39.8) | ----- |
  Ever: Highest at low* or medium¶ levels◊ | 265 (24.3) | 285 (25.1) | 1.17 | 0.93–1.47
  Short (0.1–2.3) | 156 (14.3) | 134 (11.7) | 1.58 | 1.19–2.09
  Moderate (2.4–7.4) | 136 (12.5) | 134 (11.7) | 1.27 | 0.95–1.70
  Long (7.5–74.1) | 192 (17.6) | 134 (11.7) | 1.45 | 1.10–1.91
  ptrend < 0.01
Average Probability – Eq. (3.5)
  None (0) | 342 (31.3) | 454 (39.7) | ----- |
  Low (0.01–0.02) | 218 (20.0) | 229 (20.1) | 1.25 | 0.98–1.59
  Medium (0.03–0.07) | 255 (23.4) | 229 (20.1) | 1.41 | 1.11–1.78
  High (0.08–0.88) | 276 (25.3) | 229 (20.1) | 1.31 | 1.03–1.66
  ptrend = 0.01
Weighted Duration (Years) – Eq. (3.6)
  None (0) | 342 (31.3) | 454 (39.7) | ----- |
  Short (0.1–0.4) | 234 (21.4) | 229 (20.1) | 1.32 | 1.04–1.67
  Moderate (0.5–1.7) | 233 (21.4) | 229 (20.1) | 1.27 | 0.99–1.61
  Long (1.8–55.1) | 282 (25.8) | 229 (20.1) | 1.38 | 1.09–1.75
  ptrend < 0.01

ⱡ Adjusted for age, centre, education, ethnicity, smoking (pack-years). All ptrend values are calculated by treating ordinal categories as continuous values (ptrend values are calculated by treating ordinal categories as continuous values for the exposed categories only)
ø Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher. Jobs that have a probability of low become grouped with the unexposed jobs using this approach.
† Maximum level classification, regardless of duration, is the maximum exposure level to which the participant was exposed across all occupations.
* Analysis for exposure at low level (estimated probability of exposure above 0.2 mg·m-3 of coal tar pitch volatiles) is θ = (0.1 – 2.9%) in at least one job
¶ Analysis for exposure at medium level (estimated probability of exposure above 0.2 mg·m-3 of coal tar pitch volatiles) is θ = (3.0 – 8.9%) in at least one job
Analysis for exposure at high level (estimated probability of exposure above 0.2 mg·m-3 of coal tar pitch volatiles) is θ ≥ 9% in at least one job
∆ To ensure referent group are truly unexposed, a nuisance variable was created for the low-exposed group where value = 1, if highest duration at low level exposure, else 0
◊ To ensure referent group are truly unexposed, a nuisance variable was created for the low/medium-exposed group where value = 1, if highest duration at low or medium level exposure, else 0

Table 3-3: Polycyclic aromatic hydrocarbon exposure and breast cancer risk stratified by menopausal statusⱡ

Columns: JEM / Metric | Postmenopausal: Cases (%) | Controls (%) | OR | 95% CI | Premenopausal: Cases (%) | Controls (%) | OR | 95% CI
Trend rows give ptrend for postmenopausal women, premenopausal women, and the interaction, in that order.

DOM
Cumulative Exposure – Eq. (3.1)
  None | 524 (78.7) | 536 (79.5) | ----- | | 287 (67.7) | 330 (70.7) | ----- |
  Low (0.1–1.8) | 35 (5.3) | 40 (5.9) | 1.04 | 0.64–1.69 | 44 (10.4) | 51 (10.9) | 1.08 | 0.69–1.69
  Medium (1.9–6.8) | 49 (7.3) | 38 (5.7) | 1.35 | 0.86–2.13 | 46 (10.8) | 53 (11.3) | 1.07 | 0.69–1.67
  High (6.9–90.0) | 58 (8.7) | 60 (8.9) | 0.85 | 0.57–1.27 | 47 (11.1) | 33 (7.1) | 1.60 | 0.98–2.62
  Trend | Postmenopausal: ptrend > 0.9 | Premenopausal: ptrend > 0.1 | Interaction: ptrend > 0.2

NCIø
Cumulative Exposure – Eq. (3.4)
  None | 551 (82.7) | 562 (83.4) | ----- | | 306 (72.2) | 352 (75.4) | ----- |
  Low (0.1–1.8) | 23 (3.4) | 34 (5.0) | 0.79 | 0.45–1.39 | 37 (8.7) | 43 (9.2) | 1.08 | 0.66–1.74
  Medium (1.9–7.0) | 47 (7.1) | 32 (4.8) | 1.63 | 1.01–2.64 | 43 (10.1) | 43 (9.2) | 1.20 | 0.75–1.91
  High (7.1–79.0) | 45 (6.8) | 46 (6.8) | 0.84 | 0.54–1.32 | 38 (9.0) | 29 (6.2) | 1.48 | 0.88–2.51
  Trend | Postmenopausal: ptrend > 0.8 | Premenopausal: ptrend > 0.1 | Interaction: ptrend > 0.3

PPM
Ever-Never: Any level
  Never | 221 (33.2) | 290 (43.0) | ----- | | 120 (28.3) | 164 (35.1) | ----- |
  Ever | 445 (66.8) | 384 (57.0) | 1.30 | 1.03–1.65 | 304 (71.7) | 303 (64.9) | 1.32 | 0.98–1.78
  Interaction: p > 0.9
Ever-Never: At maximum level†
  Never | 221 (33.2) | 290 (43.0) | ----- | | 120 (28.3) | 164 (35.1) | ----- |
  Maximum level at low* | 71 (10.7) | 67 (09.9) | 1.20 | 0.81–1.78 | 19 (04.5) | 40 (08.6) | 0.67 | 0.36–1.23
  Maximum level at medium¶ | 112 (16.8) | 101 (15.0) | 1.32 | 0.94–1.84 | 63 (14.9) | 77 (16.5) | 1.14 | 0.75–1.74
  Maximum level at high | 262 (39.3) | 216 (32.1) | 1.33 | 1.02–1.74 | 222 (52.3) | 186 (39.8) | 1.53 | 1.11–2.11
  Trend | Postmenopausal: ptrend = 0.03 | Premenopausal: ptrend < 0.01 | Interaction: ptrend > 0.4
Duration (years) of exposure at any level
  None (0) | 221 (33.2) | 290 (43.0) | ----- | | 120 (28.3) | 164 (35.1) | ----- |
  Short (0.1–4.2) | 128 (19.2) | 114 (16.9) | 1.47 | 1.07–2.02 | 107 (25.2) | 115 (24.6) | 1.34 | 0.93–1.93
  Moderate (4.3–13.0) | 154 (23.1) | 134 (19.9) | 1.30 | 0.96–1.77 | 102 (24.1) | 96 (20.6) | 1.39 | 0.95–2.03
  Long (13.1–82.2) | 163 (24.5) | 136 (20.2) | 1.17 | 0.85–1.59 | 95 (22.4) | 92 (19.7) | 1.23 | 0.83–1.82
  Trend | Postmenopausal: ptrend > 0.2 | Premenopausal: ptrend > 0.2 | Interaction: ptrend > 0.8
Duration (years) of exposure at medium¶ or high levels
  None (0) | 221 (33.2) | 290 (43.0) | ----- | | 120 (28.3) | 164 (35.1) | ----- |
  Ever: Maximum at low level∆ | 71 (10.7) | 67 (10.0) | 1.20 | 0.81–1.77 | 19 (04.5) | 40 (08.6) | 0.67 | 0.36–1.24
  Short (0.1–2.7) | 106 (15.9) | 95 (14.1) | 1.46 | 1.04–2.05 | 97 (22.9) | 99 (21.2) | 1.32 | 0.91–1.93
  Moderate (2.8–9.0) | 114 (17.1) | 102 (15.1) | 1.30 | 0.93–1.83 | 89 (21.0) | 94 (20.1) | 1.32 | 0.89–1.95
  Long (9.1–80.8) | 154 (23.1) | 120 (17.8) | 1.24 | 0.90–1.71 | 99 (23.3) | 70 (15.0) | 1.68 | 1.12–2.52
  Trend | Postmenopausal: ptrend > 0.1 | Premenopausal: ptrend = 0.01 | Interaction: ptrend > 0.2
Duration (years) of exposure at high level
  None (0) | 221 (33.2) | 290 (43.0) | ----- | | 120 (28.3) | 164 (35.1) | ----- |
  Ever: Maximum at low* or medium¶ levels◊ | 183 (27.5) | 168 (24.9) | 1.27 | 0.95–1.69 | 82 (19.3) | 117 (25.1) | 0.98 | 0.67–1.43
  Short (0.1–2.3) | 75 (11.3) | 60 (08.9) | 1.65 | 1.11–2.45 | 81 (19.1) | 74 (15.8) | 1.49 | 0.99–2.24
  Moderate (2.4–7.4) | 69 (10.4) | 70 (10.4) | 1.11 | 0.75–1.60 | 67 (15.8) | 64 (13.7) | 1.42 | 0.92–2.18
  Long (7.5–74.1) | 118 (17.7) | 86 (12.8) | 1.29 | 0.90–1.84 | 74 (17.5) | 48 (10.3) | 1.74 | 1.10–2.74
  Trend | Postmenopausal: ptrend > 0.1 | Premenopausal: ptrend = 0.01 | Interaction: ptrend > 0.1
Average Probability – Eq. (3.5)
  None (0) | 221 (33.2) | 290 (43.0) | ----- | | 120 (28.3) | 164 (35.1) | ----- |
  Low (0.01–0.02) | 149 (22.4) | 142 (21.1) | 1.30 | 0.96–1.76 | 69 (16.3) | 87 (18.6) | 1.12 | 0.74–1.68
  Medium (0.03–0.07) | 146 (21.9) | 116 (17.2) | 1.46 | 1.07–2.00 | 109 (25.7) | 113 (24.2) | 1.31 | 0.91–1.90
  High (0.08–0.88) | 150 (22.5) | 126 (18.7) | 1.14 | 0.82–1.57 | 126 (29.7) | 103 (22.1) | 1.50 | 1.04–2.17
  Trend | Postmenopausal: ptrend > 0.1 | Premenopausal: ptrend = 0.02 | Interaction: ptrend > 0.5
Weighted Duration (Years) – Eq. (3.6)
  None (0) | 221 (33.2) | 290 (43.0) | ----- | | 120 (28.3) | 164 (35.1) | ----- |
  Short (0.1–0.4) | 155 (23.3) | 127 (18.8) | 1.49 | 1.10–2.02 | 79 (18.6) | 102 (21.8) | 1.06 | 0.72–1.56
  Moderate (0.5–1.7) | 127 (19.1) | 124 (18.4) | 1.17 | 0.85–1.61 | 106 (25.0) | 105 (22.5) | 1.35 | 0.93–1.95
  Long (1.8–55.1) | 163 (24.5) | 133 (19.7) | 1.24 | 0.91–1.69 | 119 (28.1) | 95 (20.6) | 1.58 | 1.08–2.29
  Trend | Postmenopausal: ptrend > 0.2 | Premenopausal: ptrend < 0.01 | Interaction: ptrend > 0.3

ⱡ Adjusted for age, centre, education, ethnicity, smoking (pack-years).
All ptrend values are calculated by treating ordinal categories as continuous values (ptrend values are calculated by treating ordinal categories as continuous values for the exposed categories only)
ø Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher. Jobs that have a probability of low become grouped with the unexposed jobs using this approach.
† Maximum level classification, regardless of duration, is the maximum exposure level to which the participant was exposed across all occupations.
* Analysis for exposure at low level (estimated probability of exposure above 0.2 mg·m-3 of coal tar pitch volatiles) is θ = (0.1 – 2.9%) in at least one job
¶ Analysis for exposure at medium level (estimated probability of exposure above 0.2 mg·m-3 of coal tar pitch volatiles) is θ = (3.0 – 8.9%) in at least one job
Analysis for exposure at high level (estimated probability of exposure above 0.2 mg·m-3 of coal tar pitch volatiles) is θ ≥ 9% in at least one job
∆ To ensure referent group are truly unexposed, a nuisance variable was created for the low-exposed group where value = 1, if highest duration at low level exposure, else 0
◊ To ensure referent group are truly unexposed, a nuisance variable was created for the low/medium-exposed group where value = 1, if highest duration at low or medium level exposure, else 0

Table 3-4: Agreement of job exposure to polycyclic aromatic hydrocarbon within 3,514 jobs identified by NAICS and SOC codes between job exposure matrices – Cohen’s Kappa (κ) inter-rater reliability (95% Confidence Interval)

 | DOM-JEM Exposed◊ | DOM-JEM Unexposed | NCI-JEM Exposed∆ | NCI-JEM Unexposed
PPM-JEM Exposed | 70 | 649 | 105 | 614
PPM-JEM Unexposed | 19 | 2768 | 97 | 2690
κ | κPD0 = 0.13 (0.10–0.16) | | κPN0 = 0.12 (0.09–0.15) |
DOM-JEM Exposed◊ | | | 13 | 76
DOM-JEM Unexposed | | | 189 | 3229
κ | | | κDN0 = 0.05 (0.01–0.09) |

Exposed for the PPM-JEM is defined as having an estimated probability of exposure above 0.2 mg·m-3 of coal tar pitch volatiles θ ≥ 9%.
◊ Exposed for the DOM-JEM is defined as non-zero intensity score
∆ Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher.

Table 3-5: Agreement of job exposure to polycyclic aromatic hydrocarbon within 3,514 jobs identified by NAICS and SOC codes between the PPM-JEM with a modified exposure definition as a function of the percentage of the permissible exposure limit to match the prevalence of the exposed jobs based on the DOM and NCI-JEM – Cohen’s Kappa (κ) inter-rater reliability (95% Confidence Interval)

PPM-JEM | DOM-JEM Exposed | DOM-JEM Unexposed | NCI-JEM Exposed | NCI-JEM Unexposed
Definition (θ > 0.71)◊
  Exposed | 56 | 40 | 5 | 91
  Unexposed | 33 | 3377 | 197 | 3213
  κ | κPD1 = 0.60 (0.51–0.68) | | κPN1 = -0.004 (-0.03–0.03) |
Definition (θ > 0.35)∆
  Exposed | 62 | 187 | 32 | 217
  Unexposed | 27 | 3230 | 170 | 3087
  κ | κPD2 = 0.34 (0.28–0.41) | | κPN2 = 0.08 (0.04–0.13) |

◊ Threshold for definition of exposed adjusted to an estimated probability of exposure above PEL of 0.2 mg·m-3 for coal tar pitch volatiles (θ > 0.71 = Exposed, Else Unexposed) to match exposure prevalence of jobs based on DOM-JEM.
∆ Threshold for definition of exposed adjusted to an estimated probability of exposure above PEL of 0.2 mg·m-3 for coal tar pitch volatiles (θ > 0.35 = Exposed, Else Unexposed) to match exposure prevalence of jobs based on NCI-JEM.
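For readers who want to see how agreement statistics of this kind are obtained from a two-by-two table, the following is a minimal sketch of unweighted Cohen's kappa. The function name and cell labels are illustrative assumptions, and the thesis's exact computation (including its confidence intervals) is not reproduced here, so recomputed values may differ slightly from those reported.

```python
def cohen_kappa(a, b, c, d):
    """Unweighted Cohen's kappa for a 2x2 agreement table.

    a: both exposed, b: first exposed / second unexposed,
    c: first unexposed / second exposed, d: both unexposed.
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement from the row and column margins.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)


# PPM-JEM versus DOM-JEM cells as listed in Table 3-4.
kappa_ppm_dom = cohen_kappa(70, 649, 19, 2768)
print(f"kappa = {kappa_ppm_dom:.2f}")
```

For these four cells the sketch gives a value of about 0.13, in line with the reported κPD0.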
Table 3-6: Agreement of “high” classification job exposure to polycyclic aromatic hydrocarbon within 3,514 jobs identified by NAICS and SOC codes between job exposure matrices – Cohen’s Kappa (κ) inter-rater reliability (95% Confidence Interval)

 | DOM-JEM Exposed◊ | DOM-JEM Unexposed | NCI-JEM Exposed∆ | NCI-JEM Unexposed
PPM-JEM Exposed | 2 | 717 | 7 | 712
PPM-JEM Unexposed | 1 | 2786 | 5 | 2782
κ | κPD0 = 0.004 (-0.002–0.01) | | κPN0 = 0.01 (0.001–0.02) |
DOM-JEM Exposed◊ | | | 2 | 10
DOM-JEM Unexposed | | | 1 | 3494
κ | | | κDN0 = 0.27 (-0.03–0.56) |

Exposed for the PPM-JEM is defined as having an estimated probability of exposure above 0.2 mg·m-3 of coal tar pitch volatiles of θ ≥ 9%.
◊ Exposed for the DOM-JEM is defined as an intensity score of 4
∆ Exposed for the NCI-JEM is defined as an intensity score of 3 conditional on the probability score being medium or higher.

4 Metabolizing-Genes and Breast Carcinogenesis

Understanding the effects that exogenous exposures, including PAH exposure, have on breast cancer risk may improve our understanding of the etiology of breast cancer. Evidence from several epidemiological studies has suggested that PAH exposure is a risk factor for several cancer sites, including breast cancer,(217-219) and the International Agency for Research on Cancer has classified multiple PAH variants as carcinogenic to humans.(4, 14) Our previous research indicated an association between breast cancer and PAH exposure, which was assessed as the duration of work in occupations with “high” PAH exposure (as observed in Section 3.2).

PAH carcinogenicity occurs through the metabolic activation of PAHs by Cytochrome P450 (CYP450), which consists of a superfamily of hemoproteins that coordinates the metabolism of numerous endogenous and exogenous chemicals.
CYP450 enzymes are present in most tissues of the body and function to metabolize potentially toxic compounds,(210, 211) but also metabolize PAHs to form “ultimate carcinogenic” metabolites, including diol-epoxides, radical cations, and quinones,(89, 156, 206-209) that bind to DNA. PAH metabolism can also trigger estrogenic and antiestrogenic responses(142-144) through the increase of estradiol metabolism, which in turn increases the formation of quinones.(145, 146)

Several studies have investigated the association of CYP and other metabolism-related genes with breast cancer risk, many of which are also involved in PAH metabolism.(71, 136, 237, 238) CYP1B1, which is involved in drug metabolism, is an important activator of PAH in mammary glands,(136) and certain genotypes have been linked to increased breast cancer risk.(239) However, despite the involvement of these various genes in PAH metabolism, there is very little research that explores modification of the effect of metabolism-related genes by environmental exposures, although interactions have been studied with regard to smoking(240, 241) and dietary sources of PAHs.(239, 242) Several studies have shown evidence of interactions between PAH-DNA adducts and metabolism-related genes.(243-247) However, PAH-DNA adducts are only modestly associated with increased breast cancer risk,(213) and adduct levels cannot distinguish the source of PAH exposure.

In our previous research, interactions between PAH exposure and family history of breast cancer (see Section 3.2 and Supplementary Table A15) have led us to investigate potential interactions between PAH exposure and genetic susceptibility. The objective of our study was to determine if there are associations between genetic variants in metabolism-related genes and breast cancer risk, and to determine if interactions between these genetic variants and occupational PAH exposure modify breast cancer risk.
4.1 Methods

The study design and population, a population-based case-control study conducted in Vancouver, British Columbia (BC) and Kingston, Ontario (ON) between 2005 and 2010, are described in Section 3.1.1. Following exclusions according to eligibility criteria, 1,130 cases and 1,069 controls were included. Participants completed a questionnaire, which is described in Section 3.1.2, and provided either a blood or saliva sample for genotyping; DNA was extracted from blood (n = 1980) or saliva (n = 204). Ethics approval was granted by the University of British Columbia / BC Cancer Agency Research Ethics Board and the Queen’s University Health Sciences Research Ethics Board. PAH exposure was assessed for each individual using the job-exposure matrix described in Section 3.1.4, developed by myself and Drs. Lavoué, Spinelli, and Burstyn (PPM-JEM), and the total number of years employed in each job.

4.1.1 Gene and Variant Selection

Twenty-seven genes related to endogenous or xenobiotic metabolism were identified from the literature. Six of the genes are members of the CYP450 superfamily (CYP1A1, CYP1A2, CYP1B1, CYP2C19, CYP2E1, CYP19A1). The remaining genes are grouped by function: modulation of the PAH metabolism response (AHR, AHRR, ARNT, AIP), formation (or activation) of carcinogenic intermediates during metabolism (AKR1A1, AKR1C1-AKR1C4, DHDH, EPHX1, PTGS2, NAT1, NAT2), and detoxification of metabolites (COMT, NQO1, GSTP1, NRF2, PON1) into their final inactive, excretable forms. Additional genes selected are related to estradiol metabolism (ESR1, ESR2),(248) which is influenced by PAH metabolism.(141-144) A set of HapMap tag SNPs, each of which represents a group of SNPs in high linkage disequilibrium, was selected for each gene from the CEU (European) population in HapMap release 28 using Tagger(249) and the program Haploview,(250) with a minimum minor allele frequency (MAF) of 0.10 and an r2 threshold of 0.8.
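To make the r2 threshold concrete, here is a small sketch of the standard pairwise r2 calculation from haplotype and allele frequencies. The frequencies and the function name are hypothetical and used only for illustration; the actual tag SNP selection was carried out with Tagger and Haploview, not with this code.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Pairwise LD r^2 from the frequency of haplotype AB (p_ab)
    and the frequencies of allele A (p_a) and allele B (p_b)."""
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical SNP pair: both minor alleles at 0.30, co-occurring at 0.28.
r2 = ld_r_squared(p_ab=0.28, p_a=0.30, p_b=0.30)
meets_threshold = r2 >= 0.8  # threshold quoted in the text
print(f"r2 = {r2:.3f}, meets threshold: {meets_threshold}")
```

With these assumed frequencies r2 is roughly 0.82, so one of the two SNPs could stand in for the other under the 0.8 threshold.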
A SNP in high linkage disequilibrium (LD) with another SNP means the two SNPs are strongly associated with one another and result in a nonrandom association. That is to say, if SNP A and B are in high LD, then if SNP A is observed there is a high likelihood that SNP B will also be observed. A total of 158 SNPs and tag SNPs associated with the xenobiotic metabolism genes described above were submitted for genotyping.

4.1.2 Genotyping

SNPs included in the analysis were initially part of a larger Illumina GoldenGate genotyping assay (768 SNPs) that included SNPs related to other potential pathways for breast cancer. SNPs that failed the initial assay design were those for which successful assaying of the loci was unlikely (i.e. the assay would be undesignable), which can be due to multiple reasons, including location, tri/quad-allelic variation (i.e. more than two nucleotide variants), or insertion/deletion polymorphisms; these SNPs were replaced with equivalent tag SNPs. Genotyping was performed by the McGill University and Genome Quebec Innovation Centre in Montreal, PQ, Canada.

4.1.3 Quality Control Procedures

Genotype quality control procedures for the 768 SNP set were performed in Genome Studio v2011.1 (Illumina, San Diego, CA, USA), PLINK v1.07,(251) GRR,(252) and Excel 2007 (Microsoft, Redmond, WA, USA). Figure 4-1 and Figure 4-2 summarize the reasons for exclusion of SNPs and samples, respectively.

SNP

SNP exclusion was based on recommendations by Illumina (Illumina User Guide, Illumina, Part #11319113): GenCall Scores < 0.25, GenTrain scores < 0.40, poor clustering, mono-allelic, genotype discrepancies in replicate samples (n = 124), call rate < 95%, unexpected low MAF in European controls compared to HapMap CEU data, or out of Hardy-Weinberg equilibrium (p < 0.001) in European-ancestry controls. Call rate was examined separately for saliva samples.
If a SNP had a call rate < 95% for saliva samples, but > 95% for blood samples, participants that provided samples through saliva only were excluded, i.e. the sample size for that SNP was reduced in comparison to the other SNPs. Genotyping of metabolism-related genes included 158 SNPs, of which 20 were excluded: 6 by the genotyping centre, 8 due to poor clustering, 2 that were monoallelic within the study population, 3 with a low call rate, and 1 with a lower MAF compared to the HapMap CEU population. A total of 138 SNPs remained for analysis.

Samples

Samples were excluded if heterozygosity was greater than three standard deviations compared to other samples within the same ethnicity, call rate was < 0.95, genotypes at Y chromosome markers indicated the sex was male, unrelated samples had identical genotypes, or there were discrepancies between genotype-estimated and self-reported ethnicity. Comparisons between self-reported and genotype-estimated ethnicities were done by calculating identity by state and multidimensional scaling plots(251) with HapMap samples in the CEU, CHB, CHD, JPT, TRI, and TSI populations.(253) Associations between case status and ethnicity were detected through calculation of the genomic inflation factor (λ = 16.99) using ancestral informative markers (AIMs) after SNP quality control.(254-256) Reanalysis using European samples only resulted in no discernible inflation (λ = 1.0), indicating there is no population structure or genotyping error.(257) Nine samples were excluded due to familial relationship, which were verified using questionnaire data. If both members of a pair were cases, the individual with the earlier diagnosis was included (n = 4); if both were controls, the older of the pair was included (n = 3); and if one of the pair was a case and the other a control, the case was included (n = 2).
Controls from the BC arm of the study were recruited from the Screening Mammography program of BC, which had a minimum age requirement of 40 years, and therefore all samples from participants under the age of 40 were excluded. Following sample quality control and inclusion criteria, a total of 2,083 samples (1,037 cases and 1,046 controls) were available for analysis.

4.1.4 Statistical Analysis

Multivariable logistic regression was used to estimate odds ratios (OR) and 95% confidence intervals (CI) for associations between breast cancer and the SNPs; all regression models were adjusted for age and study centre. We investigated the associations between metabolism-related SNPs and breast cancer using a SNP-specific inheritance model (i.e. one of three inheritance models: additive, dominant, or recessive) based on the two-step approach described below. To control for confounding due to population stratification, all analyses were restricted to women of European ethnicity.(258, 259) Results for women of Asian descent, defined as Chinese, Japanese, or Korean ancestry, are in supplementary tables (see Appendix); other ethnic groups were excluded due to small sample sizes. Differences in risk by menopausal status were examined through stratified analysis and the addition of an interaction product-term in the multivariable logistic regression.

Multiple testing was corrected through a two-step gene-based process similar to the work done by Schuetz et al.(260) with modifications for the different inheritance models. Each metabolism-related SNP went through a set-based permutation (10,000 permutations) where, for each permutation, case status was randomly assigned. For each inheritance model, a p-value was calculated for the permutation; thus, three inheritance-specific p-values are obtained for each SNP. An adjusted p-value for each SNP was calculated using the number of times a more extreme (i.e.
smaller) p-value, compared to the original inheritance-specific p-value, was observed during the 10,000 permutations. Within each gene, the minimum adjusted inheritance-specific p-value was used to select the inheritance model and the gene-representative SNP. The Benjamini-Hochberg procedure(261) was applied to control the false-discovery rate (FDR) to obtain a corrected p-value for the representative SNP (n = 27).

SNPs that displayed any evidence of a potential association with breast cancer (permutation adjusted p-value < 0.1) were examined for gene-environment interactions (GxE) with PAH exposure assessed in three ways: (1) duration at “high” PAH exposure, (2) average probability of exposure, and (3) weighted duration of exposure (as described in Chapter 3). GxE analyses were done by including a genotype-exposure interaction product-term in the regression models. For some of the genotype-exposure strata, insufficient sample sizes required the PAH exposure to be dichotomized into an ever-never categorization for duration at “high” PAH exposure and average probability of exposure; weighted duration is a variant of average probability and will produce the same ever-never categorization. All interaction term p-values were corrected for the FDR.(261) In the situation where the homozygous minor allele genotype group had insufficient sample size to test for interactions using the recessive inheritance model (minimum requirement: n = 50), the additive model was used to ensure stable estimates. Education and smoking were identified as potential confounders of the PAH association in Section 3.2; therefore, all GxE analyses were adjusted for education and smoking (pack-years), in addition to age and centre, and a nuisance variable {1 = if maximum level of exposure was low or medium, 0 = other} was added to the models to ensure that the referent group is truly unexposed.
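The permutation step described earlier in this section can be sketched roughly as follows. This is a simplified illustration, not the analysis code used in the thesis: it uses the absolute difference in mean genotype score between cases and controls as a stand-in for the logistic regression p-value, compares the statistic directly rather than comparing p-values, uses 2,000 rather than 10,000 permutations, and runs on made-up genotype data. All function and variable names are hypothetical.

```python
import random


def encode(genotypes, model):
    """Recode minor-allele counts (0, 1, 2) under an inheritance model."""
    if model == "additive":
        return list(genotypes)
    if model == "dominant":
        return [1 if g > 0 else 0 for g in genotypes]
    if model == "recessive":
        return [1 if g == 2 else 0 for g in genotypes]
    raise ValueError(f"unknown model: {model}")


def mean_difference(scores, is_case):
    """Stand-in association statistic: |mean score in cases - mean in controls|."""
    cases = [s for s, c in zip(scores, is_case) if c]
    controls = [s for s, c in zip(scores, is_case) if not c]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def permutation_p_value(genotypes, is_case, model, n_perm=2000, seed=1):
    """Share of label permutations with a statistic at least as extreme."""
    rng = random.Random(seed)
    scores = encode(genotypes, model)
    observed = mean_difference(scores, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # randomly reassign case status
        if mean_difference(scores, labels) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Made-up data: 50 cases enriched for the minor allele, 50 controls.
genotypes = [2] * 30 + [1] * 10 + [0] * 10 + [0] * 30 + [1] * 10 + [2] * 10
is_case = [True] * 50 + [False] * 50

p_by_model = {m: permutation_p_value(genotypes, is_case, m)
              for m in ("additive", "dominant", "recessive")}
best_model = min(p_by_model, key=p_by_model.get)
print(p_by_model, best_model)
```

Selecting the model with the smallest permutation-adjusted p-value mirrors the idea of choosing a SNP-specific inheritance model; the gene-level selection and FDR correction would follow as separate steps.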
Statistical analyses were conducted using the statistical software R (version 2.14.2, R Foundation for Statistical Computing, Vienna, Austria).

4.2 Results

A description of the study population is presented in Table 3-1 of Chapter 3. Cases had a lower percentage of Europeans and a higher percentage of Chinese compared to controls, and were less likely to have had post-secondary education than controls. Cases were also more likely to be overweight or obese, to have entered menarche at a later age, to be older at first pregnancy as well as at their first mammogram, and to have a first-degree family history of breast cancer compared to controls.

4.2.1 Genetic Susceptibilities and Breast Cancer Risk

Results from the main genetic analysis involving women of European descent are in Table 4-1. Following the permutation step of the gene-based approach, 12 SNPs were observed to have associations with breast cancer risk, with 6 SNPs remaining significant after FDR adjustment: AKR1C3: rs12387, AKR1C4: rs3812617, CYP2C19: rs12248560, NAT1: rs7845127, NAT2: rs4646243, and ESR1: rs2813543. We observed differences in associations between genotype and breast cancer risk among pre- and post-menopausal women for SNPs associated with AKR1A1, AKR1C3, AKR1C4, CYP1B1, and NQO1; however, none remained significant after FDR adjustment (padj-value > 0.2) (see Appendix: Supplementary Table A17). For women of Asian descent, the same 27 SNPs and SNP-specific inheritance models were assessed; no associations with breast cancer were observed (see Appendix: Supplementary Table A18). Examining differences due to ethnicity among the SNPs yielded some evidence of an interaction with ethnicity for SNPs associated with COMT, CYP19A1, and NAT2; however, none remained significant after FDR adjustment (padj-value > 0.2).
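As an aid to reading the FDR-adjusted values reported here, the following is a minimal sketch of the Benjamini-Hochberg adjustment. The four input p-values are invented for illustration and do not correspond to any SNP in this study; the function name is likewise an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented p-values for four gene-representative SNPs.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
print([round(p, 3) for p in adjusted])
```

A SNP would be called significant at a given FDR level when its adjusted value falls below that level, for example below 0.05 or, for marginal significance, below 0.10.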
4.2.2 Modification of Gene-Related Breast Cancer Risk by PAH Exposure

The six SNPs observed to have significant main effects after FDR adjustment and six other SNPs (COMT: rs5993882, CYP19A1: rs10046, CYP1A1: rs2470893, EPHX1: rs2854461, PON1: rs854551, PTGS2: rs5275) that initially showed associations with breast cancer risk, but did not meet the threshold for significance after FDR adjustment, were assessed for interactions with PAH exposure for women of European ancestry. Three SNPs were observed to have a significant interaction with duration at “high” PAH exposure and are shown in Table 4-2. One SNP is a member of the cyclooxygenase (COX) family: rs5275 (PTGS2), and the other two are from the aldo-keto reductase (AKR) superfamily: rs12387 (AKR1C3) and rs3812617 (AKR1C4). Following FDR adjustment, the interaction effect for each of the three SNPs remained marginally significant (padj-interaction < 0.10). For both AKR SNPs (rs12387 and rs3812617), an increasing risk for breast cancer with increased duration of high exposure was observed within the homozygous major allele genotype stratum; within the heterozygous or homozygous minor strata no associations were observed. Furthermore, within the non-exposed group, there was a clear indication of increasing risk with each minor allele. For PTGS2 SNP rs5275, effects across duration of exposure were similar to those for the AKR SNPs within the homozygous major stratum. However, the effects were null within the non-exposed strata, and as duration of exposure increased, there was an increasing protective association of the minor allele. For results concerning the remaining SNPs, see Supplementary Table A19.
Assessing exposure by average probability of PAH exposure and weighted duration of exposure, significant interactions were observed for rs5275 (PTGS2), which remained significant after FDR adjustment (padj-interaction < 0.05); no significant interactions were observed among the AKR SNPs (see Supplementary Table A20 and Supplementary Table A21). When the exposure assessment was dichotomized as ever-never (see Table 4-3), behaviour similar to that of the original exposure categorization for duration at “high” exposure was observed (for remaining SNPs see Supplementary Table A22). The dichotomized average probability and weighted duration of PAH exposure produced similar results to their original categorization, with only rs5275 (PTGS2) showing marginally significant interaction effects after FDR adjustment (see Supplementary Table A23).

In order to examine interaction effects within women of Asian ancestry, the same ever-never categorization described above was used because of the same issue with small sample sizes in many of the genotype-exposure strata. We observed no evidence of statistically significant effect modification by PAH exposure on genotype-associated breast cancer risk. For the NAT2 SNP, a marginally significant interaction effect was observed, which was driven by an increased risk for minor allele carriers within the non-exposed stratum; however, the marginal significance of the interaction effect did not persist after FDR adjustment (data not shown).

Furthermore, as smoking is an additional source of PAH exposure,(128, 129) interaction effects by smoking status were also examined, as well as heterogeneous effects of smoking on observed GxEs; due to sample size issues these analyses were restricted to ever-never smoking and exposure status. We observed no evidence of statistically significant effect modification by smoking status on genotype-associated breast cancer risk.
There were significant and marginally significant heterogeneous effects within smoking strata for the NAT2 and CYP1A1 SNPs, respectively, but neither met the significance threshold after FDR adjustment (see Supplementary Table A24). Examining potential modifying effects of smoking on the previously observed GxEs, similar behaviour within the smoking strata for the three main SNPs of interest was observed, but there was no evidence of a three-way interaction or two-way PAH exposure-smoking interactions after FDR adjustment (see Supplementary Table A25 and Supplementary Table A26).

4.3 Discussion

In this population-based case-control study, we observed evidence of associations between various xenobiotic metabolism-related gene variants and breast cancer, some of which may modify the effect of exposure to PAHs. Of the twelve SNPs that showed significant or marginally significant associations with breast cancer, five SNPs (AKR1C4: rs3812617, NAT1: rs7845127, NAT2: rs4646243, CYP2C19: rs12248560, ESR1: rs2813543) were marginally significant (padj-value < 0.10) and one SNP (AKR1C3: rs12387) was significant (padj-value < 0.05) after FDR adjustment. Two of these were members of the AKR superfamily (AKR1C3, AKR1C4), which are involved in the production of the carcinogenic intermediate o-quinone during PAH metabolism.(94, 99, 262)

The AKR1C3 SNP rs12387, which represents a synonymous variant, showed significant effects for breast cancer risk among those with the homozygous minor allele genotype and remained significant after FDR adjustment (padj-value < 0.05).
Although rs12387 is a synonymous variant, the observed increased risk for breast cancer among those with the homozygous minor allele genotype is consistent with work done by Reding et al., who also observed an increased risk among women undergoing estrogen-progesterone therapy.(263) Furthermore, AKR1C3 regulates receptor access of androgens and estrogens, and since AKR1C3 is involved in the biosynthesis of prostaglandins and is overexpressed in steroid hormone-dependent breast tumors,(264) it may have a role in the etiology of breast cancer.(265)

The other AKR SNP, rs3812617, is an upstream variant of a gene also involved in estrogen metabolism; however, its role in breast cancer etiology is less clear as AKR1C4 expression and tissue distribution is predominantly liver-specific.(266) Although rs3812617 is not a nonsynonymous mutation (i.e. it does not alter the amino acid sequence in the encoded protein), the SNP is in high LD with AKR1C4 SNP rs3829125 (r² = 0.9), which is associated with estrogen-mediated breast cancer through regulation of estrogen receptor α and β levels; however, Hein et al. observed no main effect of rs3829125 on breast cancer risk.(267) The SNP rs3812617 is also in high LD with rs17134592 (r² = 0.9), which is associated with mammographic percentage density,(268) and therefore may influence breast cancer risk through this pathway because increased density is a major risk factor for breast cancer.(269)

The CYP2C19 SNP rs12248560, which is an upstream variant, was observed to be associated with breast cancer.
However, whereas our results suggest an increased risk for minor allele carriers, the only other study examining this SNP suggested a protective effect, although the result was not significant after adjustment for multiple testing.(270) In a follow-up pooled analysis, no overall breast cancer risk association was noted, with the only significant association observed within the hormone-replacement therapy subgroup (≥ 10 years on HRT).(271) We found no protective association within our study population that received HRT for at least 10 years (data not shown).

The ESR1 gene has been shown to play a major role in the development and treatment of breast cancer,(272) and like other polymorphisms within the gene, rs2813543 showed a protective effect against breast cancer for carriers of the minor allele.(273) However, given the downstream location of the SNP, it is more likely that the variant is in LD with another variant. NAT1 SNP rs7845127 and NAT2 SNP rs4646243 are both upstream variants, with the former in the 5’ UTR. They showed increased risk for breast cancer in carriers of at least one minor allele (NAT1, dominant model) and in the homozygous minor allele genotype (NAT2, recessive model), respectively. The NAT1/2 genes have established roles in detoxifying and/or bioactivating a variety of aromatic and heterocyclic amines,(274, 275) and certain polymorphisms have demonstrated associations with breast cancer.(240, 274)

The study of gene-environment interactions in diseases like cancer may be pivotal in understanding their etiology, especially when risks from certain exposures are only detectable in those with certain genetic susceptibilities, which in turn prevents us from identifying the true impact of either without considering both effects. We observed some evidence of interactions between SNPs in xenobiotic metabolism-related genes (rs12387, rs3812617, and rs5275) and duration of PAH exposure.
The AKR SNPs rs12387 (AKR1C3) and rs3812617 (AKR1C4) are involved in the production of quinones during PAH metabolism.(93, 94, 276-278) Exposure to PAHs is thought to trigger estrogenic(141) and antiestrogenic responses(142-144) through increased metabolism of estradiol, which results in the increased formation of quinones(145, 146) through a similar metabolic pathway.(94) Women with the homozygous major genotype were found to be at an increased risk for breast cancer in proportion to longer occupational PAH exposure; however, for both heterozygous and homozygous minor allele genotypes, the increased risks were attenuated. PTGS2 SNP rs5275 showed a reduced risk of breast cancer for the minor allele carriers, but this was not significant after FDR adjustment. Similar to the AKR SNPs, women with the homozygous major genotype of the PTGS2 SNP had an increased risk for breast cancer with increasing duration of PAH exposure; no association was observed in the heterozygous and homozygous minor allele strata. The modifying effect of the PTGS2 SNP on PAH exposure remained consistent when compared against average probability and weighted duration of PAH exposure. Other studies have also observed a decreased risk for breast cancer with this variant, including a pooled analysis involving the Nurses’ Health Study 2 and Harvard Women’s Health Study.(279, 280) Like the AKR superfamily, the PTGS2 SNP may influence breast cancer risk through estrogen metabolism. PTGS2/COX-2 encodes a prostaglandin synthase enzyme (cyclooxygenase) that can increase the production of prostaglandins, e.g.
PGE2, which in turn stimulates estrogen production through steroidogenesis.(281) High levels of cyclooxygenase have been observed in human mammary tumor tissues(282, 283) and overexpression of the gene is capable of inducing mammary epithelial tumorigenesis in animal models.(284)

Because our selection of candidate gene variants was based on the use of tag SNPs, one limitation of this approach is that there is no guarantee that the SNP tested for association is the contributing SNP. However, one of the intents of this study was to identify gene variants that are associated with breast cancer, which we have demonstrated. Furthermore, although the identity of the causal SNP may not be known, by using a tag SNP it can be surmised with reasonable certainty that the causal SNP is in high LD and therefore can be identified by the set of SNPs tagged by our SNP.

Another limitation of the study involves measurement error in assessed PAH exposure. Differential misclassification is a potential limitation of using a job-exposure matrix (JEM) for classifying exposure status that can either attenuate or accentuate the interaction estimates.(202, 236) This misclassification in inferred exposure status also affects the efficiency of GxE studies in epidemiology.(285) There are also circumstances where an observed association with a gene provides evidence of gene-environment interaction even if the effect estimate of the interaction term in regression models is not strong. This occurs when measurement error in exposure dilutes the power of the test of interaction compared to the test of genetic association alone.
In this case, the observed effect of a gene depends on its interaction with the true exposure; thus, without even estimating exposure, the genetic effect can be used to detect (rather than quantify) the interaction.(286) We note that, of the six SNPs associated with breast cancer in our gene-disease analyses that passed FDR adjustment and the six other SNPs that showed associations but failed to meet our threshold after FDR adjustment, only three had estimated GxE coefficients that passed FDR adjustment. Consequently, it is plausible that several of the other SNPs are providing evidence for gene-environment interactions.

Smoking is a known source of PAH exposure with doses comparable to those arising from occupational exposure. It is also a potential confounder, as the literature suggests that long duration of smoking can result in an increased risk among women with certain genotypes,(128, 129) although there is little evidence of a measurable effect of smoking on breast cancer risk in this study. Likewise, there is no evidence of interaction between PAH exposure and smoking within this study (see Section 3.3). We observed similar null effects from smoking within genotype-specific strata in the analyses exploring two-way (GxS) interactions. Examination of the three-way (GxExS) interaction also yielded no heterogeneous effects of smoking on our observed GxEs; the lack of effect modification from smoking by PAH-related genes weakens our argument in support of involvement of PAH in general in elevating risk of breast cancer. However, as this study was not designed for such finer examination, there is limited power to explore whether such interactions occur. Nonetheless, as the general behaviour remains consistent within both smoking strata, it can be reasonably assumed that PAH exposure has similar risk with and without additional exposure to smoking.
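The FDR adjustment referred to throughout this chapter can be illustrated with a short sketch. The text does not name the procedure, so the code below assumes the Benjamini-Hochberg step-up method; the function name and the dictionary are illustrative only. The inputs are the rounded unadjusted p-values printed in Table 4-1, so the output can be expected to agree with the published padj-value column only approximately.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the thesis says only "FDR adjustment"; BH is used here
    as the most common choice, not as the confirmed method.
    """
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Rounded unadjusted p-values as printed in Table 4-1 (27 SNPs).
table_4_1_p = {
    "AHR rs3757824": 0.185, "AHRR rs349583": 0.147, "AIP rs4084113": 0.532,
    "ARNT rs11204735": 0.105, "AKR1A1 rs2088102": 0.291,
    "AKR1C1 rs6650153": 0.278, "AKR1C2 rs11252867": 0.193,
    "AKR1C3 rs12387": 0.001, "AKR1C4 rs3812617": 0.005,
    "DHDH rs2270939": 0.537, "EPHX1 rs2854461": 0.052,
    "NAT1 rs7845127": 0.012, "NAT2 rs4646243": 0.008,
    "PTGS2 rs5275": 0.071, "CYP19A1 rs10046": 0.043,
    "CYP1A1 rs2470893": 0.092, "CYP1A2 rs2470890": 0.424,
    "CYP1B1 rs162558": 0.335, "CYP2C19 rs12248560": 0.009,
    "CYP2E1 rs2070673": 0.133, "COMT rs5993882": 0.076,
    "GSTP1 rs1695": 0.365, "NFE2L2 rs1806649": 0.327,
    "NQO1 rs1800566": 0.410, "PON1 rs854551": 0.027,
    "ESR1 rs2813543": 0.021, "ESR2 rs1271572": 0.210,
}

snps = list(table_4_1_p)
p_adj = dict(zip(snps, benjamini_hochberg([table_4_1_p[s] for s in snps])))

# List the SNPs that fall under the 0.10 threshold used in the text.
for snp in sorted(snps, key=p_adj.get):
    if p_adj[snp] < 0.10:
        print(f"{snp}: p = {table_4_1_p[snp]:.3f}, adjusted p = {p_adj[snp]:.3f}")
```

Under this assumption, six SNPs fall below the 0.10 threshold, which matches the count reported in the text; small differences from the printed adjusted values are expected because the inputs are rounded to three decimals.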
In summary, we observed relationships between genetic variants in six xenobiotic metabolism-related genes and breast cancer, with two of the SNPs belonging to genes from the AKR superfamily and four SNPs being novel variants from genes that have known associations with breast cancer. Three genetic variants displayed modifying effects on breast cancer risk that differ among those with occupational PAH exposure. Our analysis reminds us that the interplay of genetic and environmental risk factors can be helpful in understanding the modifiable risk factors of breast cancer.

4.4 Tables

Table 4-1: Genetic analysis using gene-based permutations under inheritance-specific models

Europeans (Cases: 641, Controls: 803)

Gene     SNP         MAF   Model      OR    95% CI        p-valueⱡ  padj-value₸
Regulates PAH and xenobiotic metabolism
AHR      rs3757824   0.22  Dominant   0.87  (0.70–1.07)   0.185     0.307
AHRR     rs349583    0.42  Dominant   1.18  (0.94–1.47)   0.147     0.265
AIP      rs4084113   0.38  Dominant   0.93  (0.75–1.16)   0.532     0.537
ARNT     rs11204735  0.46  Recessive  1.23  (0.96–1.58)   0.105     0.217
Production of carcinogenic intermediates
AKR1A1   rs2088102   0.46  Recessive  1.15  (0.89–1.47)   0.291     0.393
AKR1C1   rs6650153   0.11  Recessive  0.59  (0.24–1.46)   0.278     0.393
AKR1C2   rs11252867  0.24  Additive‡  1.12  (0.95–1.33)   0.193     0.307
AKR1C3   rs12387     0.18  Recessive  2.71  (1.42–5.19)   0.001     0.030
AKR1C4   rs3812617   0.16  Recessive  2.50  (1.23–5.07)   0.005     0.057
DHDH     rs2270939   0.18  Dominant   1.07  (0.86–1.34)   0.537     0.537
EPHX1    rs2854461   0.34  Dominant   1.23  (0.99–1.52)   0.052     0.155
NAT1     rs7845127   0.32  Dominant   1.30  (1.06–1.61)   0.012     0.065
NAT2     rs4646243   0.14  Recessive  3.33  (1.28–8.64)   0.008     0.057
PTGS2    rs5275      0.36  Additive‡  0.86  (0.74–1.01)   0.071     0.186
CYP450 superfamily
CYP19A1  rs10046     0.47  Additive‡  1.16  (1.00–1.35)   0.043     0.143
CYP1A1   rs2470893   0.31  Recessive  1.33  (0.95–1.87)   0.092     0.207
CYP1A2   rs2470890   0.34  Recessive  1.13  (0.83–1.55)   0.424     0.458
CYP1B1   rs162558    0.21  Recessive  1.25  (0.79–1.99)   0.335     0.411
CYP2C19  rs12248560  0.23  Dominant   1.33  (1.08–1.65)   0.009     0.057
CYP2E1   rs2070673   0.16  Additive‡  0.85  (0.69–1.05)   0.133     0.256
Detoxification of reactive intermediates during xenobiotic metabolism
COMT     rs5993882   0.24  Dominant   1.21  (0.98–1.49)   0.076     0.186
GSTP1    rs1695      0.33  Recessive  1.17  (0.83–1.64)   0.365     0.428
NFE2L2   rs1806649   0.25  Dominant   1.11  (0.90–1.37)   0.327     0.411
NQO1     rs1800566   0.19  Dominant   0.91  (0.73–1.14)   0.410     0.458
PON1     rs854551    0.21  Additive‡  1.22  (1.02–1.46)   0.027     0.103
Estradiol metabolism
ESR1     rs2813543   0.23  Recessive  0.52  (0.30–0.90)   0.021     0.094
ESR2     rs1271572   0.44  Recessive  0.84  (0.64–1.10)   0.210     0.315

ⱡ Adjusted for age and centre
‡ Additive model shows OR for each additional minor allele
₸ Adjusted p-value for the false discovery rate

Table 4-2: Breast cancer odds ratios by genotype-exposure (duration at high PAH exposure) stratum based on co-dominant model for select SNPs with potential modifying effects‡

                         Exposure: 0 years                    Exposure: 0.1 - 3.6 years            Exposure: 3.6 - 74.1 years           Interaction
Gene    SNP        Geno  Case  Control  OR    95% CI          Case  Control  OR    95% CI          Case  Control  OR    95% CI          ptrend (pFDR)
AKR1C3  rs12387    AA    228   380      1.00  Ref.            89    86       2.11  (1.46 - 3.06)   92    76       2.14  (1.46 - 3.12)
                   GA    115   140      1.42  (1.05 - 1.92)   40    41       1.78  (1.09 - 2.91)   34    49       1.19  (0.73 - 1.96)
                   GG    13    7        3.43  (1.33 - 8.84)   6     2        5.89  (1.15 - 30.1)   10    5        4.15  (1.37 - 12.5)   0.02 (0.06)
AKR1C4  rs3812617  GG    233   395      1.00  Ref.            93    87       2.19  (1.52 - 3.15)   96    81       2.14  (1.48 - 3.10)
                   AG    114   127      1.56  (1.14 - 2.11)   37    41       1.70  (1.03 - 2.79)   31    45       1.16  (0.69 - 1.95)
                   AA    9     5        3.27  (1.07 - 10.0)   5     2        4.96  (0.93 - 26.4)   9     5        3.81  (1.23 - 11.7)   <0.01 (0.06)
PTGS2   rs5275     AA    142   215      1.00  Ref.            57    48       2.09  (1.32 - 3.31)   70    49       2.34  (1.50 - 3.66)
                   GA    172   248      1.05  (0.78 - 1.41)   68    61       2.00  (1.30 - 3.07)   54    63       1.33  (0.85 - 2.07)
                   GG    42    64       0.99  (0.63 - 1.56)   10    21       0.81  (0.36 - 1.79)   12    19       0.97  (0.45 - 2.10)   0.01 (0.06)

‡ Adjusted for age, centre, education, and smoking (pack-years)

Table 4-3: Breast cancer odds ratios by genotype-exposure (ever-never for duration at high PAH exposure) stratum based on co-dominant model for select SNPs with potential modifying effects‡

                             Exposure: Never                      Exposure: Ever                       Interaction
Gene    SNP        Genotype  Case  Control  OR    95% CI          Case  Control  OR    95% CI          ptrend (pFDR)
Production of carcinogenic intermediates
AKR1C3  rs12387    AA        228   380      1.00  Ref.            181   162      2.12  (1.57 - 2.88)
                   GA        115   140      1.42  (1.05 - 1.92)   74    90       1.45  (0.99 - 2.13)
                   GG        13    7        3.43  (1.33 - 8.84)   16    7        4.64  (1.85 - 11.7)   0.02 (0.08)
AKR1C4  rs3812617  GG        233   395      1.00  Ref.            189   168      2.17  (1.60 - 2.92)
                   AG        114   127      1.55  (1.14 - 2.11)   68    86       1.41  (0.96 - 2.09)
                   AA        9     5        3.27  (1.07 - 10.0)   14    7        4.14  (1.61 - 10.6)   <0.01 (0.06)
PTGS2   rs5275     AA        142   215      1.00  Ref.            127   97       2.22  (1.54 - 3.19)
                   GA        172   248      1.05  (0.78 - 1.41)   122   124      1.65  (1.16 - 2.35)
                   GG        42    64       0.99  (0.63 - 1.56)   22    40       0.89  (0.50 - 1.59)   0.01 (0.08)

‡ Adjusted for age, centre, education, and smoking (pack-years)

4.5 Figures

Figure 4-1: Quality control flow chart of metabolism-related SNPs
[Flow chart: 768 SNPs; exclusions: 24 failed Illumina assay design, 5 GenTrain < 0.4, 34 bad clusters, 5 monoallelic, 1 discrepancy in replicate sample, 6 call rate < 95%, 2 low MAF in Caucasian controls, 2 failed HWE; 689 SNPs remaining; 138 SNPs in the metabolism-related pathway.]

Figure 4-2: Quality control flow chart of assay samples
[Flow chart: 2,318 samples; exclusions: 16 failed samples, 2 gender discrepancies, 5 excess heterozygosity, 6 (3 pairs) identical genotypes, 9 pairs familial related (1 per pair excluded), 5 ethnicity discrepancies; 2,275 samples remaining; merge of replicate genotypes (n = 124); 2,151 samples; exclusion criteria (n = 61); 2,090 samples.]

5 Conclusion

In this thesis, I evaluated the risk of breast cancer associated with occupational PAH exposure, as well as the risks posed by genetic susceptibility through select xenobiotic-metabolizing gene variants.
A novel feature of my doctoral research is the use of compliance data from workplace observations in the U.S. over several decades to predict PAH exposure in a population-based case-control study, as opposed to the more traditional methods that use experts’ judgements to assess occupational exposure. Furthermore, this study identifies SNPs that may have implications in breast cancer due to evidence of their ability to modify PAH-mediated breast cancer risk.

5.1 Summary of the Study Findings

5.1.1 Estimating Occupational PAH Exposure

Chapter 2 explored the use of OSHA workplace compliance testing data collected from various employers across the United States between 1979 and 2010 to develop a predictive model for estimating the probability of exceeding OSHA’s permissible exposure limit for PAHs (PEL = 0.2 mg·m⁻³). Estimates of PEL exceedance fractions differed with use of the two major coding schemes: the Standard Industrial Classification (SIC) and the North American Industry Classification System (NAICS). Differences in the results were due to the varying degree of detail in the two coding schemes; in particular, some SIC codes corresponded to multiple NAICS codes and vice versa. Furthermore, estimates for workers based on the industry-only models varied considerably from estimates for workers based on the combined industry-occupation models, implying heterogeneous risks among worker classifications within industries. These results are not surprising as manual labourers would likely have had direct exposure to PAHs, whereas those in administrative positions would likely be exposed indirectly. Furthermore, the differences in estimates between the two types of models support the notion that industry alone is insufficient to describe exposure, and that occupations within industries, especially those at risk for PAH exposure, have to be taken into consideration to obtain better estimates of exposure or at least estimates with reduced misclassification.
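The contrast between industry-only and industry-occupation estimates can be made concrete with a minimal sketch of an empirical exceedance fraction, i.e. the share of time-weighted average (TWA) measurements above the PEL within a group. This is not the predictive model of Chapter 2, which is not reproduced here; the records, industry labels, and function name are all invented for illustration.

```python
from collections import defaultdict

PEL_MG_M3 = 0.2  # PEL quoted in the text, in mg per cubic metre

# Hypothetical (industry, occupation, TWA in mg/m^3) records.
# These values are made up for illustration and are not OSHA data.
measurements = [
    ("IND-A", "manual", 0.35),
    ("IND-A", "manual", 0.22),
    ("IND-A", "manual", 0.10),
    ("IND-A", "administrative", 0.05),
    ("IND-A", "administrative", 0.02),
    ("IND-B", "manual", 0.15),
    ("IND-B", "manual", 0.08),
    ("IND-B", "administrative", 0.01),
]


def exceedance_fractions(records, key):
    """Fraction of measurements above the PEL within each group.

    `key` maps (industry, occupation) to the grouping label.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [n exceeding, n total]
    for industry, occupation, twa in records:
        group = key(industry, occupation)
        counts[group][0] += twa > PEL_MG_M3
        counts[group][1] += 1
    return {g: exceeded / total for g, (exceeded, total) in counts.items()}


by_industry = exceedance_fractions(measurements, lambda ind, occ: ind)
by_industry_occupation = exceedance_fractions(
    measurements, lambda ind, occ: (ind, occ)
)

for group, fraction in by_industry_occupation.items():
    print(group, round(fraction, 2))
```

In this toy data the industry-level fraction for IND-A averages over manual and administrative workers, while the industry-occupation grouping separates them, which is the kind of within-industry heterogeneity the paragraph above describes.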
The resulting predictive models became the basis for a job-exposure matrix (JEM) that uses industry and occupation, in the form of NAICS and SOC codes, to predict the probability that an employee is at risk of exceeding the PEL. To our knowledge, this is the first study to use compliance data to estimate PAH exposure, or to predict the probability of exceeding the PEL for the purposes of developing a JEM. As opposed to the more traditional methods of using expert-based metrics, this method of exposure assessment is novel because it estimates probabilities based on statistical analysis of exposure measurements. The use of the monitoring data removes ambiguity and differing opinions of exposure that stem from expert-based metrics, which often involve industrial hygienists reviewing the occupation and then assigning the likelihood, i.e. probability, and/or intensity of exposure.(235) In addition, these metrics usually employ ordinal values that may not be translatable into objective measurements of exposure.

5.1.2 The Association Between Occupational PAH Exposure and Breast Cancer

Chapter 3 evaluated the contributions of occupational PAH exposure to breast cancer risk through three independently developed JEMs: two developed by industrial hygienists in the United States (NCI-JEM) and Europe (DOM-JEM), and a third that was developed as part of this dissertation (PPM-JEM). Analyses of exposure assessed by the DOM-JEM and NCI-JEM showed minor elevations in breast cancer risk, although no observed associations were statistically significant at the 5% level. For the analyses based on the PPM-JEM, we observed significantly elevated breast cancer risk associated with PAH exposure for several of the exposure metrics. Among those who were ever exposed to PAHs, we observed, on average, a 1.3-times increase in risk for breast cancer.
Examination of potential dose-response relationships showed significant associations with duration at “high” PAH exposure, weighted duration, and with average probability. Moreover, when we considered the effects of PAH exposure on breast cancer risk among premenopausal women we found even stronger evidence of a dose-response. Similar to the main analyses, we found elevated, albeit non-significant, risks in cumulative exposure to PAHs based on the DOM-JEM and NCI-JEM. However, unlike in the main analyses, there was some evidence of a dose-response relationship using these metrics among premenopausal women. Analyses based on the duration algorithms of the PPM-JEM showed, on average, more than a 50% increase in breast cancer risk among premenopausal women who were in the longest or highest tertile group for exposure. This stronger relationship could be due, in part, to the influence of PAH metabolism on estradiol metabolism (see estrogenic and antiestrogenic responses mentioned in Sections 3.3 and 4.3), which affects the formation of quinones that can increase breast cancer risk.

We also observed an interaction between occupational PAH exposure and family history of breast cancer. In particular, those with a first-degree family history were estimated to have, on average, more than a 2.5-fold increase in risk of breast cancer, in comparison to those with no first-degree relative, for the longest or highest tertiles of the PPM-JEM exposure algorithms. This evidence supports the notion of possible interactions between genetic susceptibility and PAH exposure that may modify breast cancer risk in certain populations.

5.1.3 Genetic Susceptibility in Xenobiotic Metabolizing Genes and Their Effect on Breast Cancer Risk

Chapter 4 explored the roles that certain genetic variants, in the form of single nucleotide polymorphisms (SNPs), had in breast cancer risk and how their effects modify PAH-mediated breast cancer risk.
We identified six SNPs (rs12387, rs3812617, rs12248560, rs2813543, rs7845127, rs4646243) that were observed to have some association with breast cancer after false-discovery-rate (FDR) adjustment for multiple testing (corrected p-value < 0.1). Four of the SNPs belong to genes involved in the production of carcinogenic intermediates during PAH metabolism, one SNP is a member of the CYP450 superfamily that is also involved in PAH metabolism, and the final SNP belongs to a gene involved in estradiol metabolism.

The SNPs rs12387 and rs3812617 were of particular interest as they are members of the AKR superfamily that produces quinones that can react with DNA to form adducts (as described in Section 1.1.2). Furthermore, along with rs3812617, the SNPs rs2813543, rs7845127, and rs4646243 may represent novel SNPs as, to our knowledge, no other studies have investigated their effects on breast cancer risk and, with the exception of rs3812617, we found no other variants that have known associations with breast cancer and are in high linkage disequilibrium with our SNPs.

To further explore the role these genetic variants have in breast cancer, we tested for the existence of gene-environment interactions with PAHs. The AKR SNPs were observed to have a modifying effect on breast cancer risk associated with duration at high PAH exposure, although after FDR adjustment the associations were only marginally significant. Women with the homozygous major allele genotype had an increased risk for breast cancer with high PAH exposure and, within the non-exposed group, each additional minor allele increased breast cancer risk. More interestingly, women with one or more minor alleles of the PTGS2 SNP rs5275 showed a protective main effect on breast cancer risk, with a decrease in breast cancer risk by an average of 14% for each additional minor allele. The association was marginally significant and failed to meet our threshold after FDR adjustment.
However, when we explored its effect on PAH exposure we found clear and consistent evidence of effect modification in all three of our PPM-JEM exposure metrics. Women who were never exposed to PAHs showed null results, while those who were exposed showed decreasing risk with each minor allele.

Overall, these analyses add to the literature on the role genetic variants have in modifying breast cancer risk. In summary, we noted (1) a modification of the effect of occupational PAH exposure on risk of breast cancer by (first degree) family history of breast cancer (see Chapter 3), and (2) the observation that certain genetic variants can modify the effects of PAH exposure on breast cancer risk (see Chapter 4).

5.2 Strengths and Limitations

The majority of the strengths and limitations are summarized in Chapters 2–4. However, we will take this opportunity to expand on these topics outside their respective chapters.

To address the issue of determining PAH exposure, this study utilized measurement data from OSHA compliance databanks from 756 companies across 45 states between 1979 and 2010. The data collected included detailed information relating to type of industry, occupation, type of exposure (i.e. chemical), and level of exposure. The use of this type of databank to create the PPM-JEM provides a significant advantage over other types of JEMs, including the DOM-JEM and NCI-JEM used in the study, because it is based on empirical evidence of exposure that is more reliable than an expert-based metric, which can be subjective. For example, occupations classified as administrative would be considered not at risk of PAH exposure by some JEMs. However, as demonstrated in the results of Chapter 3, although occupations classified as administrative are expectedly at a low risk of PAH exposure, their position does not preclude them from exposure, especially administrative positions in industries where manufacturing may be involved.
Estimates for administrative personnel in NAICS 23731 (Highway, Street, and Bridge Construction) have a predicted probability of 1%, which is not surprising as these employees probably work off-site and away from construction. In contrast, administrative personnel working in NAICS 23711 (Water and Sewer Line and Related Structures Construction) have a 55% chance of exceeding the PEL, which could be the result of indirect exposure in the facilities that produce and/or modify pipes used in the various structures for construction. The end result of the model is a predicted probability of exceeding the PEL, which is a tangible metric, as opposed to traditional JEMs that assign ordinal values that may not accurately describe the intensity of exposure and are not anchored by objective measurements, but instead utilize a “black box” method of expert judgment.(235) In addition, the ordinal values used in traditional JEMs can be somewhat arbitrary, and often there is little explanation as to why, for example, a high intensity of exposure in the DOM-JEM is given the value of 4 while the same level of exposure in the NCI-JEM is given the value of 2. The use of probabilities from our predictive probability model removes the ambiguity behind the values assigned by the JEM, which may also make the results more interpretable.

Limitations not presented in Chapter 2 relate to the choice of exposure and how and when the measurements were taken. As previously mentioned, the statistical model predicts the probability of exceeding the PEL by modeling the number of times the calculated TWA for coal tar pitch volatiles (CTPV) exceeded 0.2 mg·m⁻³.
The use of CTPV presents a potential limitation because PAHs are often a mixture of multiple compounds that include compounds that are not components of CTPV, and therefore our predicted estimates may underestimate the exceedance fractions of PAH exposure; components of CTPV are detectable quantities of one or more of benz[a]anthracene, benzo[b]fluoranthene, chrysene, anthracene, benzo[a]pyrene, phenanthrene, acridine, and/or pyrene. Another potential issue is measurement error, as we may be measuring an imperfect surrogate; however, the use of CTPV as a surrogate for measuring total PAHs is considered an acceptable practice by OSHA and, with respect to the available databank, afforded us the largest dataset for modeling. For context, exposure measurements from the two OSHA databanks also included measurements of phenanthrene, anthracene, pyrene, chrysene, and benzo[a]pyrene. Only a limited number of measurements of these compounds were available for many industries and occupations, and therefore reliable estimates for exposure would not have been possible. Aside from the choice of exposure, two other limitations relate to the device used for sampling and the type of testing (scheduled versus surprise surveying). Air samples are collected by drawing known amounts of air through a vacuum cleaner-like device that contains glass fiber filters (GFF). The specificity of the GFF depends on the number of interfering compounds present and the concentration of the substance being measured relative to any interferences. Therefore, given that PAHs or CTPV are mixtures of compounds, there may be a lack of specificity, i.e. the method is not highly selective. Furthermore, the filter only captures particulate matter, as smaller particles or vapors may not be captured and instead pass through the filter; particulate-phase PAHs also tend to be lost from the filter during sampling due to volatilization.
The compliance data were obtained through air sampling, and since PAHs can be in solid state and gaseous state, measurements could be underestimated without the use of a sorbent tube downstream of the filter itself. These issues may also be exacerbated by inaccurate sample volume determinations from inadequate temperature and humidity conditioning prior to weighing.

Regarding the exposure measurements available, the timing of the survey and whether the inspection was programmed or a surprise inspection due to complaints could result in a biased measured exposure level because of changes in procedures prior to and during the inspection. For example, air pollution levels in Beijing are a major and continuous health concern. However, prior to and during the 2008 Olympic Games, the Chinese government shut down many of the manufacturing plants within and surrounding the city in order to control air pollution and improve air quality(287) to dispel health and safety concerns. Analyses of the IMIS databank (one of the two OSHA databanks used to model the exceedance fractions) by Sarazin et al. found that, for the 219,000 measurements analyzed during inspections of 50 chemicals, the detected concentrations were similar for complaint (i.e. surprise) and referral (i.e. programmed) inspections. Therefore, although exposure levels during non-programmed inspections may be higher than those obtained during presumably representative programmed inspections, heterogeneous exposure levels are expected and no consistent differences across chemicals were observed.(288)

To study the association between PAH exposure and breast cancer risk, one of the strengths of this study was the use of an in-depth questionnaire to ascertain lifetime employment history, as well as supplementary questions relating to materials and tasks performed during employment.
The questionnaire allowed us to determine accurate work histories with minimal recall bias, as interviewers and questions never alluded to the type of exposure we were interested in studying. Although recall error might be present, particularly for jobs held many years ago, the error is expected to be distributed equally among cases and controls so that any impact is negligible. Another important strength of this study was its use of multiple, independently developed JEMs for PAH exposure assessment. Although studies have used industry as a method for assessing PAH exposure,(215, 216) the DOM-JEM classifies risk by assessing specific occupations while the NCI-JEM also incorporates industry, which is more accurate when classifying exposure than simply occupation or industry alone.(200) The PPM-JEM, which also classifies exposure based on industry and occupation, uses empirical measurements to predict its estimates for exceeding the PEL in an innovative approach to exposure assessment. As such, although the statistical significance of the results for the three metrics varied, all indications point towards a positive association between PAH exposure and breast cancer risk, supporting the notion that prolonged exposure to occupational PAHs increases a woman’s risk for breast cancer, especially in women who are premenopausal or have a (first degree) family history of breast cancer.

A challenge not discussed in Chapter 3, although it was broached in Chapter 1, was the issue of assessing occupational exposure among women. There are clear differences in the types of industries and occupations that women work in that are at risk of PAH exposure, and the majority of the industries studied for PAH exposure (e.g. aluminum smelting and gasification) are mostly male-dominated. This can pose issues when assessing exposure using older JEMs that rely on expert judgement based mostly on male samples, and therefore these JEMs may have gender bias.
For example, current statistics from the U.S. Department of Labor estimate that 72% of Waiters and Waitresses are women,(289) and exposure estimates for workers in the food-service industry depended on our choice of JEM. In particular, for food-service industry workers the DOM-JEM estimates a low risk of exposure, the NCI-JEM estimates no risk of exposure, and the PPM-JEM estimates a “high” risk of exposure. This potential gender bias may not relate to women specifically, but rather to an under-representation of those industries and occupations that women may dominate due to certain characteristics (i.e. non-industrialized exposure). Regardless, care should be taken when using older JEMs within certain populations, and caution may be warranted when considering the age of the JEM or its ability to capture the intended exposure. An advantage of the PPM-JEM over the other JEMs is that it can be updated as new data are added to the OSHA databanks, maintaining its relevance as exposure trends change or more industries and occupations are measured.

A limitation not discussed above nor in Chapter 3 with respect to the PPM-JEM exposure assignments relates to measurement error. The results of Chapter 2 are the foundation of the PPM-JEM, and any issues relating to the accuracy of the model (i.e. if it estimates probabilities poorly) can bring our results into question. Since our estimated exceedance fractions identify industries that were previously established as having PAH exposure, a more realistic concern would be measurement error. Because the PPM-JEM looks at the probability of exceeding the PEL, as opposed to the probability of exposure as in the NCI-JEM, a lack of measurements exceeding the PEL for an industry can result in missing low-level exposure and misinterpretation of the PPM-JEM. 
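The distinction between the probability of exceeding the PEL and the probability of exposure can be illustrated with a minimal sketch. The measurement values and the LOD below are hypothetical, the function name is introduced only for illustration, and the simple empirical fractions stand in for, and are not, the predictive probability model developed in Chapter 2. Only the 0.2 mg·m⁻³ threshold is taken from the text.

```python
from typing import Sequence, Tuple

PEL_MG_M3 = 0.2  # PEL threshold quoted in the text, in mg/m^3


def exceedance_and_detection(
    measurements: Sequence[float], lod: float, pel: float = PEL_MG_M3
) -> Tuple[float, float]:
    """Return (fraction of measurements above the PEL, fraction above the LOD)."""
    if len(measurements) == 0:
        raise ValueError("at least one measurement is required")
    n = len(measurements)
    above_pel = sum(1 for m in measurements if m > pel) / n
    above_lod = sum(1 for m in measurements if m > lod) / n
    return above_pel, above_lod


# Hypothetical industry: every sample is detectable but below the PEL.
HYPOTHETICAL_LOD = 0.01  # assumed limit of detection, mg/m^3
hypothetical_samples = [0.05, 0.08, 0.12, 0.15, 0.19]

frac_pel, frac_lod = exceedance_and_detection(hypothetical_samples, HYPOTHETICAL_LOD)
print(f"fraction above PEL: {frac_pel:.0%}")
print(f"fraction above LOD: {frac_lod:.0%}")
```

In this hypothetical case no sample exceeds the PEL while every sample is above the LOD, so a metric based on exceedance alone would not flag an industry in which exposure is nonetheless present.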
As such, although our novel metric can be used to estimate with some certainty whether there is a risk of exceeding the PEL, or even a risk of exposure, it does not allow us to state unequivocally that an occupation is not at risk for PAH exposure. For example, if some industry had all its measurements below 0.2 mg·m⁻³ but above the LOD, then it can be argued that said industry is at risk for PAH exposure; however, our metric would estimate a 0% probability of exceeding the PEL, which is not the same as a 0% probability of exposure. As discussed in Section 3.3, these limitations relating to measurement error and misclassification exist for all JEMs since, in general, there is a mix of true exposure statuses within any occupation and industry.

The issues detailed in the previous paragraph are further exacerbated by how these exposure assessments are presented (e.g. cumulative exposure, average exposure, duration, and weighted duration), as well as by lags between exposure and disease. We did not assume a latency period between exposure and diagnosis of breast cancer because of the continuous accumulation of the reactive intermediates and their effect on DNA. With respect to my metric (PPM-JEM), there were several exposure algorithms, and the choice of which one to use and how to present it is subjective; as observed in Table 3-2 and the subsequent analyses, the estimates differed slightly between assessments. Although misclassification poses a threat to the validity of our estimates, and in spite of the apparent lack of agreement among the three metrics with respect to which jobs are truly exposed, their agreement on a potential association between occupational PAH exposure and breast cancer helps reduce such concerns.

To examine the gene-disease relationship in Chapter 4, a major strength of our study was the a priori selection of genes through the candidate gene approach after a thorough examination of the existing literature at the time. 
Although preference has moved towards genome-wide association studies (GWAS), which scan the entire genome for common genetic variation, the candidate gene approach focuses on associations involving genetic variation within particular genes of interest, selected based on known or hypothesized modes of action. By focusing on genes in the pathways associated with PAH metabolism, we identified genes (and SNPs) that are of biological relevance. Limiting the number of genetic variants we investigated also reduced the number of comparisons and the Type I error rate, as well as the Type II error rate for FDR adjustment. Furthermore, as discussed in Chapter 4, qualitative interactions (i.e. associations with the disease that exist only in the presence of exposure) can only be observed for genes that have a biological role in PAH metabolism. When evaluating a gene-disease association, there is often no a priori evidence supporting a particular inheritance model (i.e. dominant, recessive, or additive) for the risk allele, and the model is often selected because of a strong but unjustified preference for one mode,(290) convenience, or what is accepted within the field of study. In this study, we did not assume an a priori inheritance model and instead compared multiple models and selected the mode of inheritance that produced the strongest signal. In contrast to many studies that had multiple comparisons,(290) we adjusted all p-values using our proposed two-step gene-based FDR correction method in order to control the Type I error rate.

With respect to the choice of genetic marker, the use of tag SNPs based on an LD threshold is the main limitation. Patterns of LD in the human genome limit our ability to interpret genetic associations because any association observed may be due to the tag SNP being the causal SNP or being in LD with the causal SNP. Moreover, a tag SNP can be associated with hundreds of SNPs, and therefore it is often not itself responsible for the observed effect (i.e. 
the association with breast cancer is with another SNP). However, since the tag SNP could be in LD with the causal SNP, the effect can still potentially be identified. Furthermore, by using tag SNPs we increase the likelihood of observing the genetic effect, if it exists, because they allow a systematic representation of genetic variation across the gene. As in Chapter 3, one limitation of the examination of gene-environment interactions relates to the accuracy of the PPM-JEM. Even if the exposure metric is imperfect, if there are GxE interactions, ignoring the joint effects of environmental exposures and genetic markers can result in a loss of statistical power, and therefore some attempt at using environmental exposure data to account for gene-environment interactions is beneficial.(291, 292) However, as Gustafson and Burstyn demonstrated, there is a caveat to this benefit when using a non-differentially misclassified exposure surrogate, as both the exposure and interaction coefficients will tend to be attenuated towards zero.(285) As discussed in Chapter 4, even if we were to assume that our metric suffers from severe misclassification, the strong gene-disease association observed with several of the genes, whose effects may only be observed in the presence of exposure, suggests the existence of an interaction under certain conditions.

For this argument to hold, we need to be able to assert that (A) there is no main genetic effect and the interaction coefficient alone describes the extent of departure from the effect when genetic vulnerability is absent (i.e. the main effect of exposure), and (B) genes and exposures are acquired independently in the source population. Burstyn et al. showed that, in the presence of measurement error in exposure, detecting a positive association for the gene in a mis-specified model can be more powerful for detecting an interaction than the correctly specified model (i.e. 
a model with main E and GxE effects).(286) The observed effect of a gene depends on its interaction with the true exposure; thus, without an accurate estimate of exposure, or with only an imperfect surrogate, the genetic effect can be used to detect (rather than quantify) the interaction.(286) However, the trade-off in efficiency between testing a mis-specified versus a correct model does not always hold for a fixed sample size.(293) Specifically, the prevalence of exposure and the extent of the exposure classification influence whether the mis-specified or the correct model has more power. When exposure is less prevalent (e.g. < 30%), assessing exposure is necessary to boost power, because genotype subgroups are dominated by unexposed subjects due to rare exposure. The gene-disease association becomes evident only among exposed subjects, and therefore the interaction test will yield more power.(293) However, when exposure is common (e.g. ≥ 30%), the test of gene-only association is more likely to yield higher power. Our work is likely in the realm of common exposure studied by Luo et al. because at least 30% of our subjects were assessed as ever-exposed to PAHs. Although we do not know the sensitivity of my JEM, historical experience with JEMs suggests that it will have poor sensitivity compared to specificity.(234, 294)

5.3 Implications and Future Research

Databanks such as the OSHA monitoring data are an important element in both occupational and epidemiological research and have allowed the development of various models, including the aforementioned predictive probability models that estimate the likelihood of exceeding the PEL for PAHs. Use of such databanks creates a unique opportunity to develop a host of models and methods for assessing various occupational exposures based on empirical evidence, removing the ambiguity associated with traditional exposure metrics. 
The model, which was used to create the PPM-JEM, has potential application in other epidemiological research as a method of PAH exposure assessment. Both the model and the use of databanks have direct implications for health and safety monitoring through targeted approaches to industries that are beyond a threshold for risk of exceeding the PEL; that is, we have a tool for calculating the risk of non-compliance. Furthermore, changes in exposure over time and across industries can be examined using the model and databanks. For example, we observed no indication of a trend in PAH exposure over time, but updates to the monitoring data can easily be incorporated into the model, and any changes (presumably decreases) in exposure level or predicted exceedance fractions could be used to identify changes in policy, technologies, or monitoring.

The direct application of our model, and the subsequent JEM, is presented in our evidence of an association between PAH exposure and breast cancer. Previous studies have indicated a relationship between PAH exposure and cancer of various sites, although most studies have focused on lung cancer because the route of PAH exposure is predominantly inhalation. We believe that these results will not only help our understanding of breast cancer and the role PAH exposure plays in carcinogenesis, but also help inform health and policy makers about the risk of exposure in common work environments. Much of the previous work on PAH exposure has focused on male workers in industrial settings, e.g. aluminum smelting and coal gasification; however, as we have shown, employment in seemingly innocuous industries such as restaurants can involve non-trivial PAH exposure. 
Given that work in the food-service industry is common, targeted interventions to reduce exposures below the PEL and increased screening of those with long-term employment in these industries could help minimize the burden to both the health care system and the patient if the disease is caught at an early stage.

The observation that low penetrance variants in the form of single nucleotide polymorphisms influence breast cancer risk helps us better understand the biological mechanism of breast cancer by identifying potential pathways that influence carcinogenesis. Moreover, we see not only how genes involved in PAH metabolism can activate or detoxify carcinogenic intermediates, but also how some of these genes act in other metabolic pathways, e.g. CYP genes in estrogen metabolism. We have contributed to the literature by identifying four SNPs to target in future research and have presented evidence suggesting modification of the effect of exposure by genetic susceptibility. The existence of gene-environment interactions, which we observed through three SNPs that modified the effects of occupational PAH exposure on breast cancer risk, adds to our understanding of the dependence between these risk factors and their joint effect on breast cancer. Identifying these GxEs reinforces the notion that attention needs to be given to how these factors interact, as ignoring such effects can lead to bias and misinterpretation of the results.

5.4 Concluding Remarks

This research has developed several predictive probability models that can be used in a variety of epidemiological studies exploring the effect of PAH exposure on complex diseases. The effect of PAH exposure on breast cancer risk was examined in a large population-based case-control study, and the results support the notion that prolonged occupational exposure to PAHs is associated with increased breast cancer risk, especially among premenopausal women and those with a family history of breast cancer. 
The existence of gene-environment interactions, which we observed through three SNPs displaying modifying effects on PAH-mediated breast cancer risk, supports our hypothesis that genetic susceptibility and PAH exposure can interact to influence breast cancer risk. Our research adds evidence for the role that PAHs and gene variants involved in their metabolism have in breast cancer risk, and will lead to both improved exposure and preventative measures, as well as a greater understanding of the etiology of breast cancer.

Bibliography

1 Bjørseth, A.: Handbook of Polycyclic Aromatic Hydrocarbons, Vol 1. New York: Marcel Dekker, 1983.
2 Harvey, R.G.: Polycyclic Aromatic Hydrocarbons, Chemistry and Carcinogenicity, 1991.
3 ATSDR: "Toxicological profile for polycyclic aromatic hydrocarbons": Agency for Toxic Substances and Disease Registry, 1995.
4 IARC: Polynuclear Aromatic Compounds, Part 1: Chemical, Environmental and Experimental Data. In IARC Monographs on the Evaluation of Carcinogenic Risks to Humans, pp. 477 p. Lyon, France: International Agency for Research on Cancer, 1983.
5 Ward, M.H., R. Sinha, E.F. Heineman, N. Rothman, R. Markin, D.D. Weisenburger et al.: Risk of Adenocarcinoma of the Stomach and Esophagus with Meat Cooking Method and Doneness Preference. International Journal 71(1): 14-19 (1997).
6 Phillips, D.H.: Polycyclic aromatic hydrocarbons in the diet. Mutation Research/Genetic Toxicology and Environmental Mutagenesis 443(1): 139-147 (1999).
7 Samanta, S.K., O.V. Singh, and R.K. Jain: Polycyclic aromatic hydrocarbons: environmental pollution and bioremediation. TRENDS in Biotechnology 20(6): 243-248 (2002).
8 WHO: "Polynuclear aromatic hydrocarbons in drinking-water. Background document for preparation of WHO Guidelines for drinking-water quality", W.H. Organization (ed.), pp. 428-430, 2003.
9 National Cancer Institute: "CHARRED: Computerized Heterocyclic Amines Resource for Research in Epidemiology of Disease (Version 1.7)." 
[Online] Available at https://dceg.cancer.gov/tools/design/charred (Accessed December 16, 2016).
10 Keith, L., and W. Telliard: ES&T special report: priority pollutants: I-a perspective view. Environmental Science & Technology 13(4): 416-423 (1979).
11 Keith, L.H.: The Source of US EPA's Sixteen PAH Priority Pollutants. Polycyclic Aromatic Compounds 35(2-4): 147-160 (2015).
12 Environmental Protection Agency: "Section 401.15 - Toxic pollutants". In Code of Federal Regulation: Title 40 - Protection of Environment, 1978.
13 Environmental Protection Agency: "Suspect Carcinogens in Water Supplies Interim Report": Office of Research & Development, 1975.
14 IARC: Some non-heterocyclic polycyclic aromatic hydrocarbons and some related exposures. In IARC Monographs on the Evaluation of Carcinogenic Risks to Humans. Lyon, France: International Agency for Research on Cancer, 2010.
15 National Center for Biotechnology Information: "PubChem Compound Database." [Online] Available at https://www.ncbi.nlm.nih.gov/pccompound (Accessed September 22, 2016).
16 Yamasaki, H., K. Kuwata, and H. Miyamoto: Effects of Ambient Temperature on Aspects of Airborne Polycyclic Aromatic Hydrocarbons. Environmental Science and Technology 16(4): 189-194 (1982).
17 Talaska, G., P. Underwood, A. Maier, J. Lewtas, and N. Rothman: Polycyclic aromatic hydrocarbons (PAHs), nitro-PAHs and related environmental compounds: biological markers of exposure and effects. Environ Health Perspect 104(5): 901-906 (1996).
18 Occupational Safety and Health Administration: "Chemical Sampling - Coal Tar Pitch Volatiles, Coke Oven Emissions." [Online] Available at https://www.osha.gov/dts/sltc/methods/organic/org058/org058.html (Accessed September 22, 2016).
19 Mumtaz, M.M., J.D. George, K.W. Gold, W. Cibulas, and C.T. DeRosa: ATSDR evaluation of health effects of chemicals. IV. Polycyclic aromatic hydrocarbons (PAHs): understanding a complex problem. Toxicol Ind Health 12(6): 742-971 (1996).
20 Lindstedt, G., and J. Sollenberg: Polycyclic aromatic hydrocarbons in the occupational environment: with special reference to benzo[a]pyrene measurements in Swedish industry. Scand J Work Environ Health 8(1): 1-19 (1982).
21 IARC: Polynuclear Aromatic Compounds: Part 3, Industrial Exposures in Aluminium Production, Coal Gasification, Coke Production, and Iron and Steel Founding: International Agency for Research on Cancer, 1984.
22 Bjørseth, A., and G. Becher: PAH in work atmospheres: occurrence and determination. Boca Raton, Fla.: CRC Press, 1986.
23 Sauvain, J.-J., T.V. Duc, and M. Guillemin: Exposure to carcinogenic polycyclic aromatic compounds and health risk assessment for diesel-exhaust exposed workers. Int Arch Occup Environ Health 76(6): 443-455 (2003).
24 Chepiga, T., M. Morton, P. Murphy, J. Avalos, B. Bombick, D. Doolittle et al.: A comparison of the mainstream smoke chemistry and mutagenicity of a representative sample of the US cigarette market with two Kentucky reference cigarettes (K1R4F and K1R5F). Food and Chemical Toxicology 38(10): 949-962 (2000).
25 Rustemeier, K., R. Stabbert, H.-J. Haussmann, E. Roemer, and E. Carmines: Evaluation of the potential effects of ingredients added to cigarettes. Part 2: chemical composition of mainstream smoke. Food and Chemical Toxicology 40(1): 93-104 (2002).
26 Roemer, E., R. Stabbert, K. Rustemeier, D. Veltel, T. Meisgen, W. Reininghaus et al.: Chemical composition, cytotoxicity and mutagenicity of smoke from US commercial and reference cigarettes smoked under two sets of machine smoking conditions. Toxicology 195(1): 31-52 (2004).
27 Vainiotalo, S., and K. Matveinen: Cooking fumes as a hygienic problem in the food and catering industries. The American Industrial Hygiene Association Journal 54(7): 376-382 (1993).
28 Siegmann, K., and K. Sattler: Aerosol from hot cooking oil, a possible health hazard. Journal of Aerosol Science 27: S493-S494 (1996).
29 IARC: Household Use of Solid Fuels and High-temperature Frying. In IARC Monographs on the Evaluation of Carcinogenic Risks to Humans, pp. 1-458, 2010.
30 El-Bayoumy, K.: Environmental carcinogens that may be involved in human breast cancer etiology. Chemical research in toxicology 5(5): 585-590 (1992).
31 IARC: Occupational Exposures in Petroleum Refining; Crude Oil and Major Petroleum Fuels. In IARC Monographs on the Evaluation of Carcinogenic Risks to Humans, 1989.
32 Apostoli, P., M. Crippa, M.E. Fracasso, D. Cottica, and L. Alessio: Increases in polycyclic aromatic hydrocarbon content and mutagenicity in a cutting fluid as a consequence of its use. Int Arch Occup Environ Health 64(7): 473-477 (1993).
33 Futagaki, S.K.: "Petroleum refinery workers exposure to PAHs at fluid catalytic cracker, coker, and asphalt processing units", 1983.
34 Lemke, L.D., L.E. Lamerato, X. Xu, J.C. Booza, J.J. Reiners, D.M. Raymond III et al.: Geospatial relationships of air pollution and acute asthma events across the Detroit–Windsor international border: Study design and preliminary results. Journal of Exposure Science and Environmental Epidemiology 24(4): 346-357 (2014).
35 Halsall, C.J., P.J. Coleman, B.J. Davis, V. Burnett, K.S. Waterhouse, P. Harding-Jones et al.: Polycyclic Aromatic Hydrocarbons in U.K. Urban Air. Environmental Science and Technology 28(13): 2380-2386 (1994).
36 IARC: Polynuclear Aromatic Compounds, Part 2: Carbon Blacks, Mineral Oils (Lubricant Base Oils and Derived Products) and Some Nitroarenes. In IARC Monographs on the Evaluation of Carcinogenic Risks to Humans, 1984.
37 Armstrong, B., C. Tremblay, D. Baris, and G. Theriault: Lung cancer mortality and polynuclear aromatic hydrocarbons: a case-cohort study of aluminum production workers in Arvida, Quebec, Canada. Am J Epidemiol 139(3): 250-262 (1994).
38 Armstrong, B.G., C.G. Tremblay, D. Cyr, and G.P. 
Theriault: Estimating the relationship between exposure to tar volatiles and the incidence of bladder cancer in aluminum smelter workers. Scand J Work Environ Health 12(5): 486-493 (1986).
39 Panek, J., and J. Grasser: Practical experience gained during the first twenty years of operation of the great plains gasification plant and implications for future projects. US Department of Energy-Office of Fossil Energy, Washington (2006).
40 World Health Organization: Air quality guidelines for Europe (1987).
41 Kreyberg, L.: 3: 4-Benzpyrene in industrial air pollution: Some reflexions. British journal of cancer 13(4): 618 (1959).
42 Lawther, P., B. Commins, and R. Waller: A study of the concentrations of polycyclic aromatic hydrocarbons in gas works retort houses. Br J Ind Med 22(1): 13-20 (1965).
43 Mašek, V.: Benzo (a) pyrene in the workplace atmosphere of coal and pitch coking plants. Journal of Occupational and Environmental Medicine 13(4): 193-198 (1971).
44 Verma, D., D. Muir, S. Cunliffe, J. Julian, J. Vogt, J. Rosenfeld et al.: Polycyclic aromatic hydrocarbons in Ontario foundry environments. Annals of occupational hygiene 25(1): 17-25 (1982).
45 Schimberg, R.W., P. Pfäffli, and A. Tossavainen: Polycyclic aromatic hydrocarbons in foundries. Journal of Toxicology and Environmental Health, Part A Current Issues 6(5-6): 1187-1194 (1980).
46 Faulds, A., Z. Waszczylo, and K. Westaway: Polynuclear aromatic hydrocarbons in the underground mine environment. CIM Bull 74(835): 84-90 (1981).
47 IARC: Diesel and gasoline engine exhausts and some nitroarenes. In IARC Monographs on the Evaluation of Carcinogenic Risks to Humans, pp. 1-458, 1989.
48 United States Environmental Protection Agency: Health assessment document for diesel engine exhaust: National Center for Environmental Assessment, 2002.
49 Knecht, U., H. Elliehausen, W. Judas, and H. Woitowitz: Polycyclic aromatic hydrocarbons (PAH) in abraded particles of brake and clutch linings. 
International journal of environmental analytical chemistry 28(3): 227-236 (1987).
50 National Research Council: Health Effects of Exposure to Diesel Exhaust: The Report of the Health Effects Panel of the Diesel Impacts Study Committee, National Research Council: National Academy Press, 1981.
51 Mi, H.-H., W.-J. Lee, P.-J. Tsai, and C.-B. Chen: A comparison on the emission of polycyclic aromatic hydrocarbons and their corresponding carcinogenic potencies from a vehicle engine using leaded and lead-free gasoline. Environ Health Perspect 109(12): 1285 (2001).
52 Speizer, F.E.: Overview of the risk of respiratory cancer from airborne contaminants. Environ Health Perspect 70: 9 (1986).
53 Commins, B., R. Waller, and P. Lawther: Air pollution in diesel bus garages. Br J Ind Med 14(4): 232 (1957).
54 Waller, R.: Trends in lung cancer in London in relation to exposure to diesel fumes. Environment International 5(4): 479-483 (1981).
55 Colmsjö, A.L., Y.U. Zebühr, C.E. Östman, Å. Wädding, and H. Söderström: Polynuclear aromatic compounds in the ambient air of Stockholm. Chemosphere 15(2): 169-182 (1986).
56 Hammond, E.C., I.J. Selikoff, P.L. Lawther, and H. Seidman: Inhalation of benzpyrene and cancer in man. Annals of the New York Academy of Sciences 271(1): 116-124 (1976).
57 Darby, F., A. Willis, and R. Winchester: Occupational health hazards from road construction and sealing work. Annals of occupational hygiene 30(4): 445-454 (1986).
58 Burstyn, I., H. Kromhout, T. Kauppinen, P. Heikkilä, and P. Boffetta: Statistical modelling of the determinants of historical exposure to bitumen and polycyclic aromatic hydrocarbons among paving workers. Annals of occupational hygiene 44(1): 43-56 (2000).
59 IARC: Polynuclear aromatic compounds, part 4: bitumens, coal-tars and derived products, shale-oils and soots. In IARC monographs on the evaluation of the carcinogenic risk of chemicals to humans. Lyon, France: International Agency for Research on Cancer, 1985.
60 Sawicki, E., F.T. Fox, W. Elbert, T. Hauser, and J. Meeker: Polynuclear aromatic hydrocarbon composition of air polluted by coal-tar pitch fumes. Am Ind Hyg Assoc J 23(6): 482-486 (1962).
61 Choosong, T., J. Chomanee, P. Tekasakul, S. Tekasakul, Y. Otani, M. Hata et al.: Workplace environment and personal exposure of PM and PAHs to workers in natural rubber sheet factories contaminated by wood burning smoke. Aerosol and Air Quality Research 10(1): 8-21 (2010).
62 Burstyn, I., and H. Kromhout: Are the members of a paving crew uniformly exposed to bitumen fume, organic vapor, and benzo (a) pyrene? Risk Analysis 20(5): 653-664 (2000).
63 Hansen, E.S.: Cancer incidence in an occupational cohort exposed to bitumen fumes. Scand J Work Environ Health: 101-105 (1989).
64 Burstyn, I., H. Kromhout, and P. Boffetta: Literature review of levels and determinants of exposure to potential carcinogens and other agents in the road construction industry. AIHAJ-American Industrial Hygiene Association 61(5): 715-726 (2000).
65 Nisbet, I.C., and P.K. LaGoy: Toxic equivalency factors (TEFs) for polycyclic aromatic hydrocarbons (PAHs). Regulatory toxicology and pharmacology 16(3): 290-300 (1992).
66 Petry, T., P. Schmid, and C. Schlatter: The use of toxic equivalency factors in assessing occupational and environmental health risk associated with exposure to airborne mixtures of polycyclic aromatic hydrocarbons (PAHs). Chemosphere 32(4): 639-648 (1996).
67 Elovaara, E., P. Heikkilä, L. Pyy, P. Mutanen, and V. Riihimäki: Significance of dermal and respiratory uptake in creosote workers: exposure to polycyclic aromatic hydrocarbons and urinary excretion of 1-hydroxypyrene. Occup Environ Med 52(3): 196-203 (1995).
68 Li, C.-T., W.-J. Lee, C.-H. Wu, and Y.-T. Wang: PAH emission from waste ion-exchange resin incineration. Science of the total environment 155(3): 253-265 (1994).
69 Li, C.-T., W.-J. Lee, H.-H. Mi, and C.-C. 
Su: PAH emission from the incineration of waste oily sludge and PE plastic mixtures. Science of the total environment 170(3): 171-183 (1995).
70 Mi, H.-H., C.-F. Chiang, C.-C. Lai, L.-C. Wang, and H.-H. Yang: Comparison of PAH emission from a municipal waste incinerator and mobile sources. Aerosol and Air Quality Research 1(1): 83-90 (2001).
71 Huang, C.-S., H.-D. Chern, K.-J. Chang, C.-W. Cheng, S.-M. Hsu, and C.-Y. Shen: Breast Cancer Risk Associated with Genotype Polymorphism of the Estrogen-metabolizing Genes CYP17, CYP1A1, and COMT A Multigenic Study on Cancer Susceptibility. Cancer research 59(19): 4870-4875 (1999).
72 CAREX Canada: "PAHs - Occupational Estimate." [Online] Available at http://www.carexcanada.ca/en/polycyclic_aromatic_hydrocarbons/occupational_estimate/ (Accessed July 17, 2012).
73 Shuguang, L., P. Dinhua, and W. Guoxiong: Analysis of polycyclic aromatic hydrocarbons in cooking oil fumes. Archives of Environmental Health: An International Journal 49(2): 119-122 (1994).
74 Ramdahl, T., I. Alfheim, S. Rustad, and T. Olsen: Chemical and biological characterization of emissions from small residential stoves burning wood and charcoal. Chemosphere 11(6): 601-611 (1982).
75 Kakareka, S.V., T.I. Kukharchyk, and V.S. Khomich: Study of PAH emission from the solid fuels combustion in residential furnaces. Environmental pollution 133(2): 383-387 (2005).
76 Fretheim, K.: Carcinogenic polycyclic aromatic hydrocarbons in Norwegian smoked meat sausages. J Agric Food Chem 24(5): 976-979 (1976).
77 Kuo, C.-Y., S.-H. Chang, Y.-C. Chien, F.-Y. Chiang, and Y.-C. Wei: Exposure to carcinogenic PAHs for the vendors of broiled food. Journal of Exposure Science and Environmental Epidemiology 16(5): 410-416 (2006).
78 Pan, C., C. Chan, Y. Huang, and K. Wu: Urinary 1-hydroxypyrene and malondialdehyde in male workers in Chinese restaurants. Occup Environ Med 65(11): 732-735 (2008).
79 Chiang, T.-A., P.-F. Wu, and Y.-C. 
Ko: Identification of carcinogens in cooking oil fumes. Environmental research 81(1): 18-22 (1999).
80 United States Census Bureau: "Equal Employment Opportunity (EEO) Tabulation 2006–2010 (5-year ACS data)." [Online] Available at http://www.census.gov/people/eeotabulation/data/eeotables20062010.html (Accessed September 28, 2016).
81 Lloyd, J.W.: Long-Term Mortality Study of Steelworkers: V. Respiratory Cancer in Coke Plant Workers. Journal of Occupational and Environmental Medicine 13(2): 53-68 (1971).
82 Koskela, R.-S., S. Hernberg, R. Kärävä, E. Järvinen, and M. Nurminen: A mortality study of foundry workers. Scand J Work Environ Health: 73-89 (1976).
83 Gibson, E.S., R. Martin, and J. Lockington: Lung Cancer Mortality in a Steel Foundry. Journal of Occupational and Environmental Medicine 19(12): 807-812 (1977).
84 Andjelkovich, D.A., R.M. Mathew, R.B. Richardson, and R.J. Levine: Mortality of iron foundry workers: I. Overall findings. Journal of Occupational and Environmental Medicine 32(6): 529-540 (1990).
85 Andjelkovich, D.A., C.M. Shy, M.H. Brown, D.B. Janszen, R.J. Levine, and R.B. Richardson: Mortality of Iron Foundry Workers: III. Lung Cancer Case-Control Study. Journal of Occupational and Environmental Medicine 36(12): 1301-1309 (1994).
86 Tremblay, C., B. Armstrong, G. Theriault, and J. Brodeur: Estimation of risk of developing bladder cancer among workers exposed to coal tar pitch volatiles in the primary aluminum industry. Am J Ind Med 27(3): 335-348 (1995).
87 Spinelli, J.J., P.A. Demers, N.D. Le, M.D. Friesen, M.F. Lorenzi, R. Fang et al.: Cancer risk in aluminum reduction plant workers (Canada). Cancer Causes & Control 17(7): 939-948 (2006).
88 Gelboin, H.V., and P.O.P. Ts'o: Polycyclic Hydrocarbons and Cancer, Volume 1: Environment, Chemistry and Metabolism. New York: Academic Press, 1978.
89 Gelboin, H.V.: Benzo [alpha] pyrene metabolism, activation and carcinogenesis: role and regulation of mixed-function oxidases and related enzymes. 
Physiological reviews 60(4): 1107-1166 (1980).
90 Hall, M., and P.L. Grover: Polycyclic aromatic hydrocarbon: metabolism, activation and tumor initiation. In Chemical Carcinogenesis and Mutagenesis Vol 1 (Cooper, C.S., Grover, P.L.). Springer-Verlag: 327-372 (1990).
91 Cavalieri, E.L., and E.G. Rogan: The approach to understanding aromatic hydrocarbon carcinogenesis. The central role of radical cations in metabolic activation. Pharmacology & Therapeutics 55(2): 183-194 (1992).
92 Cavalieri, E.L., and E.G. Rogan: Central role of radical cations in metabolic activation of polycyclic aromatic hydrocarbons. Xenobiotica 25(7): 677-688 (1995).
93 Penning, T.M., S.T. Ohnishi, T. Ohnishi, and R.G. Harvey: Generation of reactive oxygen species during the enzymatic oxidation of polycyclic aromatic hydrocarbon trans-dihydrodiols catalyzed by dihydrodiol dehydrogenase. Chemical research in toxicology 1: 84-92 (1996).
94 Penning, T.M., M.E. Burczynski, C.-F. Hung, K.D. McCoull, N.T. Palackal, and L.S. Tsuruda: Dihydrodiol dehydrogenases and polycyclic aromatic hydrocarbon activation: generation of reactive and redox active o-quinones. Chemical research in toxicology 12(1): 1-18 (1999).
95 Miller, E.C.: Some current perspectives on chemical carcinogenesis in humans and experimental animals: presidential address. Cancer research 38(6): 1479-1496 (1978).
96 Maugh, T.H.: Tracking Exposure to Toxic Substances: New ways to measure human exposure to carcinogens and mutagens provide a foundation for more precise assessment of risks. Science 226(4679): 1183-1184 (1984).
97 Harris, C.C.: Future directions in the use of DNA adducts as internal dosimeters for monitoring human exposure to environmental mutagens and carcinogens. Environ Health Perspect 62: 185 (1985).
98 Muller, P.: Scientific Criteria Document for Multimedia Standards Development, Polycyclic Aromatic Hydrocarbons (PAH). 
Part 1, Hazard Identification and Dose-response Assessment: Report: Standards Development Branch, Ontario Ministry of Environment and Energy, 1997. 99 Palackal, N.T., S.H. Lee, R.G. Harvey, I.A. Blair, and T.M. Penning: Activation of Polycyclic Aromatic Hydrocarbon trans-Dihydrodiol Proximate Carcinogens by Human Aldo-keto Reductase (AKR1C) Enzymes and Their Functional Overexpression in Human Lung Carcinoma (A549) Cells. Journal of Biological Chemistry 277(27): 24799-24808 (2002). 100 Cooper, C.S., P.L. Grover, and P. Sims: The metabolism and activation of benzo[a]pyrene. In Progress in Drug Metabolism (Bridges, J. W., Chasseaud, L. F.), pp. 295-396, 1983. 101 Hemminki, K.: DNA adducts: identification and biological significance. International Agency for Research on Cancer (1994). 102 Yu, D., J.A. Berlin, T.M. Penning, and J. Field: Reactive oxygen species generated by PAH o-quinones cause change-in-function mutations in p53. Chemical research in toxicology 15(6): 832-842 (2002). 103 Canadian Cancer Society: "Canadian Cancer Statistics 2015". In Canadian Cancer Statistics. Toronto, ON: Canadian Cancer Society’s Advisory Committee on Cancer Statistics, 2015. 104 American Cancer Society: "Breast Cancer Facts & Figures 2013-2014". In American Cancer Society. Atlanta, GA, 2013. 105 Longnecker, M.P., L. Bernstein, A. Paganini-Hill, S.M. Enger, and R.K. Ross: Risk factors for in situ breast cancer. Cancer Epidemiology Biomarkers & Prevention 5(12): 961-965 (1996). 106 Trentham-Dietz, A., P.A. Newcomb, B.E. Storer, and P.L. Remington: Risk factors for carcinoma in situ of the breast. Cancer Epidemiology Biomarkers & Prevention 9(7): 697-703 (2000). 107 Schottenfeld, D., and J.F. Fraumeni Jr: Cancer Epidemiology and Prevention: Oxford University Press, USA, 2006.   132 108 Ernster, V.L., R. Ballard-Barbash, W.E. Barlow, Y. Zheng, D.L. Weaver, G. Cutter et al.: Detection of ductal carcinoma in situ in women undergoing screening mammography. 
Journal of the National Cancer Institute 94(20): 1546-1554 (2002). 109 Public Health Agency of Canada: "Organized Breast Cancer Screening Programs in Canada - Report on Program Performance in 2005 and 2006". Ottawa, Ontario, 2011. 110 Hayes, D.: Atlas of Breast Cancer: Mosby, 2000. 111 Yang, X.R., M.E. Sherman, D.L. Rimm, J. Lissowska, L.A. Brinton, B. Peplonska et al.: Differences in risk factors for breast cancer molecular subtypes in a population-based study. Cancer Epidemiology Biomarkers & Prevention 16(3): 439-443 (2007). 112 Bernstein, L.: Epidemiology of endocrine-related risk factors for breast cancer. Journal of mammary gland biology and neoplasia 7(1): 3-15 (2002). 113 Colditz, G.A., and B. Rosner: Cumulative risk of breast cancer to age 70 years according to risk factor status: data from the Nurses' Health Study. Am J Epidemiol 152(10): 950-964 (2000). 114 Rosner, B., G.A. Colditz, and W.C. Willett: Reproductive risk factors in a prospective study of breast cancer: the Nurses' Health Study. Am J Epidemiol 139(8): 819-835 (1994). 115 Collaborative Group on Hormonal Factors in Breast Cancer: Breast cancer and breastfeeding: collaborative reanalysis of individual data from 47 epidemiological studies in 30 countries, including 50 302 women with breast cancer and 96 973 women without the disease. Lancet 360(9328): 187-195 (2002). 116 Key, T.J., P.K. Verkasalo, and E. Banks: Epidemiology of breast cancer. The lancet oncology 2(3): 133-140 (2001). 117 McPherson, K., C. Steel, and J. Dixon: Breast cancer—epidemiology, risk factors, and genetics. BMJ 321(7261): 624-628 (2000). 118 Gammon, M.D., J.B. Schoenberg, J.A. Britton, J.L. Kelsey, R.J. Coates, D. Brogan et al.: Recreational physical activity and breast cancer risk among women under age 45 years. Am J Epidemiol 147(3): 273-280 (1998). 119 Hankinson, S.E., W.C. Willett, J.E. Manson, D.J. Hunter, G.A. Colditz, M.J. 
Stampfer et al.: Alcohol, height, and adiposity in relation to estrogen and prolactin levels in postmenopausal women. Journal of the National Cancer Institute 87(17): 1297-1302 (1995).   133 120 Smith-Warner, S.A., D. Spiegelman, S.-S. Yaun, P.A. van den Brandt, A.R. Folsom, R.A. Goldbohm et al.: Alcohol and breast cancer in women: a pooled analysis of cohort studies. Jama 279(7): 535-540 (1998). 121 Hall, J.M., M.K. Lee, B. Newman, J.E. Morrow, L.A. Anderson, B. Huey et al.: Linkage of early-onset familial breast cancer to chromosome 17q21. Science 250(4988): 1684-1689 (1990). 122 Easton, D., D. Ford, and J. Peto: Inherited susceptibility to breast cancer. Cancer surveys 18: 95-113 (1992). 123 Easton, D.F., D.T. Bishop, and D. Ford: Genetic linkage analysis in familial breast and ovarian cancer: results from 214 families. The Breast Cancer Linkage Consortium. American Journal of Human Genetics 52(4): 678-701 (1993). 124 Struewing, J.P., P. Hartge, S. Wacholder, S.M. Baker, M. Berlin, M. McAdams et al.: The risk of cancer associated with specific mutations of BRCA1 and BRCA2 among Ashkenazi Jews. New England Journal of Medicine 336(20): 1401-1408 (1997). 125 Ford, D., D.F. Easton, M. Stratton, S. Narod, D. Goldgar, P. Devilee et al.: Genetic Heterogeneity and Penetrance Analysis of the BRCA1 and BRCA2 Genes in Breast Cancer Families. American Journal of Human Genetics 62(3): 676-689 (1998). 126 Egan, K.M., M.J. Stampfer, D. Hunter, S. Hankinson, B.A. Rosner, M. Holmes et al.: Active and passive smoking in breast cancer: prospective results from the Nurses’ Health Study. Epidemiology 13(2): 138-145 (2002). 127 Collaborative Group on Hormonal Factors in Breast Cancer: Alcohol, tobacco and breast cancer--collaborative reanalysis of individual data from 53 epidemiological studies, including 58,515 women with breast cancer and 95,067 women without the disease. British journal of cancer 87(11): 1234-1245 (2002). 128 Terry, P.D., and T.E. 
Rohan: Cigarette smoking and the risk of breast cancer in women a review of the literature. Cancer Epidemiology Biomarkers & Prevention 11(10): 953-971 (2002). 129 White, A.J., P.T. Bradshaw, A.H. Herring, S.L. Teitelbaum, J. Beyea, S.D. Stellman et al.: Exposure to multiple sources of polycyclic aromatic hydrocarbons and breast cancer incidence. Environment International 89: 185-192 (2016). 130 Li, D.H., M.Y. Wang, K. Dhingra, and W.N. Hittelman: Aromatic DNA Adducts in Adjacent Tissues of Breast Cancer Patients: Clues to Breast Cancer Etiology. Cancer research 56(2): 287-293 (1996).   134 131 Gammon, M.D., R.M. Santella, A.I. Neugut, S.M. Eng, S.L. Teitelbaum, A. Paykin et al.: Environmental toxins and breast cancer on Long Island. I. Polycyclic aromatic hydrocarbon DNA adducts. Cancer Epidemiology Biomarkers & Prevention 11(8): 677-685 (2002). 132 Rundle, A., D. Tang, H. Hibshoosh, A. Estabrook, F. Schnabel, W.F. Cao et al.: The relationship between genetic damage from polycyclic aromatic hydrocarbons in breast tissue and breast cancer. Carcinogenesis 21(7): 1281-1289 (2000). 133 Mane, S.S., D.M. Purnell, and I.C. Hsu: Genotoxic effects of five polycyclic aromatic hydrocarbons in human and rat mammary epithelial cells. Environmental and molecular mutagenesis 15(2): 78-82 (1990). 134 Eldridge, S.R., M.N. Gould, and B.E. Butterworth: Genotoxicity of Environmental Agents in Human Mammary Epithelial Cells. Cancer research 52(20): 5617-5621 (1992). 135 Calaf, G., and J. Russo: Transformation of human breast epithelial cells by chemical carcinogens. Carcinogenesis 14(3): 483-492 (1993). 136 Larsen, M.C., W.G. Angus, P.B. Brake, S.E. Eltom, K.A. Sukow, and C.R. Jefcoate: Characterization of CYP1B1 and CYP1A1 expression in human mammary epithelial cells: role of the aryl hydrocarbon receptor in polycyclic aromatic hydrocarbon metabolism. Cancer research 58(11): 2366-2374 (1998). 137 Modica, R., M. Fiume, A. Guaitani, and I. 
Bartosek: Comparative kinetics of benz (a) anthracene, chrysene and triphenylene in rats after oral administration: I. Study with single compounds. Toxicology letters 18(1): 103-109 (1983). 138 Henderson, B.E., and H.S. Feigelson: Hormonal carcinogenesis. Carcinogenesis 21(3): 427-433 (2000). 139 Bulun, S.E., K. Zeitoun, H. Sasano, and E.R. Simpson: Aromatase in aging women. In Seminars in reproductive endocrinology, pp. 349-358, 1998. 140 Reichman, M.E., J.T. Judd, C. Longcope, A. Schatzkin, B.A. Clevidence, P.P. Nair et al.: Effects of alcohol consumption on plasma and urinary hormone concentrations in premenopausal women. Journal of the National Cancer Institute 85(9): 722-727 (1993). 141 Ohtake, F., K.-i. Takeyama, T. Matsumoto, H. Kitagawa, Y. Yamamoto, K. Nohara et al.: Modulation of oestrogen receptor signalling by association with the activated dioxin receptor. Nature 423(6939): 545-550 (2003). 142 Chaloupka, K., V. Krishnan, and S. Safe: Polynuclear aromatic hydrocarbon carcinogens as antiestrogens in MCF-7 human breast cancer cells: role of the Ah receptor. Carcinogenesis 13(12): 2233-2239 (1992).   135 143 Santodonato, J.: Review of the estrogenic and antiestrogenic activity of polycyclic aromatic hydrocarbons: relationship to carcinogenicity. Chemosphere 34(4): 835-848 (1997). 144 Arcaro, K.F., P.W. O’Keefe, Y. Yang, W. Clayton, and J.F. Gierthy: Antiestrogenicity of environmental polycyclic aromatic hydrocarbons in human breast cancer cells. Toxicology 133(2): 115-127 (1999). 145 Cavalieri, E., D. Stack, P. Devanesan, R. Todorovic, I. Dwivedy, S. Higginbotham et al.: Molecular origin of cancer: catechol estrogen-3, 4-quinones as endogenous tumor initiators. Proceedings of the National Academy of Sciences 94(20): 10937-10942 (1997). 146 Liehr, J.G.: Is Estradiol a Genotoxic Mutagenic Carcinogen? 1. Endocrine reviews 21(1): 40-54 (2000). 147 Bennett, I.C., M. Gattas, and B.T. Teh: The genetic basis of breast cancer and its clinical implications. 
Australian and New Zealand journal of surgery 69(2): 95-105 (1999). 148 Swift, M., D. Morrell, R.B. Massey, and C.L. Chase: Incidence of cancer in 161 families affected by ataxia–telangiectasia. New England Journal of Medicine 325(26): 1831-1836 (1991). 149 Thompson, D., S. Duedal, J. Kirner, L. McGuffog, J. Last, A. Reiman et al.: Cancer risks and mortality in heterozygous ATM mutation carriers. Journal of the National Cancer Institute 97(11): 813-822 (2005). 150 Li, J., C. Yen, D. Liaw, K. Podsypanina, S. Bose, S.I. Wang et al.: PTEN, a putative protein tyrosine phosphatase gene mutated in human brain, breast, and prostate cancer. Science 275(5308): 1943-1947 (1997). 151 Foulkes, W.D., J. Rosenblatt, and P.O. Chappuis: The contribution of inherited factors to the clinicopathological features and behavior of breast cancer. Journal of mammary gland biology and neoplasia 6(4): 453-465 (2001). 152 Shimada, T., J. Watanabe, K. Kawajiri, T.R. Sutter, P.D. Inskip, E.M.J. Gillam et al.: Catalytic properties of polymorphic human cytochrome P450 1B1 variants. Carcinogenesis 20(8): 1607-1613 (1999). 153 Hanna, I.H., S. Dawling, N. Roodi, F.P. Guengerich, and F.F. Parl: Cytochrome P450 1B1 (CYP1B1) Pharmacogenetics: Association of Polymorphisms with Functional Differences in Estrogen Hydroxylation Activity. Cancer research 60(13): 3440-3444 (2000).   136 154 Badawi, A.F., E.L. Cavalieri, and E.G. Rogan: Role of human cytochrome P450 1A1, 1A2, 1B1, and 3A4 in the 2-, 4-, and 16[alpha]-hydroxylation of 17[beta]-estradiol. Metabolism 50(9): 1001-1003 (2001). 155 Li, D.N., A. Seidel, M.P. Pritchard, C.R. Wolf, and T. Friedberg: Polymorphisms in P450 CYP1B1 affect the conversion of estradiol to the potentially carcinogenic metabolite 4-hydroxyestradiol. Pharmacogenetics 10(4): 343-353 (2000). 156 Shimada, T., and Y. Fujii‐Kuriyama: Metabolic activation of polycyclic aromatic hydrocarbons to carcinogens by cytochromes P450 1A1 and1B1. Cancer science 95(1): 1-6 (2004). 
157 Uno, S., T.P. Dalton, S. Derkenne, C.P. Curran, M.L. Miller, H.G. Shertzer et al.: Oral exposure to benzo [a] pyrene in the mouse: detoxication by inducible cytochrome P450 is more important than metabolic activation. Molecular pharmacology 65(5): 1225-1237 (2004). 158 Uno, S., T.P. Dalton, N. Dragin, C.P. Curran, S. Derkenne, M.L. Miller et al.: Oral benzo [a] pyrene in Cyp1 knockout mouse lines: CYP1A1 important in detoxication, CYP1B1 metabolism required for immune damage independent of total-body burden and clearance rate. Molecular pharmacology 69(4): 1103-1114 (2006). 159 Hunter, D.J.: Gene–environment interactions in human diseases. Nature Reviews Genetics 6(4): 287-298 (2005). 160 MacMahon, B.: Gene-environment Interaction in Human Disease. Journal of Psychiatric Research 6(Supplemental 1): 393-402 (1968). 161 Kelada, S.N., D.L. Eaton, S.S. Wang, N.R. Rothman, and M.J. Khoury: The role of genetic polymorphisms in environmental health. Environmental Health Perspective 8(1055): 1064 (2003). 162 Yu, W., M. Gwinn, M. Clyne, A. Yesupriya, and M.J. Khoury: A navigator for human genome epidemiology. Nature genetics 40(2): 124-125 (2008). 163 Centers for Disease Control and Prevention: "Public Health Genomics Knowledge Base (v1.2): HuGE Literature Finder." [Online] Available at https://phgkb.cdc.gov/HuGENavigator/startPagePubLit.do (Accessed September 27, 2016). 164 Aronson, K.J., S. Campbell, J. Faith, C. Friedenreich, M. Goldberg, M.-G. Hollm et al.: "Review of lifestyle and environmental risk factors for breast cancer: Report of the working group on the primary prevention of breast cancer", 2001.   137 165 Ford, D., D.F. Easton, and J. Peto: Estimates of the gene frequency of BRCA1 and its contribution to breast and ovarian cancer incidence. American Journal of Human Genetics 57(6): 1457-1462 (1995). 166 Thorlacius, S., J. Struewing, P. Hartage, G.H. Olafsdottir, H. Sigvaldason, L. 
Tryggvadottir et al.: Population-based study of risk of breast cancer in carriers of BRCA2 mutation. The lancet 352(9137): 1337-1339 (1998). 167 Mucci, L.A., S. Wedren, R.M. Tamimi, D. Trichopoulos, and H.O. Adami: The role of gene-environment interaction in the aetiology of human cancer: examples from cancers of the large bowel, lung and breast. Journal of Internal Medicine 249(6): 477-493 (2001). 168 Khoury, M.J.: Genetic epidemiology. In Modern Epidemiology, K.J. Rothman and S. Greenland (eds.), pp. 609-622. Philadelphia, PA: Lippincott-Raven, 1998. 169 Goldman, L.R., M. Gomez, S. Greenfield, L. Hall, B.S. Hulka, W.E. Kaye et al.: Use of exposure databases for status and trends analysis. Arch Environ Health 47(6): 430-438 (1992). 170 LaMontagne, A.D., R.F. Herrick, M.V. Van Dyke, J.W. Martyny, and A.J. Ruttenber: Exposure databases and exposure surveillance: promise and practice. AIHA J (Fairfax, Va) 63(2): 205-212 (2002). 171 Stewart, P.A., and C. Rice: A Source of Exposure Data for Occupational Epidemiology Studies. Applied Occupational and Environmental Hygiene 5(6): 359-363 (1990). 172 Lavoue, J., M.C. Friesen, and I. Burstyn: Workplace measurements by the U.S. Occupational Safety and Health Administration since 1979: Descriptive analysis and potential uses for exposure assessment. Ann Occup Hyg 57(5): 681-683 (2013). 173 Hamm, M.P., and I. Burstyn: Estimating occupational beryllium exposure from compliance monitoring data. Arch Environ Occup Health 66(2): 75-86 (2011). 174 Henneberger, P.K., S.K. Goe, W.E. Miller, B. Doney, and D.W. Groce: Industries in the United States with airborne beryllium exposure and estimates of the number of current workers potentially exposed. J Occup Environ Hyg 1(10): 648-659 (2004). 175 Lavoue, J., R. Vincent, and M. Gerin: Formaldehyde exposure in U.S. industries from OSHA air sampling data. J Occup Environ Hyg 5(9): 575-587 (2008). 176 Olsen, E., B. Laursen, and P.S. 
Vinzents: Bias and random errors in historical data of exposure to organic solvents. Am Ind Hyg Assoc J 52(5): 204-211 (1991). 177 ACGIH: TLVs and BEIs—Threshold limit values for chemical substances and physical agents biological exposure indices: ACGIH Worldwide ISBN, 2008.   138 178 Friesen, M.C., H.W. Davies, K. Teschke, S. Marion, and P.A. Demers: Predicting historical dust and wood dust exposure in sawmills: model development and validation. J Occup Environ Hyg 2(12): 650-658 (2005). 179 Occupational Safety and Health Administration: "Chemical Exposure Health Data." [Online] Available at https://www.osha.gov/opengov/healthsamples.html (Accessed November 11, 2011). 180 Occupational Safety and Health Administration: "SIC Manual." [Online] Available at http://www.osha.gov/pls/imis/sic_manual.html (Accessed September 17, 2014). 181 Cohen, J.: The Cost of Dichotomization. Applied Psychological Measurement 7(3): 249-253 (1983). 182 Sterne, J.A., I.R. White, J.B. Carlin, M. Spratt, P. Royston, M.G. Kenward et al.: Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ 338: b2393 (2009). 183 Shentu, Y., and M. Xie: A note on dichotomization of continuous response variable in the presence of contamination and model misspecification. Stat Med 29(21): 2200-2214 (2010). 184 United States Census Bureau: "North American Industry Classification System - Concordances." [Online] Available at https://www.census.gov/eos/www/naics/concordances/concordances.html (Accessed December 20, 2014). 185 Burstyn, I., A. Slutsky, D.G. Lee, A.B. Singer, Y. An, and Y.L. Michael: Beyond Crosswalks: Reliability of Exposure Assessment Following Automated Coding of Free-Text Job Descriptions for Occupational Epidemiology. Ann Occup Hyg (2014). 186 Pinheiro, J.C., and D.M. Bates: Mixed-effects models in S and S-PLUS. New York: Springer, 2000. 187 Menard, S.: Applied logistic regression analysis: Sage, 2002. 188 Bates, D., M. Maechler, and B. 
Bolker: lme4: Linear mixed-effects models using S4 classes(2012). 189 Faraway, J.J.: Extending the linear model with R : generalized linear, mixed effects and nonparametric regression models. Boca Raton: Chapman & Hall/CRC, 2006. 190 Bland, J.M., and D.G. Altman: Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1(8476): 307-310 (1986).   139 191 Sauvain, J.J., T. Vu Duc, and M. Guillemin: Exposure to carcinogenic polycyclic aromatic compounds and health risk assessment for diesel-exhaust exposed workers. Int Arch Occup Environ Health 76(6): 443-455 (2003). 192 Fallentin, B., and O. Kamstrup: Simulation of past exposure in slag wool production. Ann Occup Hyg 37(4): 419-433 (1993). 193 Hueper, W.C.: Occupational cancer hazards found in industry. Ind Hyg Newsl 9(12): 7-9 (1949). 194 Kuljukka, T., R. Vaaranrinta, T. Veidebaum, M. Sorsa, and K. Peltonen: Exposure to PAH compounds among cokery workers in the oil shale industry. Environ Health Perspect 104 Suppl 3: 539-541 (1996). 195 Seidel, A., D. Dahmann, H. Krekeler, and J. Jacob: Biomonitoring of polycyclic aromatic compounds in the urine of mining workers occupationally exposed to diesel exhaust. Int J Hyg Environ Health 204(5-6): 333-338 (2002). 196 Fretheim, K.: Carcinogenic polycyclic aromatic hydrocarbons in Norwegian smoked meat sausages. J Agric Food Chem 24(5): 976-979 (1976). 197 Tuchsen, F., and L. Nordholm: Respiratory cancer in Danish bakers: a 10 year cohort study. Br J Ind Med 43(8): 516-521 (1986). 198 Zaebst, D.D., D.E. Clapp, L.M. Blade, D.A. Marlow, K. Steenland, R.W. Hornung et al.: Quantitative determination of trucking industry workers' exposures to diesel exhaust particles. Am Ind Hyg Assoc J 52(12): 529-541 (1991). 199 Burstyn, I.: Measurement error and model specification in determining how duration of tasks affects level of occupational exposure. Ann Occup Hyg 53(3): 265-270 (2009). 200 Checkoway, H., N. Pearce, and D. 
Kriebel: Research methods in occupational epidemiology. New York: Oxford University Press, 2004. 201 Kim, H.M., Y. Yasui, and I. Burstyn: Attenuation in risk estimates in logistic and Cox proportional-hazards models due to group-based exposure assessment strategy. Ann Occup Hyg 50(6): 623-635 (2006). 202 Gustafson, P.: Measurement error and misclassification in statistics and epidemiology : impacts and Bayesian adjustments. Boca Raton: Chapman & Hall/CRC, 2004. 203 Burstyn, I., J. Lavoue, and M. Van Tongeren: Aggregation of exposure level and probability into a single metric in job-exposure matrices creates bias. Ann Occup Hyg 56(9): 1038-1050 (2012).   140 204 Hopf, N.B., T. Carreon, and G. Talaska: Biological markers of carcinogenic exposure in the aluminum smelter industry--a systematic review. J Occup Environ Hyg 6(9): 562-581 (2009). 205 Tjoe Ny, E., D. Heederik, H. Kromhout, and F. Jongeneelen: The relationship between polycyclic aromatic hydrocarbons in air and in urine of workers in a Soderberg potroom. Am Ind Hyg Assoc J 54(6): 277-284 (1993). 206 Sims, P., and P. Grover: Epoxides in polycyclic aromatic hydrocarbon metabolism and carcinogenesis. Advances in cancer research 20: 165-274 (1974). 207 Conney, A.H.: Induction of microsomal enzymes by foreign chemicals and carcinogenesis by polycyclic aromatic hydrocarbons: GHA Clowes Memorial Lecture. Cancer research 42(12): 4875-4917 (1982). 208 Baird, W.M., L.A. Hooven, and B. Mahadevan: Carcinogenic polycyclic aromatic hydrocarbon‐DNA adducts and mechanism of action. Environmental and molecular mutagenesis 45(2‐3): 106-114 (2005). 209 Shimada, T.: Xenobiotic-metabolizing enzymes involved in activation and detoxification of carcinogenic polycyclic aromatic hydrocarbons. Drug metabolism and pharmacokinetics 21(4): 257-276 (2006). 210 Rendic, S., and F.J.D. Carlo: Human cytochrome P450 enzymes: a status report summarizing their reactions, substrates, inducers, and inhibitors. 
Drug metabolism reviews 29(1-2): 413-580 (1997). 211 Anzenbacher, P., and E. Anzenbacherova: Cytochromes P450 and metabolism of xenobiotics. Cellular and Molecular Life Sciences CMLS 58(5-6): 737-747 (2001). 212 Bonner, M.R., D. Han, J. Nie, P. Rogerson, J.E. Vena, P. Muti et al.: Breast cancer risk and exposure in early life to polycyclic aromatic hydrocarbons using total suspended particulates as a proxy measure. Cancer Epidemiology Biomarkers & Prevention 14(1): 53-60 (2005). 213 Gammon, M.D., S.K. Sagiv, S.M. Eng, S. Shantakumar, M.M. Gaudet, S.L. Teitelbaum et al.: Polycyclic aromatic hydrocarbon–DNA adducts and breast cancer: a pooled analysis. Archives of Environmental Health: An International Journal 59(12): 640-649 (2004). 214 Nie, J., J. Beyea, M.R. Bonner, D. Han, J.E. Vena, P. Rogerson et al.: Exposure to traffic emissions throughout life and risk of breast cancer: the Western New York Exposures and Breast Cancer (WEB) study. Cancer Causes & Control 18(9): 947-955 (2007).   141 215 Boffetta, P., N. Jourenkova, and P. Gustavsson: Cancer risk from occupational and environmental exposure to polycyclic aromatic hydrocarbons. Cancer Causes & Control 8(3): 444-472 (1997). 216 Bosetti, C., P. Boffetta, and C. La Vecchia: Occupational exposures to polycyclic aromatic hydrocarbons, and respiratory and urinary tract cancers: a quantitative review to 2005. Annals of Oncology 18(3): 431-446 (2007). 217 Petralia, S.A., J.E. Vena, J.L. Freudenheim, M. Dosemeci, A. Michalek, M.S. Goldberg et al.: Risk of premenopausal breast cancer in association with occupational exposure to polycyclic aromatic hydrocarbons and benzene. Scand J Work Environ Health: 215-221 (1999). 218 Hansen, J.: Elevated risk for male breast cancer after occupational exposure to gasoline and vehicular combustion products. Am J Ind Med 37(4): 349-352 (2000). 219 Labreche, F., M.S. Goldberg, M.-F. Valois, and L. Nadon: Postmenopausal breast cancer and occupational exposures. 
Occup Environ Med 67(4): 263-269 (2010). 220 Grundy, A., H. Richardson, I. Burstyn, C. Lohrisch, S.K. SenGupta, A.S. Lai et al.: Increased risk of breast cancer associated with long-term shift work in Canada. Occup Environ Med 70(12): 831-838 (2013). 221 Kobayashi, L.C., I. Janssen, H. Richardson, A.S. Lai, J.J. Spinelli, and K.J. Aronson: Moderate-to-vigorous intensity physical activity across the life course and risk of pre-and post-menopausal breast cancer. Breast cancer research and treatment 139(3): 851-861 (2013). 222 Grundy, A., J.M. Schuetz, A.S. Lai, R. Janoo-Gilani, S. Leach, I. Burstyn et al.: Shift work, circadian gene variants and risk of breast cancer. Cancer epidemiology 37(5): 606-612 (2013). 223 Shi, J., K.J. Aronson, A. Grundy, L.C. Kobayashi, I. Burstyn, J.M. Schuetz et al.: Polymorphisms of insulin-like growth factor 1 pathway genes and breast cancer risk. Frontiers in Oncology 6: 136 (2016). 224 Grundy, A., H. Richardson, J.M. Schuetz, I. Burstyn, J.J. Spinelli, A. Brooks‐Wilson et al.: DNA repair variants and breast cancer risk. Environmental and molecular mutagenesis 57(4): 269-281 (2016). 225 Friedenreich, C.M., K.S. Courneya, and H.E. Bryant: Influence of physical activity in different age and life periods on the risk of breast cancer. Epidemiology 12(6): 604-612 (2001).   142 226 Gomez, M., P. Cocco, M. Dosemeci, and P. Stewart: Occupational exposure to chlorinated aliphatic hydrocarbons: job exposure matrix. Am J Ind Med 26(2): 171-183 (1994). 227 Peters, S., R. Vermeulen, A. Cassidy, A.t. Mannetje, M. van Tongeren, P. Boffetta et al.: Comparison of exposure assessment methods for occupational carcinogens in a multi-centre lung cancer case–control study. Occup Environ Med 68(2): 148-153 (2011). 228 Slutsky, A., Y. An, T. Hu, and I. Burstyn: Automatic approaches to clustering occupational description data for prediction of probability of workplace exposure to beryllium. 
In Granular Computing (GrC), 2011 IEEE International Conference on, pp. 596-601: IEEE, 2011. 229 Peters, S., H. Kromhout, A.C. Olsson, H.-E. Wichmann, I. Brüske, D. Consonni et al.: Occupational exposure to organic dust increases lung cancer risk in the general population. Thorax: thoraxjnl-2011-200716 (2011). 230 Rothman, K.J., S. Greenland, and T.L. Lash: Modern epidemiology: Lippincott Williams & Wilkins, 2008. 231 Gwet, K.L.: Handbook of inter-rater reliability: The definitive guide to measuring the extent of agreement among raters: Advanced Analytics, LLC, 2014. 232 Pharoah, P.D., N.E. Day, S. Duffy, D.F. Easton, and B.A. Ponder: Family history and the risk of breast cancer: a systematic review and meta‐analysis. International Journal of Cancer 71(5): 800-809 (1997). 233 Ward, E., A. Jemal, V. Cokkinides, G.K. Singh, C. Cardinez, A. Ghafoor et al.: Cancer disparities by race/ethnicity and socioeconomic status. CA: a cancer journal for clinicians 54(2): 78-93 (2004). 234 Burstyn, I., F. de Vocht, and P. Gustafson: What do measures of agreement (κ) tell us about quality of exposure assessment? Theoretical analysis and numerical simulation. BMJ open 3(12): e003952 (2013). 235 Hoar, S.: Job exposure matrix methodology. Clinical Toxicology 21(1-2): 9-26 (1983). 236 Burstyn, I., J. Lavoué, and M. Van Tongeren: Aggregation of exposure level and probability into a single metric in job-exposure matrices creates bias. Annals of occupational hygiene 56(9): 1038-1050 (2012). 237 Williams, J.A., and D.H. Phillips: Mammary expression of xenobiotic metabolizing enzymes and their potential role in breast cancer. Cancer research 60(17): 4667-4677 (2000).   143 238 Kristensen, V.N., N. Harada, N. Yoshimura, E. Haraldsen, P. Lønning, B. Erikstein et al.: Genetic variants of CYP19 (aromatase) and breast cancer risk. Breast Cancer Res 2(Suppl 1): 1-2 (2000). 239 Zheng, W., D.-W. Xie, F. Jin, J.-R. Cheng, Q. Dai, W.-Q. 
Wen et al.: Genetic polymorphism of cytochrome P450–1B1 and risk of breast cancer. Cancer Epidemiology Biomarkers & Prevention 9(2): 147-150 (2000). 240 Ambrosone, C.B., J.L. Freudenheim, S. Graham, J.R. Marshall, J.E. Vena, J.R. Brasure et al.: Cigarette Smoking, N-Acetyltransferase 2 Genetic Polymorphisms, and Breast Cancer Risk. Journal of the American Medical Association 276(18): 1494-1501 (1996). 241 Ishibe, N., S.E. Hankinson, G.A. Colditz, D. Spiegelman, W.C. Willett, F.E. Speizer et al.: Cigarette smoking, cytochrome P450 1A1 polymorphisms, and breast cancer risk in the Nurses' Health Study. Cancer research 58(4): 667-671 (1998). 242 Zheng, W., A.C. Deitz, D.R. Campbell, W.-Q. Wen, J.R. Cerhan, T.A. Sellers et al.: N-acetyltransferase 1 genetic polymorphism, cigarette smoking, well-done meat intake, and breast cancer risk. Cancer Epidemiology Biomarkers & Prevention 8(3): 233-239 (1999). 243 Rundle, A., D. Tang, J. Zhou, S. Cho, and F. Perera: The association between glutathione S-transferase M1 genotype and polycyclic aromatic hydrocarbon-DNA adducts in breast tissue. Cancer Epidemiology Biomarkers & Prevention 9(10): 1079-1085 (2000). 244 Firozi, P.F., M.L. Bondy, A.A. Sahin, P. Chang, F. Lukmanji, E.S. Singletary et al.: Aromatic DNA adducts and polymorphisms of CYP1A1, NAT2, and GSTM1 in breast cancer. Carcinogenesis 23(2): 301-306 (2002). 245 Terry, M.B., M.D. Gammon, F.F. Zhang, S.M. Eng, S.K. Sagiv, A.B. Paykin et al.: Polymorphism in the DNA repair gene XPD, polycyclic aromatic hydrocarbon-DNA adducts, cigarette smoking, and breast cancer risk. Cancer Epidemiology Biomarkers & Prevention 13(12): 2053-2058 (2004). 246 Shen, J., M.D. Gammon, M.B. Terry, L. Wang, Q. Wang, F. Zhang et al.: Polymorphisms in XRCC1 modify the association between polycyclic aromatic hydrocarbon-DNA adducts, cigarette smoking, dietary antioxidants, and breast cancer risk. Cancer Epidemiology Biomarkers & Prevention 14(2): 336-342 (2005). 247 Crew, K.D., M.D. Gammon, M.B. 
Appendices

Supplementary Table A1: Descriptive Statistics of Concentrations of Coal Tar Pitch Volatiles (mg·m⁻³) in OSHA data by regrouped SIC Code

SIC | Label | TWA N¹ | TWA Mean (SD)¹ | TWA Median (IQR)¹ | TWA Geo Mean (Geo SD)¹ | All Data N² | All Data EF²
13 | Oil and Gas Extraction | 2 | 0.60 (0.29) | 0.60 (0.20) | 0.56 (1.42) | 2 | 100.0
1611 | Highway and Street Construction, Except Elevated Highways | 21 | 0.11 (0.10) | 0.06 (0.05) | 0.09 (2.08) | 32 | 12.5
1622 | Bridge, Tunnel, and Elevated Highway Construction | 8 | 0.25 (0.38) | 0.11 (0.10) | 0.13 (2.59) | 9 | 22.2
1623 | Water, Sewer, Pipeline, and Communications and Power Line Construction | 33 | 1.84 (2.40) | 1.38 (1.51) | 0.90 (3.78) | 33 | 87.9
1629 | Heavy Construction, Not Elsewhere Classified | 5 | 0.13 (0.05) | 0.10 (0.04) | 0.12 (1.40) | 12 | 8.3
1711 | Plumbing, Heating, and Air-Conditioning | 8 | 1.36 (1.68) | 0.59 (1.53) | 0.65 (3.54) | 8 | 87.5
1721 | Painting and Paper Hanging | 9 | 0.34 (0.75) | 0.08 (0.07) | 0.09 (4.14) | 12 | 16.7
174 | Masonry, Stonework, Tile Setting, and Plastering | 7 | 0.15 (0.09) | 0.13 (0.14) | 0.13 (1.84) | 8 | 37.5
1761 | Roofing, Siding, and Sheet Metal Work | 192 | 0.77 (1.64) | 0.26 (0.53) | 0.26 (4.61) | 245 | 44.5
1771 | Concrete Work | 6 | 0.14 (0.10) | 0.12 (0.04) | 0.12 (1.66) | 7 | 14.3
1795 | Wrecking and Demolition | 5 | 0.52 (0.40) | 0.40 (0.71) | 0.38 (2.36) | 9 | 44.4
1799 | Special Trade Contractors, NEC | 14 | 4.96 (9.17) | 0.50 (4.07) | 1.01 (6.02) | 16 | 68.8
17 | Construction Special Trade Contractors | 3 | 0.16 (0.20) | 0.09 (0.19) | 0.08 (3.93) | 6 | 16.7
2011 | Meat Packing Plants | 9 | 0.67 (0.74) | 0.45 (0.30) | 0.49 (2.03) | 12 | 75.0
20 | Food and Kindred Products | 7 | 0.48 (0.48) | 0.37 (0.56) | 0.26 (3.55) | 11 | 36.4
2299 | Textile Goods, NEC | 3 | 0.07 (0.01) | 0.06 (0.01) | 0.07 (1.15) | 8 | 0.0
2396 | Automotive Trimmings, Apparel Findings, and Related Products | 5 | 0.12 (0.06) | 0.09 (0.08) | 0.11 (1.53) | 5 | 20.0
2399 | Fabricated Textile Products, NEC | 1 | 0.16 (0.00) | 0.16 (0.00) | 0.16 (1.00) | 5 | 0.0
2421 | Sawmills and Planing Mills, General | 4 | 0.79 (0.33) | 0.66 (0.26) | 0.75 (1.38) | 8 | 50.0
2491 | Wood Preserving | 16 | 0.08 (0.05) | 0.06 (0.06) | 0.06 (1.96) | 34 | 2.9
2493 | Reconstituted Wood Products | 5 | 0.30 (0.27) | 0.23 (0.40) | 0.16 (4.09) | 5 | 60.0
24 | Lumber and Wood Products, Except Furniture | 2 | 0.08 (0.02) | 0.08 (0.02) | 0.08 (1.22) | 6 | 0.0
25 | Furniture and Fixtures | 4 | 0.13 (0.15) | 0.07 (0.08) | 0.08 (2.47) | 5 | 20.0
2759 | Commercial Printing, NEC | 22 | 0.30 (0.48) | 0.12 (0.28) | 0.12 (3.82) | 23 | 26.1
2819 | Industrial Inorganic Chemicals, NEC | 8 | 0.31 (0.32) | 0.16 (0.29) | 0.20 (2.53) | 20 | 15.0
2824 | Manmade Organic Fibers, Except Cellulosic | 7 | 0.68 (0.25) | 0.74 (0.23) | 0.63 (1.60) | 7 | 100.0
2865 | Cyclic Organic Crudes and Intermediates, and Organic Dyes and Pigments | 13 | 0.47 (0.81) | 0.16 (0.24) | 0.18 (3.54) | 22 | 18.2
2891 | Adhesives and Sealants | 3 | 0.26 (0.02) | 0.27 (0.02) | 0.26 (1.06) | 10 | 30.0
2895 | Carbon Black | 13 | 0.21 (0.25) | 0.07 (0.26) | 0.10 (3.53) | 13 | 38.5
28 | Chemicals and Allied Products | 10 | 0.53 (0.72) | 0.22 (0.55) | 0.23 (3.99) | 17 | 29.4
2911 | Petroleum Refining | 17 | 0.09 (0.11) | 0.06 (0.06) | 0.06 (2.74) | 21 | 9.5
2951 | Asphalt Paving Mixtures and Blocks | 18 | 0.20 (0.21) | 0.07 (0.33) | 0.07 (7.73) | 21 | 33.3
2952 | Asphalt Felts and Coatings | 35 | 0.40 (0.58) | 0.24 (0.42) | 0.21 (3.19) | 42 | 42.9
2999 | Products of Petroleum and Coal, Not Elsewhere Classified | 14 | 0.42 (0.62) | 0.21 (0.42) | 0.22 (2.91) | 21 | 38.1
3011 | Tires and Inner Tubes | 60 | 0.55 (0.69) | 0.28 (0.55) | 0.27 (3.67) | 67 | 49.3
3053 | Gaskets, Packing, and Sealing Devices | 6 | 0.44 (0.29) | 0.41 (0.38) | 0.31 (2.74) | 8 | 62.5
3069 | Fabricated Rubber Products, Not Elsewhere Classified | 48 | 0.71 (0.79) | 0.40 (0.63) | 0.41 (3.00) | 56 | 66.1
308 | Miscellaneous Plastics Products | 1 | 0.07 (0.00) | 0.07 (0.00) | 0.07 (1.00) | 5 | 0.0
30 | Rubber and Miscellaneous Plastics Products | 5 | 0.48 (0.43) | 0.31 (0.06) | 0.38 (1.82) | 5 | 100.0
322 | Glass and Glassware, Pressed or Blown | 4 | 1.25 (1.19) | 1.27 (1.95) | 0.64 (3.83) | 5 | 60.0
3255 | Clay Refractories | 7 | 0.33 (0.46) | 0.10 (0.23) | 0.17 (2.77) | 8 | 25.0
3272 | Concrete Products, Except Block and Brick | 6 | 1.01 (1.03) | 0.77 (1.69) | 0.35 (7.01) | 11 | 36.4
3291 | Abrasive Products | 3 | 0.33 (0.24) | 0.33 (0.24) | 0.26 (2.17) | 11 | 18.2
3292 | Asbestos Products | 7 | 0.27 (0.38) | 0.13 (0.20) | 0.13 (3.44) | 7 | 42.9
3296 | Mineral Wool | 1 | 0.03 (0.00) | 0.03 (0.00) | 0.03 (1.00) | 5 | 0.0
3297 | Nonclay Refractories | 23 | 0.28 (0.26) | 0.18 (0.31) | 0.19 (2.51) | 26 | 42.3
3312 | Steel Works, Blast Furnaces (Including Coke Ovens), and Rolling Mills | 65 | 0.25 (0.35) | 0.13 (0.17) | 0.14 (3.00) | 78 | 29.5
3313 | Electrometallurgical Products, Except Steel | 5 | 0.12 (0.06) | 0.15 (0.10) | 0.10 (1.71) | 6 | 0.0
3317 | Steel Pipe and Tubes | 1 | 0.12 (0.00) | 0.12 (0.00) | 0.12 (1.00) | 8 | 0.0
3321 | Gray and Ductile Iron Foundries | 254 | 0.42 (0.68) | 0.19 (0.31) | 0.20 (3.30) | 301 | 39.5
3322 | Malleable Iron Foundries | 7 | 0.09 (0.07) | 0.10 (0.06) | 0.08 (1.93) | 11 | 9.1
3324 | Steel Investment Foundries | 7 | 0.11 (0.05) | 0.13 (0.07) | 0.10 (1.68) | 7 | 0.0
3325 | Steel Foundries, Not Elsewhere Classified | 21 | 0.13 (0.10) | 0.09 (0.11) | 0.09 (2.32) | 28 | 14.3
3334 | Primary Production of Aluminum | 193 | 0.27 (0.44) | 0.13 (0.19) | 0.13 (3.03) | 229 | 26.6
3341 | Secondary Smelting and Refining of Nonferrous Metals | 7 | 0.12 (0.13) | 0.09 (0.03) | 0.08 (2.63) | 12 | 8.3
3351 | Rolling, Drawing, and Extruding of Copper | 6 | 0.26 (0.19) | 0.18 (0.29) | 0.21 (1.89) | 6 | 33.3
3353 | Aluminum Sheet, Plate, and Foil | 22 | 0.23 (0.12) | 0.22 (0.20) | 0.19 (2.03) | 23 | 52.2
3355 | Aluminum Rolling and Drawing, Not Elsewhere Classified | 25 | 0.35 (0.41) | 0.18 (0.26) | 0.18 (3.25) | 42 | 26.2
3365 | Aluminum Foundries | 42 | 0.25 (0.31) | 0.14 (0.30) | 0.13 (3.84) | 53 | 32.1
3366 | Copper Foundries | 25 | 0.29 (0.29) | 0.20 (0.20) | 0.18 (2.63) | 37 | 35.1
3369 | Nonferrous Foundries, Except Aluminum and Copper | 4 | 0.08 (0.03) | 0.09 (0.02) | 0.08 (1.41) | 6 | 0.0
3398 | Metal Heat Treating | 16 | 0.19 (0.10) | 0.18 (0.06) | 0.16 (2.04) | 18 | 38.9
33 | Primary Metal Industries | 6 | 0.50 (0.74) | 0.17 (0.41) | 0.20 (4.24) | 6 | 50.0
3423 | Hand and Edge Tools, Except Machine Tools and Handsaws | 1 | 1.10 (0.00) | 1.10 (0.00) | 1.10 (1.00) | 9 | 11.1
3441 | Fabricated Structural Metal | 12 | 0.39 (0.28) | 0.27 (0.38) | 0.29 (2.32) | 12 | 75.0
3443 | Fabricated Plate Work (Boiler Shops) | 7 | 6.66 (17.11) | 0.17 (0.27) | 0.30 (9.13) | 10 | 30.0
3444 | Sheet Metal Work | 3 | 0.36 (0.21) | 0.43 (0.20) | 0.31 (1.84) | 5 | 40.0
3462 | Iron and Steel Forgings | 24 | 0.69 (1.51) | 0.28 (0.44) | 0.23 (4.55) | 39 | 38.5
3463 | Nonferrous Forgings | 5 | 0.21 (0.18) | 0.14 (0.25) | 0.15 (2.26) | 7 | 28.6
3465 | Automotive Stampings | 7 | 0.26 (0.17) | 0.20 (0.26) | 0.21 (2.13) | 8 | 37.5
3479 | Coating, Engraving, and Allied Services, Not Elsewhere Classified | 46 | 1.41 (6.50) | 0.14 (0.31) | 0.17 (5.28) | 61 | 23.0
3483 | Ammunition, Except for Small Arms | 4 | 0.22 (0.13) | 0.21 (0.11) | 0.20 (1.69) | 5 | 40.0
3496 | Miscellaneous Fabricated Wire Products | 4 | 0.38 (0.13) | 0.33 (0.11) | 0.36 (1.30) | 5 | 80.0
3498 | Fabricated Pipe and Pipe Fittings | 24 | 1.44 (2.24) | 0.42 (1.59) | 0.43 (5.70) | 26 | 53.8
349 | Miscellaneous Fabricated Metal Products | 9 | 0.59 (0.62) | 0.36 (0.65) | 0.35 (2.99) | 11 | 54.5
34 | Fabricated Metal Products, Except Machinery and Transport. Equipment | 15 | 1.08 (2.26) | 0.23 (0.61) | 0.22 (5.84) | 18 | 44.4
3519 | Internal Combustion Engines, NEC | 4 | 0.11 (0.04) | 0.12 (0.04) | 0.10 (1.50) | 5 | 0.0
3523 | Farm Machinery and Equipment | 8 | 0.09 (0.03) | 0.09 (0.04) | 0.09 (1.42) | 9 | 0.0
3531 | Construction Machinery and Equipment | 5 | 0.10 (0.06) | 0.08 (0.04) | 0.09 (1.60) | 10 | 0.0
353 | Construction, Mining, and Materials Handling | 5 | 0.23 (0.23) | 0.14 (0.10) | 0.17 (2.03) | 5 | 20.0
355 | Special Industry Machinery, Except Metalworking | 9 | 0.84 (1.05) | 0.25 (1.06) | 0.33 (4.60) | 10 | 50.0
3562 | Ball and Roller Bearings | 5 | 0.22 (0.12) | 0.22 (0.10) | 0.19 (1.67) | 5 | 60.0
356 | General Industrial Machinery And Equipment | 7 | 0.12 (0.06) | 0.12 (0.08) | 0.11 (1.63) | 8 | 12.5
3589 | Service Industry Machinery, NEC | 7 | 1.17 (0.97) | 1.38 (1.59) | 0.72 (3.00) | 8 | 75.0
358 | Refrigeration And Service Industry Machinery | 2 | 0.79 (1.00) | 0.79 (0.71) | 0.36 (4.17) | 5 | 20.0
35 | Industrial and Commercial Machinery and Computer Equipment | 6 | 0.64 (0.47) | 0.54 (0.57) | 0.48 (2.30) | 7 | 71.4
3612 | Power, Distribution, and Specialty Transformers | 13 | 0.46 (0.36) | 0.30 (0.39) | 0.35 (2.18) | 15 | 60.0
3624 | Carbon and Graphite Products | 78 | 0.36 (0.51) | 0.19 (0.32) | 0.19 (3.00) | 89 | 41.6
3669 | Communications Equipment, NEC | 1 | 0.74 (0.00) | 0.74 (0.00) | 0.74 (1.00) | 7 | 14.3
369 | Miscellaneous Electrical Machinery, Equipment, and Supplies | 7 | 0.12 (0.04) | 0.12 (0.04) | 0.12 (1.44) | 10 | 0.0
36 | Electronic & Other Electrical Equip. and Components, Except Comp. Equip. | 3 | 0.48 (0.35) | 0.47 (0.35) | 0.38 (2.11) | 6 | 33.3
3714 | Motor Vehicle Parts and Accessories | 3 | 0.41 (0.20) | 0.47 (0.20) | 0.36 (1.66) | 10 | 20.0
3731 | Ship Building and Repairing | 14 | 0.42 (0.48) | 0.17 (0.71) | 0.18 (4.21) | 16 | 37.5
37 | Transport Equipment | 6 | 0.83 (0.71) | 0.58 (0.54) | 0.60 (2.37) | 11 | 45.5
3949 | Sporting and Athletic Goods, NEC | 3 | 0.24 (0.21) | 0.16 (0.20) | 0.18 (2.10) | 5 | 20.0
39 | Miscellaneous Manufacturing Industries | 8 | 0.34 (0.30) | 0.24 (0.22) | 0.25 (2.21) | 14 | 35.7
Division D | Manufacturing (Major Groups: 20-39) | 19 | 0.55 (0.95) | 0.15 (0.56) | 0.21 (3.85) | 22 | 36.4
4011 | Railroads, Line-Haul Operating | 1 | 0.02 (0.00) | 0.02 (0.00) | 0.02 (1.00) | 15 | 0.0
4111 | Local and Suburban Transit | 3 | 0.37 (0.63) | 0.00 (0.55) | 0.02 (16.17) | 7 | 14.3
4212 | Local Trucking Without Storage | 5 | 0.05 (0.06) | 0.03 (0.02) | 0.02 (3.19) | 12 | 0.0
4213 | Trucking, Except Local | 7 | 1.07 (1.50) | 0.24 (1.39) | 0.30 (7.09) | 8 | 75.0
42 | Motor Freight Transportation and Warehousing | 5 | 0.09 (0.02) | 0.08 (0.01) | 0.09 (1.24) | 8 | 0.0
45 | Transportation by Air | 2 | 0.08 (0.04) | 0.08 (0.03) | 0.07 (1.52) | 7 | 0.0
4911 | Electric Services | 3 | 0.35 (0.46) | 0.10 (0.41) | 0.18 (3.08) | 9 | 11.1
4953 | Refuse Systems | 3 | 0.20 (0.09) | 0.24 (0.08) | 0.18 (1.59) | 6 | 33.3
Division E | Transport, Comm, Electric, Gas, and Sanitary Services (Major Grps: 40-49) | 8 | 0.31 (0.36) | 0.16 (0.44) | 0.14 (3.94) | 11 | 36.4
508 | Machinery, Equipment, and Supplies | 6 | 0.95 (1.56) | 0.29 (0.39) | 0.44 (3.05) | 7 | 71.4
50 | Wholesale Trade-durable Goods | 2 | 0.02 (0.01) | 0.02 (0.01) | 0.01 (1.97) | 8 | 0.0
5169 | Chemicals and Allied Products, Not Elsewhere Classified | 10 | 0.08 (0.12) | 0.04 (0.03) | 0.05 (2.42) | 12 | 8.3
51 | Wholesale Trade-non-durable Goods | 2 | 0.51 (0.53) | 0.51 (0.38) | 0.35 (2.53) | 2 | 50.0
5812 | Eating Places | 7 | 0.82 (0.77) | 0.60 (1.35) | 0.40 (4.26) | 7 | 71.4
Division G | Retail Trade (Major Groups: 52-59) | 3 | 0.02 (0.04) | 0.00 (0.03) | 0.00 (14.47) | 5 | 0.0
6331 | Fire, Marine, and Casualty Insurance | 4 | 0.08 (0.10) | 0.04 (0.07) | 0.05 (2.63) | 5 | 20.0
7349 | Building Cleaning and Maintenance Services, NEC | 2 | 0.07 (0.04) | 0.07 (0.03) | 0.06 (1.58) | 5 | 0.0
73 | Business Services | 2 | 0.08 (0.03) | 0.08 (0.02) | 0.08 (1.29) | 5 | 0.0
753 | Automotive Repair Shops | 5 | 0.32 (0.36) | 0.17 (0.01) | 0.23 (2.10) | 10 | 10.0
75 | Automotive Repair, Services, and Parking | 3 | 0.06 (0.02) | 0.05 (0.01) | 0.06 (1.25) | 6 | 0.0
76 | Miscellaneous Repair Services | 3 | 0.22 (0.25) | 0.08 (0.22) | 0.14 (2.49) | 8 | 12.5
80 | Health Services | 0 | NA | NA | NA | 5 | 0.0
Division I | Services (Major Groups: 70-89) | 0 | NA | NA | NA | 5 | 0.0
9224 | Fire Protection | 0 | NA | NA | NA | 6 | 0.0
9311 | Public Finance, Taxation, and Monetary Policy | 2 | 0.15 (0.17) | 0.15 (0.12) | 0.08 (3.31) | 11 | 9.1
9431 | Administration of Public Health Programs | 0 | NA | NA | NA | 10 | 0.0
9651 | Regulation, Licensing, and Inspection of Miscellaneous Commercial Sectors | 2 | 0.12 (0.02) | 0.12 (0.02) | 0.12 (1.13) | 7 | 0.0
Division J | Public Administration (Major Groups: 91-99) | 3 | 0.02 (0.01) | 0.02 (0.00) | 0.02 (1.21) | 9 | 0.0

¹ TWA = 8-hour time-weighted average, with non-detects excluded
² EF = percentage of values above the OSHA PEL (0.2 mg·m⁻³)

Supplementary Table A2: Descriptive Statistics of Concentrations of Coal Tar Pitch Volatiles (mg·m⁻³) in OSHA data by regrouped NAICS Codes

NAICS | Label | TWA N¹ | TWA Mean (SD)¹ | TWA Median (IQR)¹ | TWA Geo Mean (Geo SD)¹ | All Data N² | All Data EF²
21 | Mining, Quarrying, and Oil and Gas Extraction | 2 | 0.60 (0.29) | 0.60 (0.20) | 0.56 (1.42) | 2 | 100.0
21111 | Oil and Gas Extraction | 2 | 0.10 (0.04) | 0.10 (0.03) | 0.09 (1.37) | 8 | 0.0
21232 | Sand, Gravel, Clay, and Ceramic and Refractory Minerals Mining and Quarrying | 8 | 0.50 (0.75) | 0.15 (0.44) | 0.19 (3.94) | 8 | 37.5
22 | Utilities | NA | NA | NA | NA | 1 | 0.0
22111 | Electric Power Generation | 4 | 0.31 (0.38) | 0.15 (0.29) | 0.19 (2.65) | 10 | 20.0
2362 | Nonresidential Building Construction | 6 | 0.25 (0.21) | 0.15 (0.29) | 0.19 (2.14) | 7 | 42.9
23711 | Water and Sewer Line and Related Structures Construction | 25 | 1.58 (2.28) | 0.76 (1.47) | 0.72 (3.89) | 29 | 72.4
23712 | Oil and Gas Pipeline and Related Structures Construction | 8 | 2.67 (2.74) | 2.07 (1.29) | 1.78 (2.59) | 9 | 88.9
23731 | Highway, Street, and Bridge Construction | 26 | 0.12 (0.10) | 0.08 (0.10) | 0.09 (2.01) | 37 | 13.5
23799 | Other Heavy and Civil Engineering Construction | 4 | 0.36 (0.55) | 0.11 (0.34) | 0.15 (3.60) | 7 | 14.3
238 | Specialty Trade Contractors | 6 | 0.11 (0.06) | 0.11 (0.05) | 0.10 (1.59) | 7 | 14.3
2381 | Foundation, Structure, and Building Exterior Contractors | 8 | 0.17 (0.10) | 0.12 (0.16) | 0.14 (1.76) | 9 | 33.3
23816 | Roofing Contractors | 192 | 0.77 (1.64) | 0.26 (0.53) | 0.26 (4.61) | 245 | 44.5
2382 | Building Equipment Contractors | 5 | 0.13 (0.12) | 0.08 (0.22) | 0.08 (3.11) | 6 | 33.3
23822 | Plumbing, Heating, and Air-Conditioning Contractors | 10 | 1.15 (1.56) | 0.49 (0.84) | 0.51 (3.64) | 10 | 80.0
23832 | Painting and Wall Covering Contractors | 9 | 0.34 (0.75) | 0.08 (0.07) | 0.09 (4.14) | 12 | 16.7
23839 | Other Building Finishing Contractors | 8 | 6.43 (11.97) | 0.30 (4.74) | 0.70 (7.96) | 9 | 55.6
23891 | Site Preparation Contractors | 6 | 0.50 (0.36) | 0.40 (0.53) | 0.38 (2.19) | 12 | 41.7
31_33 | Manufacturing | 26 | 0.57 (0.82) | 0.31 (0.58) | 0.26 (3.90) | 40 | 42.5
311 | Food Manufacturing | 16 | 0.59 (0.63) | 0.44 (0.40) | 0.37 (2.83) | 23 | 56.5
313 | Textile Mills | 5 | 0.28 (0.30) | 0.08 (0.45) | 0.16 (2.94) | 7 | 28.6
314 | Textile Product Mills | 4 | 0.46 (0.44) | 0.38 (0.63) | 0.28 (2.89) | 4 | 50.0
31499 | All Other Textile Product Mills | 1 | 0.16 (0.00) | 0.16 (0.00) | 0.16 (1.00) | 9 | 0.0
32111 | Sawmills and Wood Preservation | 20 | 0.22 (0.33) | 0.10 (0.13) | 0.10 (3.24) | 43 | 11.6
32121 | Veneer, Plywood, and Engineered Wood Product Manufacturing | 5 | 0.30 (0.27) | 0.23 (0.40) | 0.16 (4.09) | 5 | 60.0
3231 | Printing and Related Support Activities | 4 | 0.16 (0.06) | 0.18 (0.08) | 0.15 (1.49) | 4 | 50.0
32311 | Printing | 22 | 0.30 (0.48) | 0.12 (0.28) | 0.12 (3.82) | 23 | 26.1
32411 | Petroleum Refineries | 17 | 0.09 (0.11) | 0.06 (0.06) | 0.06 (2.74) | 21 | 9.5
32412 | Asphalt Paving, Roofing, and Saturated Materials Manufacturing | 53 | 0.33 (0.49) | 0.18 (0.36) | 0.14 (5.01) | 63 | 39.7
32419 | Other Petroleum and Coal Products Manufacturing | 15 | 0.40 (0.61) | 0.21 (0.35) | 0.22 (2.82) | 22 | 36.4
32511 | Petrochemical Manufacturing | 8 | 0.45 (0.99) | 0.09 (0.10) | 0.13 (3.51) | 11 | 9.1
32518 | Other Basic Inorganic Chemical Manufacturing | 20 | 0.26 (0.27) | 0.16 (0.29) | 0.14 (3.39) | 21 | 42.9
32519 | Other Basic Organic Chemical Manufacturing | 5 | 0.49 (0.47) | 0.31 (0.50) | 0.30 (2.87) | 14 | 21.4
32521 | Resin and Synthetic Rubber Manufacturing | 7 | 0.64 (0.81) | 0.34 (0.38) | 0.39 (2.53) | 7 | 71.4
32522 | Artificial and Synthetic Fibers and Filaments Manufacturing | 7 | 0.68 (0.25) | 0.74 (0.23) | 0.63 (1.60) | 7 | 100.0
32552 | Adhesive Manufacturing | 3 | 0.26 (0.02) | 0.27 (0.02) | 0.26 (1.06) | 10 | 30.0
32599 | All Other Chemical Product and Preparation Manufacturing | 2 | 0.43 (0.45) | 0.43 (0.32) | 0.29 (2.57) | 7 | 14.3
3261 | Plastics Product Manufacturing | 4 | 0.53 (0.69) | 0.25 (0.48) | 0.29 (3.04) | 8 | 25.0
32621 | Tire Manufacturing | 62 | 0.55 (0.68) | 0.28 (0.56) | 0.27 (3.64) | 69 | 49.3
32629 | Other Rubber Product Manufacturing | 41 | 0.76 (0.83) | 0.43 (0.79) | 0.42 (3.23) | 48 | 64.6
32712 | Clay Building Material and Refractories Manufacturing | 31 | 0.29 (0.31) | 0.15 (0.31) | 0.18 (2.59) | 35 | 37.1
32721 | Glass and Glass Product Manufacturing | 5 | 1.01 (1.16) | 0.35 (2.11) | 0.37 (5.10) | 6 | 50.0
32733 | Concrete Pipe, Brick, and Block Manufacturing | 4 | 1.13 (1.26) | 1.08 (2.07) | 0.26 (9.82) | 9 | 22.2
32791 | Abrasive Product Manufacturing | 3 | 0.33 (0.24) | 0.33 (0.24) | 0.26 (2.17) | 11 | 18.2
32799 | All Other Nonmetallic Mineral Product Manufacturing | 2 | 0.06 (0.04) | 0.06 (0.03) | 0.05 (1.67) | 6 | 0.0
33111 | Iron and Steel Mills and Ferroalloy Manufacturing | 52 | 0.21 (0.37) | 0.11 (0.10) | 0.11 (3.02) | 63 | 19.0
33121 | Iron and Steel Pipe and Tube Manufacturing from Purchased Steel | 1 | 0.12 (0.00) | 0.12 (0.00) | 0.12 (1.00) | 8 | 0.0
33122 | Rolling and Drawing of Purchased Steel | 22 | 0.27 (0.24) | 0.21 (0.22) | 0.18 (2.61) | 26 | 42.3
33131 | Alumina and Aluminum Production and Processing | 245 | 0.27 (0.41) | 0.14 (0.22) | 0.14 (2.98) | 304 | 28.3
33142 | Copper Rolling, Drawing, Extruding, and Alloying | 8 | 0.22 (0.18) | 0.14 (0.16) | 0.17 (1.96) | 8 | 25.0
33151 | Ferrous Metal Foundries | 290 | 0.39 (0.65) | 0.16 (0.27) | 0.18 (3.25) | 348 | 35.6
33152 | Nonferrous Metal Foundries | 72 | 0.26 (0.29) | 0.15 (0.23) | 0.14 (3.32) | 97 | 32.0
332 | Fabricated Metal Product Manufacturing | 4 | 0.30 (0.49) | 0.06 (0.26) | 0.10 (3.93) | 6 | 16.7
33211 | Forging and Stamping | 31 | 0.63 (1.35) | 0.26 (0.40) | 0.23 (4.19) | 48 | 39.6
33221 | Cutlery and Handtool Manufacturing | 1 | 1.10 (0.00) | 1.10 (0.00) | 1.10 (1.00) | 9 | 11.1
33231 | Plate Work and Fabricated Structural Product Manufacturing | 16 | 0.33 (0.27) | 0.26 (0.28) | 0.22 (2.67) | 18 | 55.6
33232 | Ornamental and Architectural Metal Products Manufacturing | 5 | 0.25 (0.22) | 0.13 (0.31) | 0.15 (3.02) | 7 | 28.6
33242 | Metal Tank (Heavy Gauge) Manufacturing | 3 | 15.34 (26.09) | 0.40 (22.66) | 1.40 (12.12) | 6 | 33.3
33261 | Spring and Wire Product Manufacturing | 9 | 0.33 (0.23) | 0.30 (0.22) | 0.25 (2.22) | 10 | 60.0
33281 | Coating, Engraving, Heat Treating, and Allied Activities | 70 | 1.40 (5.45) | 0.18 (0.32) | 0.21 (5.32) | 87 | 32.2
33299 | All Other Fabricated Metal Product Manufacturing | 37 | 1.07 (1.89) | 0.23 (1.29) | 0.34 (4.81) | 42 | 52.4
33311 | Agricultural Implement Manufacturing | 7 | 0.09 (0.04) | 0.09 (0.05) | 0.09 (1.45) | 8 | 0.0
33312 | Construction Machinery Manufacturing | 4 | 0.10 (0.07) | 0.08 (0.07) | 0.09 (1.69) | 9 | 0.0
33313 | Mining and Oil and Gas Field Machinery Manufacturing | 5 | 0.23 (0.23) | 0.14 (0.10) | 0.17 (2.03) | 5 | 20.0
33329 | Other Industrial Machinery Manufacturing | 7 | 1.04 (1.12) | 0.65 (1.54) | 0.41 (5.31) | 7 | 71.4
33331 | Commercial and Service Industry Machinery Manufacturing | 9 | 0.94 (0.95) | 0.23 (1.46) | 0.51 (3.20) | 13 | 46.2
33341 | Vent., Heating, AC, and Commercial Refrigeration Equipment Manufacturing | 6 | 0.37 (0.56) | 0.15 (0.08) | 0.20 (2.58) | 8 | 25.0
33361 | Engine, Turbine, and Power Transmission Equipment Manufacturing | 4 | 0.11 (0.04) | 0.12 (0.04) | 0.10 (1.50) | 6 | 0.0
3339 | Other General Purpose Machinery Manufacturing | 6 | 0.53 (0.82) | 0.12 (0.49) | 0.19 (3.98) | 6 | 33.3
33429 | Other Communications Equipment Manufacturing | 1 | 0.74 (0.00) | 0.74 (0.00) | 0.74 (1.00) | 7 | 14.3
33531 | Electrical Equipment Manufacturing | 16 | 0.47 (0.34) | 0.39 (0.45) | 0.35 (2.17) | 18 | 61.1
33591 | Battery Manufacturing | 3 | 0.11 (0.06) | 0.11 (0.06) | 0.10 (1.62) | 6 | 0.0
33599 | All Other Electrical Equipment and Component Manufacturing | 74 | 0.33 (0.46) | 0.18 (0.31) | 0.19 (2.82) | 85 | 40.0
33621 | Motor Vehicle Body and Trailer Manufacturing | 1 | 0.18 (0.00) | 0.18 (0.00) | 0.18 (1.00) | 7 | 0.0
3363 | Motor Vehicle Parts Manufacturing | 9 | 0.60 (0.48) | 0.57 (0.85) | 0.35 (3.65) | 10 | 60.0
33636 | Motor Vehicle Seating and Interior Trim Manufacturing | 5 | 0.12 (0.06) | 0.09 (0.08) | 0.11 (1.53) | 5 | 20.0
33637 | Motor Vehicle Metal Stamping | 7 | 0.26 (0.17) | 0.20 (0.26) | 0.21 (2.13) | 8 | 37.5
33641 | Aerospace Product and Parts Manufacturing | 3 | 0.57 (0.49) | 0.48 (0.49) | 0.41 (2.41) | 5 | 40.0
33661 | Ship and Boat Building | 15 | 0.42 (0.47) | 0.17 (0.68) | 0.19 (4.09) | 17 | 41.2
337 | Furniture and Related Product Manufacturing | 4 | 0.13 (0.15) | 0.07 (0.08) | 0.08 (2.47) | 6 | 16.7
3399 | Other Miscellaneous Manufacturing | 4 | 0.35 (0.34) | 0.21 (0.20) | 0.25 (2.11) | 7 | 28.6
33992 | Sporting and Athletic Goods Manufacturing | 3 | 0.24 (0.21) | 0.16 (0.20) | 0.18 (2.10) | 5 | 20.0
33999 | All Other Miscellaneous Manufacturing | 10 | 0.40 (0.28) | 0.28 (0.40) | 0.29 (2.58) | 15 | 53.3
423 | Merchant Wholesalers, Durable Goods | 3 | 1.38 (2.36) | 0.02 (2.05) | 0.08 (16.42) | 10 | 10.0
42383 | Industrial Machinery and Equipment Merchant Wholesalers | 5 | 0.32 (0.22) | 0.25 (0.13) | 0.28 (1.70) | 5 | 80.0
42469 | Other Chemical and Allied Products Merchant Wholesalers | 10 | 0.08 (0.12) | 0.04 (0.03) | 0.05 (2.42) | 12 | 8.3
44_45 | Retail Trade | 5 | 0.22 (0.38) | 0.07 (0.14) | 0.01 (30.58) | 7 | 14.3
48_49 | Transportation and Warehousing | 8 | 0.32 (0.34) | 0.19 (0.36) | 0.19 (2.84) | 14 | 28.6
48211 | Rail Transportation | 1 | 0.02 (0.00) | 0.02 (0.00) | 0.02 (1.00) | 16 | 0.0
48411 | General Freight Trucking, Local | 5 | 0.05 (0.06) | 0.03 (0.02) | 0.02 (3.19) | 12 | 0.0
48412 | General Freight Trucking, Long-Distance | 5 | 1.39 (1.72) | 0.23 (2.68) | 0.32 (10.12) | 5 | 80.0
48511 | Urban Transit Systems | 3 | 0.37 (0.63) | 0.00 (0.55) | 0.02 (16.17) | 7 | 14.3
488 | Support Activities for Transportation | 5 | 0.14 (0.14) | 0.09 (0.03) | 0.10 (2.32) | 9 | 11.1
51 | Information | 3 | 0.06 (0.04) | 0.05 (0.04) | 0.05 (1.86) | 3 | 0.0
52412 | Direct Insurance (except Life, Health, and Medical) Carriers | 4 | 0.08 (0.10) | 0.04 (0.07) | 0.05 (2.63) | 5 | 20.0
532 | Rental and Leasing Services | 3 | 0.06 (0.02) | 0.05 (0.01) | 0.06 (1.25) | 5 | 0.0
56 | Administrative and Support and Waste Management and Remediation Services | 3 | 0.20 (0.09) | 0.24 (0.08) | 0.18 (1.59) | 6 | 33.3
561 | Administrative and Support Services | 2 | 0.07 (0.04) | 0.07 (0.03) | 0.06 (1.58) | 7 | 0.0
61 | Educational Services | NA | NA | NA | NA | 2 | 0.0
62 | Health Care and Social Assistance | NA | NA | NA | NA | 5 | 0.0
72 | Accommodation and Food Services | 2 | 0.95 (0.49) | 0.95 (0.35) | 0.88 (1.47) | 3 | 66.7
72211 | Full-Service Restaurants | 5 | 0.77 (0.90) | 0.20 (1.60) | 0.29 (4.89) | 5 | 60.0
81 | Other Services (except Public Administration) | NA | NA | NA | NA | 3 | 0.0
811 | Repair and Maintenance | 1 | 0.07 (0.00) | 0.07 (0.00) | 0.07 (1.00) | 6 | 0.0
81111 | Automotive Mechanical and Electrical Repair and Maintenance | 3 | 0.16 (0.03) | 0.17 (0.03) | 0.15 (1.20) | 8 | 0.0
92 | Public Administration | 3 | 0.02 (0.01) | 0.02 (0.00) | 0.02 (1.21) | 9 | 0.0
92113 | Public Finance Activities | 2 | 0.15 (0.17) | 0.15 (0.12) | 0.08 (3.31) | 11 | 9.1
92216 | Fire Protection | NA | NA | NA | NA | 6 | 0.0
92312 | Administration of Public Health Programs | NA | NA | NA | NA | 10 | 0.0
92615 | Regulation, Licensing, and Inspection of Miscellaneous Commercial Sectors | 2 | 0.12 (0.02) | 0.12 (0.02) | 0.12 (1.13) | 7 | 0.0

¹ TWA = 8-hour time-weighted average, with non-detects excluded
² EF = percentage of values above the OSHA PEL (0.200 mg·m⁻³)

Supplementary Table A3: Regrouped SIC coefficients and Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles in 1980 from IMIS databank

SIC | Label | b | SE(b) | Predicted probability | SE(predicted probability)
13³ | Oil and Gas Extraction | ----- | ----- | 1.00 | 0.00
1611 | Highway and Street Construction, Except Elevated Highways | -1.91 | 0.89 | 0.05 | 0.04
1622 | Bridge, Tunnel, and Elevated Highway Construction | -0.63 | 1.33 | 0.16 | 0.18
1623 | Water, Sewer, Pipeline, and Communications and Power Line Construction | 3.83 | 1.02 | 0.94 | 0.05
1629 | Heavy Construction, Not Elsewhere Classified | -2.44 | 1.50 | 0.03 | 0.05
17 | Construction Special Trade Contractors | -1.28 | 1.54 | 0.09 | 0.13
1711 | Plumbing, Heating, and Air-Conditioning | 3.35 | 1.48 | 0.91 | 0.12
1721 | Painting and Paper Hanging | -1.19 | 1.06 | 0.10 | 0.10
174 | Masonry, Stonework, Tile Setting, and Plastering | 0.54 | 1.20 | 0.38 | 0.29
1761 | Roofing, Siding, and Sheet Metal Work | 0.71 | 0.27 | 0.42 | 0.06
1771 | Concrete Work | -1.54 | 1.59 | 0.07 | 0.11
1795 | Wrecking and Demolition | 0.45 | 1.10 | 0.36 | 0.26
1799 | Special Trade Contractors, NEC | 1.92 | 0.86 | 0.71 | 0.18
D | Manufacturing (Major Groups: 20-39) | 0.07 | 0.66 | 0.28 | 0.13
20 | Food and Kindred Products | 0.40 | 0.96 | 0.35 | 0.22
2011 | Meat Packing Plants | 1.95 | 0.97 | 0.72 | 0.20
2299⁴ | Textile Goods, NEC | ----- | ----- | 0.00 | 0.00
2396 | Automotive Trimmings, Apparel Findings, and Related Products | -0.55 | 1.87 | 0.17 | 0.27
2399⁴ | Fabricated Textile Products, NEC | ----- | ----- | 0.00 | 0.00
24⁴ | Lumber and Wood Products, Except Furniture | ----- | ----- | 0.00 | 0.00
2421 | Sawmills and Planing Mills, General | 0.20 | 1.19 | 0.31 | 0.25
2491 | Wood Preserving | -3.40 | 1.45 | 0.01 | 0.02
2493 | Reconstituted Wood Products | 0.93 | 1.37 | 0.48 | 0.34
25 | Furniture and Fixtures | -1.28 | 1.53 | 0.09 | 0.13
2759 | Commercial Printing, NEC | 0.00 | 0.00 | 0.15 | 0.17
28 | Chemicals and Allied Products | -0.59 | 0.83 | 0.17 | 0.12
2819 | Industrial Inorganic Chemicals, NEC | -1.24 | 1.04 | 0.09 | 0.09
2824³ | Manmade Organic Fibers, Except Cellulosic | ----- | ----- | 1.00 | 0.00
2865 | Cyclic Organic Crudes and Intermediates, and Organic Dyes and Pigments | -0.80 | 0.85 | 0.14 | 0.10
2891 | Adhesives and Sealants | 0.06 | 1.19 | 0.28 | 0.24
2895 | Carbon Black | 0.34 | 1.07 | 0.34 | 0.24
2911 | Petroleum Refining | -2.02 | 1.18 | 0.05 | 0.05
2951 | Asphalt Paving Mixtures and Blocks | -0.82 | 0.88 | 0.14 | 0.10
2952 | Asphalt Felts and Coatings | 0.10 | 0.55 | 0.29 | 0.11
2999 | Products of Petroleum and Coal, Not Elsewhere Classified | 0.76 | 0.94 | 0.44 | 0.23
30³ | Rubber and Miscellaneous Plastics Products | ----- | ----- | 1.00 | 0.00
3011 | Tires and Inner Tubes | 0.87 | 0.49 | 0.46 | 0.12
3053 | Gaskets, Packing, and Sealing Devices | 1.33 | 1.14 | 0.58 | 0.28
3069 | Fabricated Rubber Products, Not Elsewhere Classified | 1.62 | 0.50 | 0.65 | 0.12
308⁴ | Miscellaneous Plastics Products | ----- | ----- | 0.00 | 0.00
322 | Glass and Glassware, Pressed or Blown | 1.73 | 1.40 | 0.67 | 0.31
3255 | Clay Refractories | -0.48 | 1.13 | 0.18 | 0.17
3272 | Concrete Products, Except Block and Brick | 0.18 | 0.92 | 0.30 | 0.19
3291 | Abrasive Products | -1.05 | 1.68 | 0.11 | 0.17
3292 | Asbestos Products | 0.25 | 1.20 | 0.32 | 0.26
3296⁴ | Mineral Wool | ----- | ----- | 0.00 | 0.00
3297 | Nonclay Refractories | -0.28 | 0.83 | 0.22 | 0.14
33 | Primary Metal Industries | 1.69 | 1.25 | 0.66 | 0.28
3312 | Steel Works, Blast Furnaces (Including Coke Ovens), and Rolling Mills | -0.75 | 0.54 | 0.15 | 0.07
3313⁴ | Electrometallurgical Products, Except Steel | ----- | ----- | 0.00 | 0.00
3317⁴ | Steel Pipe and Tubes | ----- | ----- | 0.00 | 0.00
3321 | Gray and Ductile Iron Foundries | 0.02 | 0.25 | 0.27 | 0.05
3322 | Malleable Iron Foundries | -2.05 | 1.45 | 0.04 | 0.06
3324⁴ | Steel Investment Foundries | ----- | ----- | 0.00 | 0.00
3325 | Steel Foundries, Not Elsewhere Classified | -1.77 | 0.95 | 0.06 | 0.05
3334 | Primary Production of Aluminum | -0.56 | 0.40 | 0.17 | 0.06
3341 | Secondary Smelting and Refining of Nonferrous Metals | -2.31 | 1.51 | 0.03 | 0.05
3351 | Rolling, Drawing, and Extruding of Copper | 0.23 | 1.71 | 0.31 | 0.37
3353 | Aluminum Sheet, Plate, and Foil | 0.78 | 1.00 | 0.44 | 0.25
3355 | Aluminum Rolling and Drawing, Not Elsewhere Classified | 0.21 | 1.07 | 0.31 | 0.23
3365 | Aluminum Foundries | -0.02 | 0.60 | 0.26 | 0.12
3366 | Copper Foundries | -0.03 | 0.56 | 0.26 | 0.11
3369⁴ | Nonferrous Foundries, Except Aluminum and Copper | ----- | ----- | 0.00 | 0.00
3398 | Metal Heat Treating | 0.33 | 0.88 | 0.34 | 0.20
34 | Fabricated Metal Products, Except Machinery and Transportation Equipment | 0.58 | 0.72 | 0.39 | 0.17
3423 | Hand and Edge Tools, Except Machine Tools and Handsaws | -1.51 | 1.47 | 0.07 | 0.10
3441 | Fabricated Structural Metal | 2.00 | 1.15 | 0.73 | 0.23
3443 | Fabricated Plate Work (Boiler Shops) | -0.16 | 0.95 | 0.24 | 0.17
3444 | Sheet Metal Work | 0.32 | 1.37 | 0.33 | 0.31
3462 | Iron and Steel Forgings | 0.36 | 0.67 | 0.34 | 0.15
3463 | Nonferrous Forgings | -0.39 | 1.70 | 0.20 | 0.27
3465 | Automotive Stampings | -0.13 | 1.20 | 0.24 | 0.22
3479 | Coating, Engraving, and Allied Services, Not Elsewhere Classified | -0.72 | 0.51 | 0.15 | 0.06
3483 | Ammunition, Except for Small Arms | 0.15 | 1.74 | 0.30 | 0.37
349 | Miscellaneous Fabricated Metal Products | 0.98 | 0.88 | 0.49 | 0.22
3496 | Miscellaneous Fabricated Wire Products | 2.50 | 1.56 | 0.81 | 0.24
3498 | Fabricated Pipe and Pipe Fittings | 1.04 | 0.72 | 0.51 | 0.18
35 | Industrial and Commercial Machinery and Computer Equipment | 1.89 | 1.27 | 0.71 | 0.27
3519⁴ | Internal Combustion Engines, NEC | ----- | ----- | 0.00 | 0.00
3523⁴ | Farm Machinery and Equipment | ----- | ----- | 0.00 | 0.00
353 | Construction, Mining, and Materials Handling | -0.60 | 1.55 | 0.17 | 0.22
3531⁴ | Construction Machinery and Equipment | ----- | ----- | 0.00 | 0.00
355 | Special Industry Machinery, Except Metalworking | 0.93 | 0.99 | 0.48 | 0.25
356 | General Industrial Machinery And Equipment | -1.70 | 1.57 | 0.06 | 0.09
3562 | Ball and Roller Bearings | 1.82 | 1.47 | 0.69 | 0.32
358 | Refrigeration And Service Industry Machinery | -0.85 | 1.56 | 0.13 | 0.18
3589 | Service Industry Machinery, NEC | 2.34 | 1.20 | 0.79 | 0.20
36 | Electronic and Other Electrical (Except Computer) Equipment and Components | -0.09 | 1.21 | 0.25 | 0.23
3612 | Power, Distribution, and Specialty Transformers | 1.78 | 0.92 | 0.68 | 0.20
3624 | Carbon and Graphite Products | 0.47 | 0.47 | 0.37 | 0.11
3669 | Communications Equipment, NEC | -1.64 | 1.55 | 0.07 | 0.10
369⁴ | Miscellaneous Electrical Machinery, Equipment, and Supplies | ----- | ----- | 0.00 | 0.00
37 | Transport Equipment | 0.51 | 0.87 | 0.38 | 0.20
3714 | Motor Vehicle Parts and Accessories | -0.80 | 1.23 | 0.14 | 0.15
3731 | Ship Building and Repairing | 0.52 | 0.77 | 0.38 | 0.18
39 | Miscellaneous Manufacturing Industries | -0.28 | 0.92 | 0.22 | 0.16
3949 | Sporting and Athletic Goods, NEC | -0.80 | 1.66 | 0.14 | 0.20
E | Transport, Comm, Electric, Gas, and Sanitary Services (Major Grps: 40-49) | 0.12 | 0.92 | 0.29 | 0.19
4011⁴ | Railroads, Line-Haul Operating | ----- | ----- | 0.00 | 0.00
4111 | Local and Suburban Transit | -1.17 | 1.48 | 0.10 | 0.14
42⁴ | Motor Freight Transportation and Warehousing | ----- | ----- | 0.00 | 0.00
4212⁴ | Local Trucking Without Storage | ----- | ----- | 0.00
0.00 4213 Trucking, Except Local 2.32 1.17 0.79 0.20 454 Transportation by Air ----- ----- 0.00 0.00 4911 Electric Services -1.81 1.59 0.06 0.09 4953 Refuse Systems -0.25 1.36 0.22 0.24 504 Wholesale Trade-durable Goods ----- ----- 0.00 0.00 508 Machinery, Equipment, and Supplies 2.15 1.24 0.76 0.23 51 Wholesale Trade-non-durable Goods 1.04 2.04 0.50 0.51 5169 Chemicals and Allied Products, Not Elsewhere Classified -1.92 1.75 0.05 0.08 G4 Retail Trade (Major Groups: 52-59) ----- ----- 0.00 0.00 5812 Eating Places 2.55 1.35 0.82 0.20 6331 Fire, Marine, and Casualty Insurance -0.99 1.87 0.12 0.20 I4 Services (Major Groups: 70-89) ----- ----- 0.00 0.00 734 Business Services ----- ----- 0.00 0.00 73494 Building Cleaning and Maintenance Services, NEC ----- ----- 0.00 0.00 754 Automotive Repair, Services, and Parking ----- ----- 0.00 0.00 753 Automotive Repair Shops -1.78 1.45 0.06 0.08 76 Miscellaneous Repair Services -1.66 1.50 0.06 0.09 804 Health Services ----- ----- 0.00 0.00   158 SIC Label b SE(b)   SE(  ) J4 Public Administration (Major Groups: 91-99) ----- ----- 0.00 0.00 92244 Fire Protection ----- ----- 0.00 0.00 9311 Public Finance, Taxation, and Monetary Policy -2.24 1.50 0.04 0.05 94314 Administration of Public Health Programs ----- ----- 0.00 0.00 96514 Regulation, Licensing, and Inspection of Miscellaneous Commercial Sectors ----- ----- 0.00 0.00 3 All measurements exceeded PEL (0.2 mg·m-3), inclusive. 4 None of the measurements exceeded PEL (0.200 mg·m-3), inclusive.      
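Because each row of the SIC table above reports both a coefficient b and a predicted probability, the two columns can be cross-checked against each other. The sketch below assumes a logistic model in which an industry's log-odds equal a shared baseline plus b. That structure is an assumption inferred from the column layout, not something stated in the table, and the function names are illustrative. Under that assumption, logit(probability) minus b should be roughly constant across rows, up to two-decimal rounding.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


# (SIC code, b, predicted probability) copied from rows of the table above.
rows = [
    ("1711", 3.35, 0.91),
    ("1761", 0.71, 0.42),
    ("1799", 1.92, 0.71),
    ("D", 0.07, 0.28),
    ("2011", 1.95, 0.72),
]


def implied_baselines(table_rows):
    """Baseline log-odds implied by each row, ASSUMING logit(p) = baseline + b."""
    return {code: logit(p) - b for code, b, p in table_rows}


baselines = implied_baselines(rows)
spread = max(baselines.values()) - min(baselines.values())

for code, value in baselines.items():
    print(f"SIC {code}: implied baseline log-odds = {value:.2f}")
print(f"spread across rows = {spread:.2f}")
```

Rows whose implied value departs sharply from the rest are candidates for checking against the original PDF.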
Supplementary Table A4: Regrouped NAICS coefficients and Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles in 1980 from IMIS databank

NAICS  Description  b  SE(b)  Predicted probability  SE(predicted probability)
21[3]  Mining, Quarrying, and Oil and Gas Extraction  -----  -----  1.00  0.00
21111[4]  Oil and Gas Extraction  -----  -----  0.00  0.00
21232  Sand, Gravel, Clay, and Ceramic and Refractory Minerals Mining and Quarrying  0.40  1.82  0.32  0.40
22[4]  Utilities  -----  -----  0.00  0.00
22111  Electric Power Generation  -0.58  1.27  0.15  0.16
2362  Nonresidential Building Construction  0.90  1.42  0.44  0.35
23711  Water and Sewer Line and Related Structures Construction  2.09  0.98  0.72  0.20
23712  Oil and Gas Pipeline and Related Structures Construction  3.38  1.58  0.90  0.14
23731  Highway, Street, and Bridge Construction  -1.81  0.85  0.05  0.04
23799  Other Heavy and Civil Engineering Construction  -1.94  1.71  0.04  0.07
238  Specialty Trade Contractors  -1.59  1.70  0.06  0.10
2381  Foundation, Structure, and Building Exterior Contractors  0.15  1.21  0.27  0.24
23816  Roofing Contractors  0.80  0.30  0.41  0.07
2382  Building Equipment Contractors  -0.14  1.59  0.22  0.27
23822  Plumbing, Heating, and Air-Conditioning Contractors  2.90  1.20  0.85  0.15
23832  Painting and Wall Covering Contractors  -1.20  1.13  0.09  0.09
23839  Other Building Finishing Contractors  1.25  1.13  0.53  0.28
23891  Site Preparation Contractors  0.30  1.00  0.30  0.21
31_33  Manufacturing  0.56  0.53  0.36  0.12
311  Food Manufacturing  1.24  0.73  0.52  0.18
313  Textile Mills  -0.37  1.46  0.18  0.22
314  Textile Product Mills  1.19  1.67  0.51  0.42
31499[4]  All Other Textile Product Mills  -----  -----  0.00  0.00
32111  Sawmills and Wood Preservation  -2.27  0.89  0.03  0.03
32121  Veneer, Plywood, and Engineered Wood Product Manufacturing  0.95  1.47  0.45  0.37
3231  Printing and Related Support Activities  0.72  1.94  0.39  0.47
32311  Printing  0.00  0.00  0.13  0.17
32411  Petroleum Refineries  -2.03  1.27  0.04  0.05
32412  Asphalt Paving, Roofing, and Saturated Materials Manufacturing  -0.19  0.51  0.21  0.08
32419  Other Petroleum and Coal Products Manufacturing  0.56  0.97  0.36  0.22
32511  Petrochemical Manufacturing  -1.94  1.79  0.04  0.08
32518  Other Basic Inorganic Chemical Manufacturing  0.94  0.92  0.45  0.23
32519  Other Basic Organic Chemical Manufacturing  -0.93  1.03  0.11  0.10
32521  Resin and Synthetic Rubber Manufacturing  2.17  1.42  0.74  0.28
32522[3]  Artificial and Synthetic Fibers and Filaments Manufacturing  -----  -----  1.00  0.00
32552  Adhesive Manufacturing  0.18  1.28  0.28  0.26
32599  All Other Chemical Product and Preparation Manufacturing  -1.53  1.66  0.06  0.10
3261  Plastics Product Manufacturing  -1.30  1.38  0.08  0.10
32621  Tire Manufacturing  0.94  0.52  0.45  0.13
32629  Other Rubber Product Manufacturing  1.67  0.58  0.63  0.14
32712  Clay Building Material and Refractories Manufacturing  -0.46  0.72  0.17  0.10
32721  Glass and Glass Product Manufacturing  1.12  1.27  0.49  0.32
32733  Concrete Pipe, Brick, and Block Manufacturing  -0.99  1.31  0.11  0.12
32791  Abrasive Product Manufacturing  -0.96  1.84  0.11  0.18
32799[4]  All Other Nonmetallic Mineral Product Manufacturing  -----  -----  0.00  0.00
33111  Iron and Steel Mills and Ferroalloy Manufacturing  -1.43  0.64  0.07  0.04
33121[4]  Iron and Steel Pipe and Tube Manufacturing from Purchased Steel  -----  -----  0.00  0.00
33122  Rolling and Drawing of Purchased Steel  -0.13  1.20  0.22  0.21
33131  Alumina and Aluminum Production and Processing  -0.34  0.38  0.18  0.06
33142  Copper Rolling, Drawing, Extruding, and Alloying  -0.65  1.60  0.14  0.20
33151  Ferrous Metal Foundries  -0.23  0.27  0.20  0.04
33152  Nonferrous Metal Foundries  -0.03  0.44  0.24  0.08
332  Fabricated Metal Product Manufacturing  -1.14  1.66  0.09  0.14
33211  Forging and Stamping  0.67  0.64  0.38  0.15
33221  Cutlery and Handtool Manufacturing  -1.57  1.55  0.06  0.09
33231  Plate Work and Fabricated Structural Product Manufacturing  0.74  0.87  0.40  0.21
33232  Ornamental and Architectural Metal Products Manufacturing  -0.53  1.36  0.16  0.18
33242  Metal Tank (Heavy Gauge) Manufacturing  0.09  1.25  0.26  0.24
33261  Spring and Wire Product Manufacturing  1.19  1.03  0.51  0.26
33281  Coating, Engraving, Heat Treating, and Allied Activities  -0.09  0.45  0.22  0.08
33299  All Other Fabricated Metal Product Manufacturing  1.06  0.58  0.48  0.15
33311[4]  Agricultural Implement Manufacturing  -----  -----  0.00  0.00
33312[4]  Construction Machinery Manufacturing  -----  -----  0.00  0.00
33313  Mining and Oil and Gas Field Machinery Manufacturing  -0.54  1.64  0.16  0.22
33329  Other Industrial Machinery Manufacturing  2.73  1.45  0.83  0.21
33331  Commercial and Service Industry Machinery Manufacturing  0.49  0.93  0.34  0.21
33341  Vent., Heating, AC, and Commercial Refrigeration Equipment Manufacturing  -0.22  1.34  0.20  0.22
33361[4]  Engine, Turbine, and Power Transmission Equipment Manufacturing  -----  -----  0.00  0.00
3339  Other General Purpose Machinery Manufacturing  -0.30  1.38  0.19  0.21
33429  Other Communications Equipment Manufacturing  -1.68  1.65  0.06  0.09
33531  Electrical Equipment Manufacturing  2.04  0.88  0.71  0.18
33591[4]  Battery Manufacturing  -----  -----  0.00  0.00
33599  All Other Electrical Equipment and Component Manufacturing  0.32  0.51  0.30  0.11
33621[4]  Motor Vehicle Body and Trailer Manufacturing  -----  -----  0.00  0.00
3363  Motor Vehicle Parts Manufacturing  1.03  1.05  0.47  0.26
33636  Motor Vehicle Seating and Interior Trim Manufacturing  -0.49  2.01  0.16  0.28
33637  Motor Vehicle Metal Stamping  -0.12  1.29  0.22  0.23
33641  Aerospace Product and Parts Manufacturing  -0.05  1.40  0.23  0.25
33661  Ship and Boat Building  0.92  0.79  0.44  0.19
337  Furniture and Related Product Manufacturing  -1.56  1.60  0.06  0.09
3399  Other Miscellaneous Manufacturing  -0.88  1.44  0.12  0.15
33992  Sporting and Athletic Goods Manufacturing  -0.85  1.77  0.12  0.19
33999  All Other Miscellaneous Manufacturing  0.92  0.92  0.44  0.23
423  Merchant Wholesalers, Durable Goods  -2.02  1.55  0.04  0.06
42383  Industrial Machinery and Equipment Merchant Wholesalers  3.21  1.83  0.89  0.18
42469  Other Chemical and Allied Products Merchant Wholesalers  -1.86  1.88  0.05  0.08
44_45  Retail Trade  -1.40  1.64  0.07  0.11
48_49  Transportation and Warehousing  -0.69  1.00  0.14  0.12
48211[4]  Rail Transportation  -----  -----  0.00  0.00
48411[4]  General Freight Trucking, Local  -----  -----  0.00  0.00
48412  General Freight Trucking, Long-Distance  2.95  1.63  0.86  0.20
48511  Urban Transit Systems  -1.20  1.57  0.09  0.13
488  Support Activities for Transportation  -1.85  1.63  0.05  0.08
51[4]  Information  -----  -----  0.00  0.00
52412  Direct Insurance (except Life, Health, and Medical) Carriers  -0.94  2.01  0.11  0.20
532[4]  Rental and Leasing Services  -----  -----  0.00  0.00
56  Administrative and Support and Waste Management and Remediation Services  -0.28  1.46  0.19  0.23
561[4]  Administrative and Support Services  -----  -----  0.00  0.00
61[4]  Educational Services  -----  -----  0.00  0.00
62[4]  Health Care and Social Assistance  -----  -----  0.00  0.00
72  Accommodation and Food Services  1.57  1.84  0.60  0.44
72211  Full-Service Restaurants  2.08  1.59  0.72  0.33
81[4]  Other Services (except Public Administration)  -----  -----  0.00  0.00
811[4]  Repair and Maintenance  -----  -----  0.00  0.00
81111[4]  Automotive Mechanical and Electrical Repair and Maintenance  -----  -----  0.00  0.00
92[4]  Public Administration  -----  -----  0.00  0.00
92113  Public Finance Activities  -2.31  1.59  0.03  0.05
92216[4]  Fire Protection  -----  -----  0.00  0.00
92312[4]  Administration of Public Health Programs  -----  -----  0.00  0.00
92615[4]  Regulation, Licensing, and Inspection of Miscellaneous Commercial Sectors  -----  -----  0.00  0.00
[3] All measurements exceeded PEL (0.2 mg·m⁻³), inclusive.
[4] None of the measurements exceeded PEL (0.200 mg·m⁻³), inclusive.
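The standard errors of the predicted probabilities in Table A4 can be related to SE(b) with a first-order (delta-method) approximation, SE(p) ≈ p(1 − p) · SE(b). The sketch below assumes that the probability's uncertainty comes from the industry coefficient alone, ignoring any uncertainty in the baseline term. This is an inference from the tabulated numbers, not a description of how the thesis computed them, and `delta_method_se` is an illustrative name.

```python
def delta_method_se(prob, se_b):
    """First-order SE of a probability, given the SE of its log-odds.

    ASSUMPTION: only the industry coefficient contributes uncertainty.
    """
    return prob * (1.0 - prob) * se_b


# (NAICS code, SE(b), predicted probability, reported SE) from Table A4.
rows = [
    ("23816", 0.30, 0.41, 0.07),
    ("23711", 0.98, 0.72, 0.20),
    ("23712", 1.58, 0.90, 0.14),
    ("23731", 0.85, 0.05, 0.04),
]

approx = {code: delta_method_se(p, se_b) for code, se_b, p, _ in rows}

for code, se_b, p, reported in rows:
    print(f"NAICS {code}: approx SE = {approx[code]:.3f}, reported SE = {reported:.2f}")
```

For these four rows the approximation agrees with the reported SE to two decimals, which suggests the columns are mutually consistent, although other rows may differ by rounding.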
Supplementary Table A5: Regrouped SIC coefficients and comparison of Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles in 1980 for Minimally Skilled Workers and Administrative Personnel

Covariate  Description  b  SE(b)  Minimally Skilled Workers: predicted probability  SE  Administrative Personnel: predicted probability  SE
CAT 0  Minimally Skilled Workers/Labourers  0.73  0.27  -----  -----  -----  -----
CAT 1  Skilled Workers/Operators  0.58  0.25  -----  -----  -----  -----
CAT 2  Supervisor/Foremen  -0.23  0.53  -----  -----  -----  -----
CAT 3  Administrative Personnel  -1.65  0.68  -----  -----  -----  -----
1611  Highway and Street Construction, Except Elevated Highways  -2.20  1.05  0.05  0.05  0.01  0.01
1622  Bridge, Tunnel, and Elevated Highway Construction  -0.91  1.84  0.17  0.26  0.02  0.04
1623  Water, Sewer, Pipeline, and Comm. and Power Line Construction  3.47  1.03  0.94  0.06  0.59  0.32
1629  Heavy Construction, Not Elsewhere Classified  -0.46  1.86  0.24  0.34  0.03  0.06
1711  Plumbing, Heating, and Air-Conditioning  2.67  1.51  0.88  0.16  0.40  0.42
1721  Painting and Paper Hanging  -1.58  1.44  0.09  0.12  0.01  0.02
174  Masonry, Stonework, Tile Setting, and Plastering  0.23  1.18  0.39  0.28  0.05  0.08
1761  Roofing, Siding, and Sheet Metal Work  0.65  0.32  0.49  0.08  0.08  0.07
1771  Concrete Work  -1.17  1.64  0.13  0.19  0.01  0.03
1795  Wrecking and Demolition  0.22  1.36  0.38  0.33  0.05  0.08
1799  Special Trade Contractors, NEC  2.08  0.99  0.80  0.16  0.27  0.25
D  Manufacturing (Major Groups: 20-39)  1.06  0.94  0.59  0.23  0.12  0.13
20  Food and Kindred Products  -0.73  1.30  0.19  0.21  0.02  0.03
2011  Meat Packing Plants  1.14  1.04  0.61  0.25  0.13  0.15
2396  Automotive Trimmings, Apparel Findings, and Related Products  -0.72  1.84  0.19  0.29  0.02  0.04
24[7]  Lumber and Wood Products, Except Furniture  -----  -----  0.00  0.00  -----  -----
2421  Sawmills and Planing Mills, General  1.82  1.49  0.75  0.28  0.22  0.25
2491  Wood Preserving  -3.11  1.45  0.02  0.03  0.00  0.00
2493  Reconstituted Wood Products  0.78  1.35  0.52  0.34  0.09  0.13
2759  Commercial Printing, NEC  0.00  0.00  0.30  0.32  0.04  0.06
28[7]  Chemicals and Allied Products  -----  -----  0.00  0.00  -----  -----
2819  Industrial Inorganic Chemicals, NEC  -2.01  1.26  0.06  0.07  0.01  0.01
2865  Cyclic Organic Crudes and Intermediates, Organic Dyes and Pigments  -1.14  0.97  0.14  0.12  0.01  0.02
2895[7]  Carbon Black  -----  -----  0.00  0.00  -----  -----
2911  Petroleum Refining  -2.61  1.57  0.04  0.05  0.00  0.01
2951  Asphalt Paving Mixtures and Blocks  0.17  1.01  0.37  0.24  0.05  0.06
2952[7]  Asphalt Felts and Coatings  -----  -----  0.00  0.00  -----  -----
2999  Products of Petroleum and Coal, Not Elsewhere Classified  0.61  0.92  0.48  0.23  0.08  0.09
30[7]  Rubber and Miscellaneous Plastics Products  -----  -----  0.00  0.00  -----  -----
3011  Tires and Inner Tubes  1.03  0.55  0.58  0.14  0.11  0.10
3069  Fabricated Rubber Products, Not Elsewhere Classified  1.01  0.82  0.58  0.21  0.11  0.12
32  Stone, Clay, Glass, And Concrete Products  -0.13  1.07  0.30  0.23  0.04  0.05
3272  Concrete Products, Except Block and Brick  -0.53  1.27  0.23  0.22  0.03  0.04
3297  Nonclay Refractories  2.08  1.67  0.80  0.27  0.27  0.35
33  Primary Metal Industries  0.93  1.34  0.56  0.34  0.10  0.15
331[7]  Steel Works, Blast Furnaces, And Rolling And Finishing Mills  -----  -----  0.00  0.00  -----  -----
3312  Steel Works, Blast Furnaces (Including Coke Ovens), and Rolling Mills  -1.33  0.71  0.12  0.07  0.01  0.01
3321  Gray and Ductile Iron Foundries  0.65  0.39  0.49  0.10  0.08  0.07
3325  Steel Foundries, Not Elsewhere Classified  -1.91  1.38  0.07  0.09  0.01  0.01
3334  Primary Production of Aluminum  -0.73  0.48  0.19  0.08  0.02  0.02
3341  Secondary Smelting and Refining of Nonferrous Metals  -1.89  1.55  0.07  0.10  0.01  0.01
3351  Rolling, Drawing, and Extruding of Copper  0.04  1.68  0.34  0.38  0.05  0.08
3353  Aluminum Sheet, Plate, and Foil  0.48  1.29  0.45  0.33  0.07  0.10
3355  Aluminum Rolling and Drawing, Not Elsewhere Classified  0.03  1.05  0.34  0.24  0.04  0.06
3365  Aluminum Foundries  -0.45  0.85  0.24  0.16  0.03  0.03
3366  Copper Foundries  0.27  0.68  0.39  0.17  0.06  0.06
3398  Metal Heat Treating  -0.41  0.98  0.25  0.19  0.03  0.04
34  Fabricated Metal Products, Except Machinery & Transportation Equip  0.53  0.86  0.46  0.22  0.07  0.08
3423  Hand and Edge Tools, Except Machine Tools and Handsaws  -1.44  1.47  0.11  0.14  0.01  0.02
3443[4]  Fabricated Plate Work (Boiler Shops)  -----  -----  0.00  0.00  0.00  0.00
3444  Sheet Metal Work  0.13  1.34  0.36  0.32  0.05  0.08
3462  Iron and Steel Forgings  -0.44  0.75  0.24  0.14  0.03  0.03
3479  Coating, Engraving, and Allied Services, Not Elsewhere Classified  -0.72  0.64  0.19  0.10  0.02  0.02
349  Miscellaneous Fabricated Metal Products  1.50  1.23  0.69  0.27  0.17  0.21
3498  Fabricated Pipe and Pipe Fittings  1.66  0.85  0.72  0.18  0.19  0.18
35  Industrial and Commercial Machinery and Computer Equipment  0.53  1.41  0.46  0.36  0.07  0.11
353  Construction, Mining, and Materials Handling  -0.75  1.53  0.19  0.24  0.02  0.04
3531[7]  Construction Machinery and Equipment  -----  -----  0.00  0.00  -----  -----
355  Special Industry Machinery, Except Metalworking  0.02  1.10  0.34  0.25  0.04  0.06
356  General Industrial Machinery And Equipment  -1.79  1.57  0.08  0.11  0.01  0.01
3562  Ball and Roller Bearings  1.62  1.45  0.71  0.30  0.19  0.26
358  Refrigeration And Service Industry Machinery  -0.20  1.38  0.29  0.29  0.04  0.06
3612  Power, Distribution, and Specialty Transformers  2.02  1.40  0.79  0.24  0.26  0.31
3624  Carbon and Graphite Products  0.31  0.54  0.40  0.14  0.06  0.06
369[7]  Miscellaneous Electrical Machinery, Equipment, and Supplies  -----  -----  0.00  0.00  -----  -----
37  Transport Equipment  1.35  1.00  0.66  0.23  0.15  0.17
3714  Motor Vehicle Parts and Accessories  -1.00  1.21  0.15  0.16  0.02  0.02
3731  Ship Building and Repairing  0.42  0.83  0.43  0.21  0.07  0.07
39  Miscellaneous Manufacturing Industries  0.11  0.98  0.36  0.23  0.05  0.06
3949  Sporting and Athletic Goods, NEC  -1.03  1.65  0.15  0.21  0.02  0.03
E  Transport, Comm, Electric, Gas, & Sanitary Services (Major Grp: 40-49)  0.09  1.04  0.35  0.24  0.05  0.06
4111  Local and Suburban Transit  -1.36  1.47  0.11  0.15  0.01  0.02
4212[4]  Local Trucking Without Storage  -----  -----  0.00  0.00  0.00  0.00
4213  Trucking, Except Local  2.49  1.52  0.86  0.19  0.36  0.40
45[7]  Transportation by Air  -----  -----  0.00  0.00  -----  -----
4911  Electric Services  -1.66  1.62  0.09  0.13  0.01  0.02
50[7]  Wholesale Trade-durable Goods  -----  -----  0.00  0.00  -----  -----
508  Machinery, Equipment, and Supplies  1.59  1.30  0.71  0.27  0.18  0.23
5141  Groceries, General Line  0.87  2.01  0.54  0.51  0.10  0.19
5169  Chemicals and Allied Products, Not Elsewhere Classified  -2.16  1.72  0.05  0.09  0.01  0.01
5511[7]  Motor Vehicle Dealers (New and Used)  -----  -----  0.00  0.00  -----  -----
5812  Eating Places  1.90  1.48  0.77  0.27  0.23  0.31
I[4]  Services (Major Groups: 70-89)  -----  -----  0.00  0.00  0.00  0.00
73[7]  Business Services  -----  -----  0.00  0.00  -----  -----
751[8]  Automotive Rental And Leasing, Without Drivers  -----  -----  -----  -----  0.00  0.00
753  Automotive Repair Shops  -0.52  1.54  0.23  0.28  0.03  0.05
7538[8]  General Automotive Repair Shops  -----  -----  -----  -----  0.00  0.00
76  Miscellaneous Repair Services  -1.29  1.53  0.12  0.16  0.01  0.02
J[7]  Public Administration (Major Groups: 91-99)  -----  -----  0.00  0.00  -----  -----
9224[7]  Fire Protection  -----  -----  0.00  0.00  -----  -----
9431[8]  Administration of Public Health Programs  -----  -----  -----  -----  0.00  0.00
9651[8]  Regulation, Licensing, and Inspection of Misc. Commercial Sectors  -----  -----  -----  -----  0.00  0.00
[4] None of the measurements exceeded PEL (0.200 mg·m⁻³), inclusive.
[7] None of the measurements exceeded PEL (0.200 mg·m⁻³), inclusive, and no data available for administrative personnel.
[8] None of the measurements exceeded PEL (0.200 mg·m⁻³), inclusive, and no data available for minimally skilled workers/labourers.
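The two probability columns of Table A5 can be linked through the occupation coefficients. The sketch below assumes the model is additive on the log-odds scale, so that moving from minimally skilled workers (CAT 0, b = 0.73) to administrative personnel (CAT 3, b = -1.65) shifts the log-odds by -1.65 − 0.73 = -2.38 whatever the industry. The additive structure is an assumption that the tabulated values appear consistent with, and the function names are illustrative.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


B_MIN_SKILLED = 0.73  # CAT 0 coefficient from Table A5
B_ADMIN = -1.65       # CAT 3 coefficient from Table A5


def admin_from_min_skilled(p_min):
    """Administrative-personnel probability implied by the minimally skilled one.

    ASSUMPTION: occupation enters the model additively on the log-odds scale.
    """
    return expit(logit(p_min) + (B_ADMIN - B_MIN_SKILLED))


# Minimally skilled probabilities for three SIC codes in Table A5.
examples = {"1623": 0.94, "1711": 0.88, "1799": 0.80}
for code, p_min in examples.items():
    print(f"SIC {code}: implied admin probability = {admin_from_min_skilled(p_min):.2f}")
```

For these three codes the implied values fall close to the tabulated administrative-personnel probabilities of 0.59, 0.40, and 0.27.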
Supplementary Table A6: Regrouped NAICS coefficients and Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles in 1980 for Minimally Skilled Workers and Administrative Personnel

Covariate  Description  b  SE(b)  Minimally Skilled Workers: predicted probability  SE  Administrative Personnel: predicted probability  SE
CAT 0  Minimally Skilled Workers/Labourers  0.64  0.26  -----  -----  -----  -----
CAT 1  Skilled Workers/Operators  0.41  0.25  -----  -----  -----  -----
CAT 2  Supervisor/Foremen  -0.44  0.54  -----  -----  -----  -----
CAT 3  Administrative Personnel  -1.25  0.65  -----  -----  -----  -----
22111  Electric Power Generation  -0.18  1.34  0.26  0.26  0.05  0.07
2362  Nonresidential Building Construction  0.82  1.42  0.48  0.36  0.12  0.18
23711  Water and Sewer Line and Related Structures Construction  2.99  1.27  0.89  0.13  0.55  0.37
23712[5]  Oil and Gas Pipeline and Related Structures Construction  -----  -----  1.00  0.00  -----  -----
23731  Highway, Street, and Bridge Construction  -2.04  1.00  0.05  0.05  0.01  0.01
238  Specialty Trade Contractors  -1.36  1.79  0.10  0.16  0.02  0.03
2381  Foundation, Structure, and Building Exterior Contractors  0.08  1.21  0.31  0.26  0.06  0.09
23816  Roofing Contractors  0.84  0.35  0.49  0.08  0.13  0.09
2382  Building Equipment Contractors  -0.08  1.60  0.27  0.32  0.05  0.09
23822  Plumbing, Heating, and Air-Conditioning Contractors  2.37  1.26  0.81  0.19  0.40  0.36
23832  Painting and Wall Covering Contractors  -1.54  1.54  0.08  0.12  0.01  0.02
23839  Other Building Finishing Contractors  1.30  1.46  0.60  0.36  0.19  0.25
23891  Site Preparation Contractors  -0.56  1.33  0.19  0.21  0.03  0.05
31_33  Manufacturing  -2.22  1.10  0.04  0.05  0.01  0.01
311[8]  Food Manufacturing  -----  -----  -----  -----  0.00  0.00
31161  Animal Slaughtering and Processing  1.45  1.00  0.64  0.24  0.21  0.21
314  Textile Product Mills  -0.40  1.44  0.22  0.25  0.04  0.06
32111  Sawmills and Wood Preservation  -1.78  0.94  0.06  0.06  0.01  0.01
32121  Veneer, Plywood, and Engineered Wood Product Manufacturing  0.93  1.47  0.51  0.37  0.14  0.20
32311  Printing  0.00  0.00  0.30  0.37  0.06  0.11
32411  Petroleum Refineries  -2.56  1.71  0.03  0.05  0.00  0.01
32412  Asphalt Paving, Roofing, and Saturated Materials Manufacturing  -1.41  0.94  0.09  0.08  0.01  0.02
32419  Other Petroleum and Coal Products Manufacturing  0.48  0.97  0.40  0.24  0.09  0.10
325  Chemical Manufacturing  -1.05  1.23  0.12  0.14  0.02  0.03
32511  Petrochemical Manufacturing  -1.86  1.78  0.06  0.10  0.01  0.02
32518  Other Basic Inorganic Chemical Manufacturing  -1.29  1.53  0.10  0.14  0.02  0.03
32519  Other Basic Organic Chemical Manufacturing  -0.42  1.32  0.21  0.22  0.04  0.06
32621  Tire Manufacturing  1.21  0.59  0.58  0.15  0.17  0.14
32629  Other Rubber Product Manufacturing  0.80  1.02  0.48  0.26  0.12  0.14
32712  Clay Building Material and Refractories Manufacturing  0.75  1.18  0.46  0.30  0.12  0.14
32733  Concrete Pipe, Brick, and Block Manufacturing  -0.47  1.39  0.20  0.23  0.04  0.06
33111  Iron and Steel Mills and Ferroalloy Manufacturing  -1.53  0.78  0.08  0.06  0.01  0.01
33131  Alumina and Aluminum Production and Processing  -0.34  0.45  0.23  0.08  0.04  0.04
33142  Copper Rolling, Drawing, Extruding, and Alloying  -0.78  1.59  0.16  0.21  0.03  0.05
33151  Ferrous Metal Foundries  0.17  0.40  0.33  0.09  0.07  0.06
33152  Nonferrous Metal Foundries  0.25  0.58  0.34  0.13  0.07  0.07
332  Fabricated Metal Product Manufacturing  -0.28  0.94  0.24  0.17  0.04  0.05
33211  Forging and Stamping  -0.36  0.84  0.22  0.15  0.04  0.05
33221  Cutlery and Handtool Manufacturing  -1.37  1.58  0.09  0.14  0.02  0.03
33232  Ornamental and Architectural Metal Products Manufacturing  0.23  1.48  0.34  0.34  0.07  0.11
33281  Coating, Engraving, Heat Treating, and Allied Activities  0.03  0.55  0.30  0.12  0.06  0.05
33299  All Other Fabricated Metal Product Manufacturing  1.94  0.74  0.74  0.14  0.30  0.22
333  Machinery Manufacturing  0.09  0.93  0.31  0.20  0.06  0.07
33312[7]  Construction Machinery Manufacturing  -----  -----  0.00  0.00  -----  -----
33313  Mining and Oil and Gas Field Machinery Manufacturing  -0.61  1.64  0.18  0.25  0.03  0.06
33329  Other Industrial Machinery Manufacturing  2.02  1.59  0.76  0.30  0.32  0.39
33331  Commercial and Service Industry Machinery Manufacturing  -0.86  1.42  0.15  0.18  0.03  0.04
33531  Electrical Equipment Manufacturing  1.96  1.26  0.74  0.24  0.30  0.32
33599  All Other Electrical Equipment and Component Manufacturing  0.40  0.60  0.38  0.14  0.08  0.08
336  Transportation Equipment Manufacturing  0.58  1.09  0.42  0.27  0.10  0.12
33621[7]  Motor Vehicle Body and Trailer Manufacturing  -----  -----  0.00  0.00  -----  -----
33636  Motor Vehicle Seating and Interior Trim Manufacturing  -0.54  2.01  0.19  0.32  0.03  0.07
33661  Ship and Boat Building  1.00  0.85  0.53  0.21  0.14  0.14
3399  Other Miscellaneous Manufacturing  -0.45  1.52  0.21  0.25  0.04  0.06
33992  Sporting and Athletic Goods Manufacturing  -1.03  1.78  0.13  0.20  0.02  0.04
33999  All Other Miscellaneous Manufacturing  -0.45  1.31  0.21  0.22  0.04  0.06
423[7]  Merchant Wholesalers, Durable Goods  -----  -----  0.00  0.00  -----  -----
42383  Industrial Machinery and Equipment Merchant Wholesalers  3.18  1.83  0.91  0.16  0.60  0.48
42469  Other Chemical and Allied Products Merchant Wholesalers  -2.04  1.88  0.05  0.09  0.01  0.02
44111[7]  New Car Dealers  -----  -----  0.00  0.00  -----  -----
44511  Supermarkets and Other Grocery (except Convenience) Stores  1.13  2.17  0.56  0.54  0.16  0.31
48_49  Transportation and Warehousing  -0.83  1.34  0.15  0.18  0.03  0.04
48411[4]  General Freight Trucking, Local  -----  -----  0.00  0.00  0.00  0.00
48412  General Freight Trucking, Long-Distance  2.76  1.64  0.87  0.19  0.49  0.46
48511  Urban Transit Systems  -1.33  1.58  0.10  0.14  0.02  0.03
51711[8]  Wired Telecommunications Carriers  -----  -----  -----  -----  0.00  0.00
5321[8]  Automotive Equipment Rental and Leasing  -----  -----  -----  -----  0.00  0.00
561[7]  Administrative and Support Services  -----  -----  0.00  0.00  -----  -----
56221[7]  Waste Treatment and Disposal  -----  -----  0.00  0.00  -----  -----
61111[8]  Elementary and Secondary Schools  -----  -----  -----  -----  0.00  0.00
72211  Full-Service Restaurants  2.20  1.62  0.79  0.28  0.36  0.42
81[7]  Other Services (except Public Administration)  -----  -----  0.00  0.00  -----  -----
81111[8]  Automotive Mechanical and Electrical Repair and Maintenance  -----  -----  -----  -----  0.00  0.00
92[7]  Public Administration  -----  -----  0.00  0.00  -----  -----
92216[7]  Fire Protection  -----  -----  0.00  0.00  -----  -----
92312[8]  Administration of Public Health Programs  -----  -----  -----  -----  0.00  0.00
92615[8]  Regulation, Licensing, and Inspection of Misc. Commercial Sectors  -----  -----  -----  -----  0.00  0.00
[5] All measurements exceeded PEL (0.200 mg·m⁻³), inclusive, and no data available for administrative personnel.
[7] None of the measurements exceeded PEL (0.200 mg·m⁻³), inclusive, and no data available for administrative personnel.
[8] None of the measurements exceeded PEL (0.200 mg·m⁻³), inclusive, and no data available for minimally skilled workers/labourers.

Supplementary Table A7: Comparison between OSHA databank and IMIS databank incorporating Occupation groups Minimally Skilled Workers and Administrative Personnel of Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles standardized to 1980 for select SIC codes.

SIC  Description  OSHA (Ignoring Occupation): predicted probability  SE  IMIS (Minimally Skilled Workers): predicted probability  SE  IMIS (Admin Personnel): predicted probability  SE
1611  Highway and Street Construction, Except Elevated Highways  0.05  0.04  0.05  0.05  0.01  0.01
1622  Bridge, Tunnel, and Elevated Highway Construction  0.16  0.18  0.17  0.26  0.02  0.04
1623  Water, Sewer, Pipeline, and Comm. and Power Line Construction  0.94  0.05  0.94  0.06  0.59  0.32
1629  Heavy Construction, Not Elsewhere Classified  0.03  0.05  0.24  0.34  0.03  0.06
1711  Plumbing, Heating, and Air-Conditioning  0.91  0.12  0.88  0.16  0.40  0.42
1721  Painting and Paper Hanging  0.10  0.10  0.09  0.12  0.01  0.02
174  Masonry, Stonework, Tile Setting, and Plastering  0.38  0.29  0.39  0.28  0.05  0.08
1761  Roofing, Siding, and Sheet Metal Work  0.42  0.06  0.49  0.08  0.08  0.07
1771  Concrete Work  0.07  0.11  0.13  0.19  0.01  0.03
1795  Wrecking and Demolition  0.36  0.26  0.38  0.33  0.05  0.08
1799  Special Trade Contractors, NEC  0.71  0.18  0.80  0.16  0.27  0.25
D  Manufacturing (Major Groups: 20-39)  0.28  0.13  0.59  0.23  0.12  0.13
20  Food and Kindred Products  0.35  0.22  0.19  0.21  0.02  0.03
2011  Meat Packing Plants  0.72  0.20  0.61  0.25  0.13  0.15
2396  Automotive Trimmings, Apparel Findings, and Related Products  0.17  0.27  0.19  0.29  0.02  0.04
2421  Sawmills and Planing Mills, General  0.31  0.25  0.75  0.28  0.22  0.25
2491  Wood Preserving  0.01  0.02  0.02  0.03  0.00  0.00
2493  Reconstituted Wood Products  0.48  0.34  0.52  0.34  0.09  0.13
2759  Commercial Printing, NEC  0.15  0.17  0.30  0.32  0.04  0.06
2819  Industrial Inorganic Chemicals, NEC  0.09  0.09  0.06  0.07  0.01  0.01
2865  Cyclic Organic Crudes and Intermed., and Organic Dyes and Pigments  0.14  0.10  0.14  0.12  0.01  0.02
2911  Petroleum Refining  0.05  0.05  0.04  0.05  0.00  0.01
2951  Asphalt Paving Mixtures and Blocks  0.14  0.10  0.37  0.24  0.05  0.06
2999  Products of Petroleum and Coal, Not Elsewhere Classified  0.44  0.23  0.48  0.23  0.08  0.09
3011  Tires and Inner Tubes  0.46  0.12  0.58  0.14  0.11  0.10
3069  Fabricated Rubber Products, Not Elsewhere Classified  0.65  0.12  0.58  0.21  0.11  0.12
3272  Concrete Products, Except Block and Brick  0.30  0.19  0.23  0.22  0.03  0.04
3297  Nonclay Refractories  0.22  0.14  0.80  0.27  0.27  0.35
33  Primary Metal Industries  0.66  0.28  0.56  0.34  0.10  0.15
3312  Steel Works, Blast Furnaces (Including Coke Ovens), and Rolling Mills  0.15  0.07  0.12  0.07  0.01  0.01
3321  Gray and Ductile Iron Foundries  0.27  0.05  0.49  0.10  0.08  0.07
3325  Steel Foundries, Not Elsewhere Classified  0.06  0.05  0.07  0.09  0.01  0.01
3334  Primary Production of Aluminum  0.17  0.06  0.19  0.08  0.02  0.02
3341  Secondary Smelting and Refining of Nonferrous Metals  0.03  0.05  0.07  0.10  0.01  0.01
3351  Rolling, Drawing, and Extruding of Copper  0.31  0.37  0.34  0.38  0.05  0.08
3353  Aluminum Sheet, Plate, and Foil  0.44  0.25  0.45  0.33  0.07  0.10
3355  Aluminum Rolling and Drawing, Not Elsewhere Classified  0.31  0.23  0.34  0.24  0.04  0.06
3365  Aluminum Foundries  0.26  0.12  0.24  0.16  0.03  0.03
3366  Copper Foundries  0.26  0.11  0.39  0.17  0.06  0.06
3398  Metal Heat Treating  0.34  0.20  0.25  0.19  0.03  0.04
34  Fabricated Metal Products, Except Machinery and Transport. Equipment  0.39  0.17  0.46  0.22  0.07  0.08
3423  Hand and Edge Tools, Except Machine Tools and Handsaws  0.07  0.10  0.11  0.14  0.01  0.02
3444  Sheet Metal Work  0.33  0.31  0.36  0.32  0.05  0.08
3462  Iron and Steel Forgings  0.34  0.15  0.24  0.14  0.03  0.03
3479  Coating, Engraving, and Allied Services, Not Elsewhere Classified  0.15  0.06  0.19  0.10  0.02  0.02
349  Miscellaneous Fabricated Metal Products  0.49  0.22  0.69  0.27  0.17  0.21
3498  Fabricated Pipe and Pipe Fittings  0.51  0.18  0.72  0.18  0.19  0.18
35  Industrial and Commercial Machinery and Computer Equipment  0.71  0.27  0.46  0.36  0.07  0.11
353  Construction, Mining, and Materials Handling  0.17  0.22  0.19  0.24  0.02  0.04
355  Special Industry Machinery, Except Metalworking  0.48  0.25  0.34  0.25  0.04  0.06
356  General Industrial Machinery And Equipment  0.06  0.09  0.08  0.11  0.01  0.01
3562  Ball and Roller Bearings  0.69  0.32  0.71  0.30  0.19  0.26
358  Refrigeration And Service Industry Machinery  0.13  0.18  0.29  0.29  0.04  0.06
3612  Power, Distribution, and Specialty Transformers  0.68  0.20  0.79  0.24  0.26  0.31
3624  Carbon and Graphite Products  0.37  0.11  0.40  0.14  0.06  0.06
37  Transport Equipment  0.38  0.20  0.66  0.23  0.15  0.17
3714  Motor Vehicle Parts and Accessories  0.14  0.15  0.15  0.16  0.02  0.02
3731  Ship Building and Repairing  0.38  0.18  0.43  0.21  0.07  0.07
39  Miscellaneous Manufacturing Industries  0.22  0.16  0.36  0.23  0.05  0.06
3949  Sporting and Athletic Goods, NEC  0.14  0.20  0.15  0.21  0.02  0.03
E  Transport, Comm, Electric, Gas, and Sanitary Services (Major Grp: 40-49)  0.29  0.19  0.35  0.24  0.05  0.06
4111  Local and Suburban Transit  0.10  0.14  0.11  0.15  0.01  0.02
4213  Trucking, Except Local  0.79  0.20  0.86  0.19  0.36  0.40
4911  Electric Services  0.06  0.09  0.09  0.13  0.01  0.02
508  Machinery, Equipment, and Supplies  0.76  0.23  0.71  0.27  0.18  0.23
5169  Chemicals and Allied Products, Not Elsewhere Classified  0.05  0.08  0.05  0.09  0.01  0.01
5812  Eating Places  0.82  0.20  0.77  0.27  0.23  0.31
753  Automotive Repair Shops  0.06  0.08  0.23  0.28  0.03  0.05
76  Miscellaneous Repair Services  0.06  0.09  0.12  0.16  0.01  0.02

Supplementary Table A8: Comparison between OSHA and IMIS databank incorporating Occupation groups Minimally Skilled Workers and Administrative Personnel of Predicted Probabilities of exceeding Permissible Exposure Limit (0.2 mg·m⁻³) for Polycyclic Aromatic Hydrocarbons assessed as Concentrations of Coal Tar Pitch Volatiles standardized to 1980 for select NAICS codes.
Columns: NAICS code; Description; then the predicted probability and its standard error (SE) for each of three groups: OSHA (ignoring occupation), IMIS (minimally skilled workers), IMIS (administrative personnel)
22111 Electric Power Generation 0.15 0.16 0.26 0.26 0.05 0.07
2362 Nonresidential Building Construction 0.44 0.35 0.48 0.36 0.12 0.18
23711 Water and Sewer Line and Related Structures Construction 0.72 0.20 0.89 0.13 0.55 0.37
23731 Highway, Street, and Bridge Construction 0.05 0.04 0.05 0.05 0.01 0.01
238 Specialty Trade Contractors 0.06 0.10 0.10 0.16 0.02 0.03
2381 Foundation, Structure, and Building Exterior Contractors 0.27 0.24 0.31 0.26 0.06 0.09
23816 Roofing Contractors 0.41 0.07 0.49 0.08 0.13 0.09
2382 Building Equipment Contractors 0.21 0.27 0.27 0.32 0.05 0.09
23822 Plumbing, Heating, and Air-Conditioning Contractors 0.85 0.15 0.81 0.19 0.40 0.36
23832 Painting and Wall Covering Contractors 0.09 0.09 0.08 0.12 0.01 0.02
23839 Other Building Finishing Contractors 0.52 0.28 0.60 0.36 0.19 0.25
23891 Site Preparation Contractors 0.30 0.21 0.19 0.21 0.03 0.05
31-33 Manufacturing 0.36 0.12 0.04 0.05 0.01 0.01
314 Textile Product Mills 0.50 0.42 0.22 0.25 0.04 0.06
32111 Sawmills and Wood Preservation 0.03 0.03 0.06 0.06 0.01 0.01
32121 Veneer, Plywood, and Engineered Wood Product Manufacturing 0.45 0.36 0.51 0.37 0.14 0.20
32311 Printing 0.13 0.17 0.30 0.37 0.06 0.11
32411 Petroleum Refineries 0.04 0.05 0.03 0.05 0.00 0.01
32412 Asphalt Paving, Roofing, and Saturated Materials Manufacturing 0.21 0.08 0.09 0.08 0.01 0.02
32419 Other Petroleum and Coal Products Manufacturing 0.36 0.22 0.40 0.24 0.09 0.10
32511 Petrochemical Manufacturing 0.04 0.07 0.06 0.10 0.01 0.02
32518 Other Basic Inorganic Chemical Manufacturing 0.45 0.23 0.10 0.14 0.02 0.03
32519 Other Basic Organic Chemical Manufacturing 0.11 0.10 0.21 0.22 0.04 0.06
32621 Tire Manufacturing 0.44 0.13 0.58 0.15 0.17 0.14
32629 Other Rubber Product Manufacturing 0.63 0.14 0.48 0.26 0.12 0.14
32712 Clay Building Material and Refractories Manufacturing 0.17 0.10 0.46 0.30 0.12 0.14
32733 Concrete Pipe, Brick, and Block Manufacturing 0.11 0.12 0.20 0.23 0.04 0.06
33111 Iron and Steel Mills and Ferroalloy Manufacturing 0.07 0.04 0.08 0.06 0.01 0.01
33131 Alumina and Aluminum Production and Processing 0.18 0.06 0.23 0.08 0.04 0.04
33142 Copper Rolling, Drawing, Extruding, and Alloying 0.14 0.20 0.16 0.21 0.03 0.05
33151 Ferrous Metal Foundries 0.20 0.04 0.33 0.09 0.07 0.06
33152 Nonferrous Metal Foundries 0.23 0.08 0.34 0.13 0.07 0.07
332 Fabricated Metal Product Manufacturing 0.09 0.14 0.24 0.17 0.04 0.05
33211 Forging and Stamping 0.38 0.15 0.22 0.15 0.04 0.05
33221 Cutlery and Handtool Manufacturing 0.06 0.09 0.09 0.14 0.02 0.03
33232 Ornamental and Architectural Metal Products Manufacturing 0.16 0.18 0.34 0.34 0.07 0.11
33281 Coating, Engraving, Heat Treating, and Allied Activities 0.22 0.08 0.30 0.12 0.06 0.05
33299 All Other Fabricated Metal Product Manufacturing 0.48 0.14 0.74 0.14 0.30 0.22
33313 Mining and Oil and Gas Field Machinery Manufacturing 0.16 0.22 0.18 0.25 0.03 0.06
33329 Other Industrial Machinery Manufacturing 0.83 0.21 0.76 0.30 0.32 0.39
33331 Commercial and Service Industry Machinery Manufacturing 0.34 0.21 0.15 0.18 0.03 0.04
33531 Electrical Equipment Manufacturing 0.71 0.18 0.74 0.24 0.30 0.32
33599 All Other Electrical Equipment and Component Manufacturing 0.30 0.11 0.38 0.14 0.08 0.08
33636 Motor Vehicle Seating and Interior Trim Manufacturing 0.16 0.27 0.19 0.32 0.03 0.07
33661 Ship and Boat Building 0.44 0.19 0.53 0.21 0.14 0.14
3399 Other Miscellaneous Manufacturing 0.12 0.15 0.21 0.25 0.04 0.06
33992 Sporting and Athletic Goods Manufacturing 0.12 0.18 0.13 0.20 0.02 0.04
33999 All Other Miscellaneous Manufacturing 0.44 0.23 0.21 0.22 0.04 0.06
42383 Industrial Machinery and Equipment Merchant Wholesalers 0.89 0.19 0.91 0.16 0.60 0.48
42469 Other Chemical and Allied Products Merchant Wholesalers 0.05 0.08 0.05 0.09 0.01 0.02
48-49 Transportation and Warehousing 0.14 0.12 0.15 0.18 0.03 0.04
48412 General Freight Trucking, Long-Distance 0.86 0.20 0.87 0.19 0.49 0.46
48511 Urban Transit Systems 0.09 0.12 0.10 0.14 0.02 0.03
72211 Full-Service Restaurants 0.71 0.33 0.79 0.28 0.36 0.42

Supplementary Table A9: Exposure assessment to PAHs and breast cancer risk stratified by hormone receptor statusⱡ
Columns: JEM Metric; Controls (%); ER/PR+: Cases (%), OR, 95% CI; ER/PR-: Cases (%), OR, 95% CI; Receptor effect
DOM Cumulative Exposure – Eq. (3.1)
  None 866 (75.9)  606 (74.2)  -----  122 (77.7)  -----
  Low (0.1–1.8) 91 (8.0)  66 (8.1)  1.16 0.82 1.64  7 (4.5)  0.59 0.26 1.32
  Medium (1.9–6.8) 91 (8.0)  73 (8.9)  1.23 0.88 1.73  12 (7.6)  0.90 0.47 1.73
  High (6.9–90.0) 93 (8.2)  72 (8.8)  0.97 0.69 1.37  16 (10.2)  1.17 0.66 2.09
  ptrend > 0.5  ptrend > 0.9  p > 0.8
NCIø Cumulative Exposure – Eq. (3.4)
  None 914 (80.1)  644 (78.8)  -----  128 (81.5)  -----
  Low (0.1–1.8) 77 (6.7)  50 (6.1)  1.03 0.70 1.51  3 (1.9)  0.30 0.09 0.99
  Medium (1.9–7.0) 75 (6.6)  66 (8.1)  1.34 0.94 1.92  14 (8.9)  1.32 0.71 2.46
  High (7.1–79.0) 75 (6.6)  57 (7.0)  0.95 0.65 1.39  12 (7.6)  1.08 0.56 2.07
  ptrend > 0.5  ptrend > 0.7  p > 0.8
PPM Ever-Never: Any level
  Never 454 (39.8)  262 (32.1)  -----  47 (29.9)  -----
  Ever 687 (60.2)  555 (67.9)  1.27 1.04 1.56  110 (70.1)  1.40 0.96 2.04
  p > 0.7
Ever-Never: At maximum level†
  Never 454 (39.8)  262 (32.1)  -----  47 (29.9)  -----
  Maximum level at low* 107 (09.4)  69 (08.4)  1.04 0.74 1.48  13 (08.3)  1.03 0.53 2.00
  Maximum level at medium¶ 178 (15.6)  122 (14.9)  1.15 0.86 1.53  30 (19.1)  1.55 0.94 2.56
  Maximum level at high 402 (35.2)  364 (44.6)  1.40 1.12 1.75  67 (42.7)  1.44 0.95 2.17
  ptrend < 0.01  ptrend = 0.06  p > 0.9
Duration (years) of exposure at any level
  None (0) 454 (39.7)  262 (32.1)  -----  47 (29.9)  -----
  Short (0.1–4.2) 229 (20.1)  173 (21.2)  1.35 1.04 1.75  35 (22.3)  1.47 0.92 2.38
  Moderate (4.3–13.0) 230 (20.2)  190 (23.3)  1.29 1.01 1.67  44 (28.0)  1.66 1.05 2.62
  Long (13.1–82.2) 228 (20.0)  192 (23.4)  1.17 0.90 1.53  31 (19.8)  1.08 0.65 1.78
  ptrend > 0.1  ptrend > 0.4  p > 0.9
Duration (years) of exposure at medium¶ or high levels
  None (0) 454 (39.7)  262 (32.1)  -----  47 (29.9)  -----
  Ever: Maximum at low level∆ 107 (09.4)  69 (08.4)  1.04 0.74 1.48  13 (08.3)  1.03 0.53 2.01
  Short (0.1–2.7) 194 (17.0)  153 (18.7)  1.38 1.05 1.80  27 (17.2)  1.30 0.77 2.17
  Moderate (2.8–9.0) 196 (17.2)  147 (18.0)  1.21 0.92 1.60  32 (20.4)  1.57 0.96 2.58
  Long (9.1–80.8) 190 (16.7)  186 (22.8)  1.37 1.04 1.80  38 (24.2)  1.55 0.96 2.52
  ptrend = 0.03  ptrend = 0.04  p > 0.4
Duration (years) of exposure at the high level
  None (0) 454 (39.8)  262 (32.1)  -----  47 (29.9)  -----
  Ever: Maximum at low* or medium¶ levels◊ 285 (25.0)  191 (23.4)  1.11 0.86 1.42  43 (27.4)  1.35 0.86 2.11
  Short (0.1–2.3) 134 (11.7)  120 (14.7)  1.58 1.17 2.14  21 (13.4)  1.47 0.83 2.58
  Moderate (2.4–7.4) 134 (11.7)  99 (12.1)  1.17 0.85 1.61  21 (13.4)  1.48 0.84 2.60
  Long (7.5–74.1) 134 (11.7)  145 (17.7)  1.44 1.07 1.94  25 (15.9)  1.37 0.79 2.38
  ptrend = 0.02  ptrend > 0.1  p > 0.9
Average Probability – Eq. (3.5)
  None (0) 454 (39.7)  262 (32.1)  -----  47 (29.9)  -----
  Low (0.01–0.02) 229 (20.1)  157 (19.2)  1.17 0.90 1.52  37 (23.6)  1.46 0.92 2.34
  Medium (0.03–0.07) 229 (20.1)  188 (23.0)  1.35 1.04 1.74  39 (24.8)  1.58 0.99 2.52
  High (0.08–0.88) 229 (20.1)  210 (25.7)  1.31 1.01 1.69  34 (21.7)  1.15 0.70 1.88
  ptrend < 0.02  ptrend > 0.3  p > 0.5
Weighted Duration (Years) – Eq. (3.6)
  None (0) 454 (39.7)  262 (32.1)  -----  47 (29.9)  -----
  Short (0.1–0.4) 229 (20.1)  165 (20.2)  1.21 0.94 1.57  42 (26.8)  1.67 1.06 2.63
  Moderate (0.5–1.7) 229 (20.1)  181 (22.2)  1.27 0.98 1.65  27 (17.2)  1.06 0.64 1.77
  Long (1.8–55.1) 229 (20.1)  209 (25.6)  1.34 1.04 1.73  41 (26.1)  1.46 0.91 2.34
  ptrend < 0.02  ptrend > 0.2  p > 0.7
ⱡ Adjusted for age, centre, education, ethnicity, smoking (pack-years). All ptrend values are calculated by treating ordinal categories as continuous values (ptrend values are calculated by treating ordinal categories as continuous values for the exposed categories only)
ø Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher. Jobs that have a probability of low become grouped with the unexposed jobs using this approach.
† Maximum level classification, regardless of duration, is the maximum exposure level to which the participant was exposed across all occupations.
∆ Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher. Jobs that have a probability of low become grouped with the unexposed jobs using this approach.
* Analysis for exposure at low level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = 0.1–2.9% in at least one job
¶ Analysis for exposure at medium level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = 3.0–8.9% in at least one job
Analysis for exposure at high level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ ≥ 9% in at least one job
◊ To ensure the referent group is truly unexposed, a nuisance variable was created for the low/medium-exposed group (value = 1 if the highest duration was at low or medium level exposure, else 0)

Supplementary Table A10: Sensitivity analysis of exposure assessment to polycyclic aromatic hydrocarbons and breast cancer risk excluding cases not registered with the BC Screening Mammography Programⱡ
Columns: JEM Metric; Cases (%); Controls (%); OR; 95% CI
DOM Cumulative Exposure – Eq. (3.1)
  None 660 (75.7)  866 (75.9)  -----
  Low (0.1–1.8) 63 (7.2)  91 (8.0)  1.09 0.77 1.55
  Medium (1.9–6.8) 69 (7.9)  91 (8.0)  1.16 0.83 1.64
  High (6.9–90.0) 80 (9.2)  93 (8.2)  1.04 0.75 1.45
  ptrend > 0.5
NCIø Cumulative Exposure – Eq. (3.4)
  None 696 (79.8)  914 (80.1)  -----
  Low (0.1–1.8) 48 (5.5)  77 (6.7)  0.98 0.66 1.44
  Medium (1.9–7.0) 67 (7.7)  75 (6.6)  1.36 0.95 1.94
  High (7.1–79.0) 61 (7.0)  75 (6.6)  0.97 0.67 1.40
  ptrend > 0.4
PPM Ever-Never: Any level
  Never 280 (32.1)  454 (39.8)  -----
  Ever 592 (67.9)  687 (60.2)  1.29 1.06 1.58
Ever-Never: At maximum level†
  Never 280 (32.1)  454 (39.8)  -----
  Maximum level at low* 77 (08.8)  107 (09.4)  1.04 0.74 1.47
  Maximum level at medium¶ 139 (15.9)  178 (15.6)  1.21 0.92 1.60
  Maximum level at high 376 (43.1)  402 (35.2)  1.41 1.13 1.75
  ptrend < 0.01
Duration (years) of exposure at any level
  None (0) 280 (32.1)  454 (39.8)  -----
  Short (0.1–4.2) 184 (21.1)  229 (20.1)  1.40 1.09 1.81
  Moderate (4.3–13.0) 204 (23.4)  230 (20.1)  1.31 1.02 1.69
  Long (13.1–82.2) 204 (23.4)  228 (20.0)  1.16 0.90 1.51
  ptrend > 0.1
Duration (years) of exposure at medium¶ or high levels
  None (0) 280 (32.1)  454 (39.8)  -----
  Ever: Maximum at low level∆ 77 (08.8)  107 (09.4)  1.04 0.74 1.46
  Short (0.1–2.7) 161 (18.5)  194 (17.0)  1.41 1.08 1.84
  Moderate (2.8–9.0) 159 (18.2)  196 (17.2)  1.27 0.97 1.66
  Long (9.1–80.8) 195 (22.4)  190 (16.7)  1.35 1.04 1.77
  ptrend = 0.03
Duration (years) of exposure at the high level
  None (0) 280 (32.1)  454 (39.7)  -----
  Ever: Maximum at low* or medium¶ levels◊ 216 (24.8)  285 (25.2)  1.14 0.90 1.46
  Short (0.1–2.3) 124 (14.2)  134 (11.7)  1.60 1.19 2.15
  Moderate (2.4–7.4) 107 (12.3)  134 (11.7)  1.27 0.93 1.73
  Long (7.5–74.1) 145 (16.6)  134 (11.7)  1.36 1.01 1.83
  ptrend = 0.03
Average Probability – Eq. (3.5)
  None (0) 280 (32.1)  454 (39.7)  -----
  Low (0.01–0.02) 189 (21.7)  229 (20.1)  1.30 1.01 1.67
  Medium (0.03–0.07) 198 (22.7)  229 (20.1)  1.35 1.05 1.74
  High (0.08–0.88) 205 (23.5)  229 (20.1)  1.22 0.94 1.58
  ptrend = 0.07
Weighted Duration (Years) – Eq. (3.6)
  None (0) 280 (32.1)  454 (39.7)  -----
  Short (0.1–0.4) 198 (22.7)  229 (20.1)  1.37 1.07 1.76
  Moderate (0.5–1.7) 177 (20.3)  229 (20.1)  1.17 0.91 1.52
  Long (1.8–55.1) 217 (24.9)  229 (20.1)  1.34 1.04 1.72
  ptrend < 0.05
ⱡ Adjusted for age, centre, education, ethnicity, smoking (pack-years). All ptrend values are calculated by treating ordinal categories as continuous values (ptrend values are calculated by treating ordinal categories as continuous values for the exposed categories only)
ø Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher. Jobs that have a probability of low become grouped with the unexposed jobs using this approach.
† Maximum level classification, regardless of duration, is the maximum exposure level to which the participant was exposed across all occupations.
∆ Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher. Jobs that have a probability of low become grouped with the unexposed jobs using this approach.
* Analysis for exposure at low level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = 0.1–2.9% in at least one job
¶ Analysis for exposure at medium level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = 3.0–8.9% in at least one job
Analysis for exposure at high level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ ≥ 9% in at least one job
◊ To ensure the referent group is truly unexposed, a nuisance variable was created for the low/medium-exposed group (value = 1 if the highest duration was at low or medium level exposure, else 0)

Supplementary Table A11: Exposure assessment to polycyclic aromatic hydrocarbons and breast cancer risk stratified by smoking status‡
Columns: JEM Metric; Smoking: Never: Cases (%), Controls (%), OR, 95% CI; Smoking: Ever: Cases (%), Controls (%), OR, 95% CI; p-interaction
DOM Cumulative Exposure – Eq. (3.1)
  None 504 (77.4) 505 (78.3)  -----  305 (70.1) 357 (72.7)  -----
  Low (0.1–1.8) 42 (6.5) 45 (7.0)  1.14 0.72 1.80  37 (8.5) 46 (9.4)  0.96 0.60 1.55
  Medium (1.9–6.8) 43 (6.6) 49 (7.6)  0.90 0.58 1.41  52 (12.0) 41 (8.4)  1.53 0.97 2.40
  High (6.9–90.0) 62 (9.5) 46 (7.1)  1.22 0.80 1.85  41 (9.4) 47 (9.6)  0.99 0.63 1.57
  ptrend > 0.5  ptrend > 0.4  pint-trend > 0.7
NCIø Cumulative Exposure – Eq. (3.4)
  None 533 (81.9) 527 (81.7)  -----  322 (74.0) 382 (77.8)  -----
  Low (0.1–1.8) 32 (4.9) 41 (6.4)  0.98 0.59 1.61  28 (6.4) 36 (7.3)  0.93 0.54 1.59
  Medium (1.9–7.0) 39 (6.0) 38 (5.9)  1.08 0.67 1.76  51 (11.7) 37 (7.5)  1.74 1.10 2.75
  High (7.1–79.0) 47 (7.2) 39 (6.0)  1.07 0.67 1.70  34 (7.8) 36 (7.3)  1.08 0.65 1.78
  ptrend > 0.7  ptrend > 0.1  pint-trend > 0.3
PPM Ever-Never: Any level
  Never 217 (33.3) 268 (41.5)  -----  124 (28.5) 182 (37.1)  -----
  Ever 434 (66.7) 377 (58.5)  1.30 1.02 1.81  311 (71.5) 309 (62.9)  1.35 1.01 1.81
  p > 0.6
Ever-Never: At maximum level†
  Never 217 (33.3) 268 (41.5)  -----  124 (28.5) 182 (37.1)  -----
  Maximum level at low* 52 (08.0) 60 (09.3)  0.98 0.64 1.51  38 (08.7) 47 (09.6)  1.07 0.65 1.75
  Maximum level at medium¶ 105 (16.1) 105 (16.3)  1.21 0.86 1.70  70 (16.1) 73 (14.9)  1.31 0.87 1.99
  Maximum level at high 277 (42.5) 212 (32.9)  1.44 1.10 1.89  203 (46.7) 189 (38.5)  1.43 1.04 1.98
  ptrend < 0.01  ptrend = 0.02  pint-trend > 0.7
Duration (years) of exposure at any level
  None (0) 217 (33.3) 268 (41.5)  -----  124 (28.5) 182 (37.1)  -----
  Short (0.1–4.2) 140 (21.5) 123 (19.1)  1.55 1.13 2.13  95 (21.8) 105 (21.4)  1.28 0.88 1.85
  Moderate (4.3–13.0) 138 (21.2) 122 (18.9)  1.26 0.92 1.74  116 (26.7) 108 (22.0)  1.47 1.02 2.11
  Long (13.1–82.2) 156 (24.0) 132 (20.5)  1.11 0.81 1.52  100 (23.0) 96 (19.5)  1.31 0.89 1.92
  ptrend > 0.4  ptrend = 0.09  pint-trend > 0.2
Duration (years) of exposure at medium¶ or high levels
  None (0) 217 (33.3) 268 (41.5)  -----  124 (28.5) 182 (37.1)  -----
  Ever: Maximum at low level∆ 52 (08.0) 60 (09.3)  0.98 0.63 1.51  38 (08.7) 47 (09.6)  1.07 0.65 1.75
  Short (0.1–2.7) 121 (18.6) 108 (16.7)  1.47 1.05 2.04  82 (18.9) 86 (17.5)  1.37 0.93 2.04
  Moderate (2.8–9.0) 105 (16.1) 101 (15.7)  1.27 0.90 1.79  96 (22.1) 94 (19.1)  1.39 0.94 2.03
  Long (9.1–80.8) 156 (24.0) 108 (16.7)  1.34 0.97 1.87  95 (21.8) 82 (16.7)  1.50 1.01 2.22
  ptrend = 0.04  ptrend = 0.07  pint-trend > 0.5
Duration (years) of exposure at the high level
  None (0) 217 (33.3) 268 (41.5)  -----  124 (28.5) 182 (37.1)  -----
  Ever: Maximum at low* or medium¶ levels◊ 157 (24.2) 165 (25.6)  1.12 0.83 1.51  108 (24.8) 120 (24.4)  1.22 0.86 1.75
  Short (0.1–2.3) 82 (12.6) 69 (10.7)  1.56 1.06 2.28  74 (17.0) 65 (13.2)  1.58 1.04 2.41
  Moderate (2.4–7.4) 73 (11.2) 69 (10.7)  1.29 0.88 1.92  62 (14.3) 64 (13.0)  1.31 0.85 2.04
  Long (7.5–74.1) 122 (18.7) 74 (11.5)  1.47 1.02 2.12  67 (15.4) 60 (12.2)  1.44 0.93 2.21
  ptrend = 0.03  ptrend < 0.10  pint-trend > 0.9
Average Probability – Eq. (3.5)
  None (0) 217 (33.3) 268 (41.5)  -----  124 (28.5) 182 (37.1)  -----
  Low (0.01–0.02) 124 (19.0) 120 (18.6)  1.33 0.96 1.83  94 (21.6) 109 (22.2)  1.17 0.81 1.69
  Medium (0.03–0.07) 140 (21.5) 129 (20.0)  1.28 0.94 1.75  115 (26.4) 100 (20.4)  1.58 1.09 2.29
  High (0.08–0.88) 170 (26.1) 128 (19.9)  1.29 0.94 1.77  102 (23.4) 100 (20.4)  1.35 0.93 1.98
  ptrend = 0.09  ptrend = 0.04  pint-trend > 0.4
Weighted Duration (Years) – Eq. (3.6)
  None (0) 217 (33.3) 268 (41.5)  -----  124 (28.5) 182 (37.1)  -----
  Short (0.1–0.4) 135 (20.7) 125 (19.4)  1.35 0.98 1.85  99 (22.8) 104 (21.2)  1.29 0.89 1.86
  Moderate (0.5–1.7) 134 (20.6) 125 (19.4)  1.26 0.91 1.72  98 (22.5) 104 (21.2)  1.28 0.88 1.87
  Long (1.8–55.1) 165 (25.3) 127 (19.7)  1.30 0.95 1.79  114 (26.2) 101 (20.6)  1.50 1.04 2.18
  ptrend = 0.09  ptrend = 0.04  pint-trend > 0.4
‡ Adjusted for age, centre, education, ethnicity.
All ptrend values are calculated by treating ordinal categories as continuous values (ptrend values are calculated by treating ordinal categories as continuous values for the exposed categories only)
ø Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher. Jobs that have a probability of low become grouped with the unexposed jobs using this approach.
† Maximum level classification, regardless of duration, is the maximum exposure level to which the participant was exposed across all occupations.
∆ Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher. Jobs that have a probability of low become grouped with the unexposed jobs using this approach.
* Analysis for exposure at low level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = 0.1–2.9% in at least one job
¶ Analysis for exposure at medium level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = 3.0–8.9% in at least one job
Analysis for exposure at high level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ ≥ 9% in at least one job
◊ To ensure the referent group is truly unexposed, a nuisance variable was created for the low/medium-exposed group (value = 1 if the highest duration was at low or medium level exposure, else 0)

Supplementary Table A12: Exposure assessment to polycyclic aromatic hydrocarbons and breast cancer risk stratified by European and Asian ethnicities‡
Columns: JEM Metric; Ethnicity: European: Cases (%), Controls (%), OR, 95% CI; Ethnicity: Asian: Cases (%), Controls (%), OR, 95% CI; p-interaction
DOM Cumulative Exposure – Eq. (3.1)
  None 497 (72.4) 676 (75.7)  -----  205 (77.7) 96 (73.8)  -----
  Low (0.1–1.8) 58 (8.5) 78 (8.7)  1.06 0.74 1.53  11 (4.2) 8 (6.2)  0.76 0.29 2.01
  Medium (1.9–6.8) 70 (10.2) 65 (7.3)  1.51 1.05 2.18  15 (5.7) 12 (9.2)  0.57 0.26 1.29
  High (6.9–90.0) 61 (8.9) 74 (8.3)  1.06 0.73 1.52  33 (12.5) 14 (10.8)  1.06 0.53 2.12
  ptrend > 0.1  ptrend > 0.6  pint-trend > 0.2
NCIø Cumulative Exposure – Eq. (3.4)
  None 526 (76.7) 712 (79.7)  -----  217 (82.2) 103 (79.2)  -----
  Low (0.1–1.8) 45 (6.6) 65 (7.3)  1.00 0.67 1.50  7 (2.7) 7 (5.4)  0.57 0.19 1.72
  Medium (1.9–7.0) 68 (9.9) 56 (6.3)  1.67 1.14 2.44  14 (5.3) 10 (7.7)  0.65 0.27 1.54
  High (7.1–79.0) 47 (6.9) 60 (6.7)  0.99 0.66 1.49  26 (9.8) 10 (7.7)  1.17 0.53 2.57
  ptrend > 0.1  ptrend > 0.8  pint-trend > 0.3
PPM Ever-Never: Any level
  Never 209 (30.5) 363 (40.6)  -----  81 (30.7) 40 (30.8)  -----
  Ever 477 (69.5) 530 (59.4)  1.46 1.17 1.81  183 (69.3) 90 (69.2)  0.89 0.54 1.45
  p = 0.07
Ever-Never: At maximum level†
  Never 209 (30.5) 363 (40.6)  -----  81 (30.7) 40 (30.8)  -----
  Maximum level at low* 59 (08.6) 86 (09.6)  1.08 0.74 1.58  19 (07.2) 12 (09.2)  0.69 0.30 1.59
  Maximum level at medium¶ 119 (17.3) 146 (16.3)  1.32 0.98 1.79  37 (14.0) 16 (12.3)  1.03 0.50 2.13
  Maximum level at high 299 (43.6) 298 (33.4)  1.64 1.29 2.09  127 (48.1) 62 (47.7)  0.89 0.53 1.50
  ptrend < 0.01  ptrend > 0.7  p < 0.05
Duration (years) of exposure at any level
  None (0) 209 (30.5) 363 (40.6)  -----  81 (30.7) 40 (30.8)  -----
  Short (0.1–4.2) 168 (24.5) 185 (20.7)  1.58 1.20 2.08  37 (14.0) 24 (18.5)  0.79 0.41 1.53
  Moderate (4.3–13.0) 166 (24.2) 176 (19.7)  1.49 1.13 1.97  58 (22.0) 31 (23.8)  0.81 0.44 1.49
  Long (13.1–82.2) 143 (20.8) 169 (19.0)  1.27 0.95 1.70  88 (33.3) 35 (26.9)  1.04 0.58 1.88
  ptrend = 0.05  ptrend > 0.9  p > 0.4
Duration (years) of exposure at medium¶ or high levels
  None (0) 209 (30.5) 363 (40.6)  -----  81 (30.7) 40 (30.8)  -----
  Ever: Maximum at low level∆ 59 (08.6) 86 (09.7)  1.08 0.74 1.57  19 (07.2) 12 (09.2)  0.71 0.31 1.64
  Short (0.1–2.7) 143 (20.8) 149 (16.7)  1.67 1.25 2.23  37 (14.0) 26 (20.0)  0.71 0.37 1.37
  Moderate (2.8–9.0) 137 (20.0) 161 (18.0)  1.39 1.03 1.86  36 (13.6) 21 (16.2)  0.79 0.40 1.57
  Long (9.1–80.8) 138 (20.1) 134 (15.0)  1.56 1.15 2.12  91 (34.5) 31 (23.8)  1.22 0.67 2.23
  ptrend < 0.01  ptrend > 0.5  p > 0.5
Duration (years) of exposure at the high level
  None (0) 209 (30.5) 363 (40.6)  -----  81 (30.7) 40 (30.8)  -----
  Ever: Maximum at low* or medium¶ levels◊ 178 (26.0) 232 (26.1)  1.23 0.94 1.60  56 (21.2) 28 (21.5)  0.91 0.49 1.67
  Short (0.1–2.3) 114 (16.6) 103 (11.5)  1.90 1.38 2.63  23 (08.7) 20 (15.4)  0.59 0.28 1.22
  Moderate (2.4–7.4) 86 (12.5) 103 (11.5)  1.40 0.99 1.97  29 (11.0) 14 (10.8)  0.98 0.45 2.11
  Long (7.5–74.1) 99 (14.4) 92 (10.3)  1.61 1.14 2.26  75 (28.4) 28 (21.5)  1.09 0.58 2.04
  ptrend < 0.01  ptrend > 0.6  pint-trend > 0.3
Average Probability – Eq. (3.5)
  None (0) 209 (30.5) 363 (40.6)  -----  81 (30.7) 40 (30.8)  -----
  Low (0.01–0.02) 161 (23.5) 193 (21.6)  1.36 1.04 1.79  36 (13.6) 23 (17.7)  0.73 0.37 1.43
  Medium (0.03–0.07) 167 (24.3) 181 (20.3)  1.51 1.14 2.00  55 (20.8) 26 (20.0)  0.99 0.54 1.85
  High (0.08–0.88) 149 (21.7) 156 (17.5)  1.51 1.12 2.02  92 (34.8) 41 (31.5)  0.91 0.51 1.62
  ptrend < 0.01  ptrend > 0.9  pint-trend > 0.1
Weighted Duration (Years) – Eq. (3.6)
  None (0) 209 (30.5) 363 (40.6)  -----  81 (30.7) 40 (30.8)  -----
  Short (0.1–0.4) 162 (23.6) 187 (20.9)  1.43 1.08 1.88  47 (17.8) 25 (19.2)  0.91 0.48 1.73
  Moderate (0.5–1.7) 149 (21.7) 180 (20.2)  1.36 1.02 1.81  52 (19.7) 28 (21.5)  0.83 0.45 1.54
  Long (1.8–55.1) 166 (24.2) 163 (18.3)  1.60 1.20 2.14  84 (31.8) 37 (28.5)  0.92 0.51 1.65
  ptrend < 0.01  ptrend > 0.7  pint-trend > 0.1
‡ Adjusted for age, centre, education, smoking (pack-years) – centre is excluded for Asian strata. All ptrend values are calculated by treating ordinal categories as continuous values (ptrend values are calculated by treating ordinal categories as continuous values for the exposed categories only)
ø Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher. Jobs that have a probability of low become grouped with the unexposed jobs using this approach.
† Maximum level classification, regardless of duration, is the maximum exposure level to which the participant was exposed across all occupations.
∆ Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher. Jobs that have a probability of low become grouped with the unexposed jobs using this approach.
* Analysis for exposure at low level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = 0.1–2.9% in at least one job
¶ Analysis for exposure at medium level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = 3.0–8.9% in at least one job
Analysis for exposure at high level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ ≥ 9% in at least one job
◊ To ensure the referent group is truly unexposed, a nuisance variable was created for the low/medium-exposed group (value = 1 if the highest duration was at low or medium level exposure, else 0)

Supplementary Table A13: Exposure assessment to polycyclic aromatic hydrocarbons and breast cancer risk stratified by socioeconomic status‡
Columns: JEM Metric; SES (College or less): Cases (%), Controls (%), OR, 95% CI; SES (University or higher): Cases (%), Controls (%), OR, 95% CI; p-interaction
DOM Cumulative Exposure – Eq. (3.1)
  None 503 (72.5) 472 (75.5)  -----  308 (77.8) 394 (76.4)  -----
  Low (0.1–1.8) 54 (07.8) 42 (06.7)  1.32 0.86 2.05  25 (06.3) 49 (09.5)  0.73 0.43 1.23
  Medium (1.9–6.8) 61 (08.8) 46 (07.4)  1.38 0.91 2.09  34 (08.6) 45 (08.7)  1.05 0.65 1.72
  High (6.9–90.0) 76 (10.9) 65 (10.4)  1.02 0.70 1.47  29 (07.3) 28 (05.4)  1.46 0.83 2.56
  ptrend > 0.4  ptrend > 0.3  pint-trend > 0.8
NCIø Cumulative Exposure – Eq. (3.4)
  None 532 (76.7) 499 (79.8)  -----  325 (82.1) 415 (80.4)  -----
  Low (0.1–1.8) 44 (06.3) 34 (05.4)  1.36 0.84 2.20  16 (04.0) 43 (08.3)  0.52 0.29 0.96
  Medium (1.9–7.0) 59 (08.5) 39 (06.3)  1.51 0.98 2.34  31 (07.8) 36 (07.0)  1.25 0.75 2.11
  High (7.1–79.0) 59 (08.5) 53 (08.5)  0.95 0.64 1.42  24 (06.1) 22 (04.3)  1.55 0.84 2.88
  ptrend > 0.4  ptrend > 0.2  pint-trend > 0.6
PPM Ever-Never: Any level
  Never 168 (24.2) 201 (32.2)  -----  174 (43.9) 253 (49.0)  -----
  Ever 526 (75.8) 424 (67.8)  1.41 1.10 1.81  222 (56.1) 263 (51.0)  1.26 0.95 1.66
  p > 0.5
Ever-Never: At maximum level†
  Never 168 (24.2) 201 (32.2)  -----  174 (43.9) 253 (49.0)  -----
  Maximum level at low* 62 (08.9) 67 (10.7)  1.06 0.71 1.60  28 (07.1) 40 (07.8)  1.05 0.62 1.80
  Maximum level at medium¶ 124 (17.9) 110 (17.6)  1.36 0.97 1.91  51 (12.9) 68 (13.2)  1.13 0.74 1.73
  Maximum level at high 340 (49.0) 247 (39.5)  1.54 1.17 2.02  143 (36.1) 155 (30.0)  1.37 1.01 1.87
  ptrend < 0.01  ptrend = 0.05  pint-trend > 0.5
Duration (years) of exposure at any level
  None (0) 168 (24.2) 201 (32.2)  -----  174 (43.9) 253 (49.0)  -----
  Short (0.1–4.2) 140 (20.2) 106 (16.9)  1.64 1.18 2.30  95 (24.0) 123 (23.9)  1.20 0.85 1.69
  Moderate (4.3–13.0) 178 (25.6) 155 (24.8)  1.31 0.97 1.79  77 (19.5) 75 (14.5)  1.56 1.06 2.31
  Long (13.1–82.2) 208 (30.0) 163 (26.1)  1.36 1.01 1.84  50 (12.6) 65 (12.6)  1.04 0.67 1.60
  ptrend = 0.10  ptrend > 0.2  pint-trend > 0.9
Duration (years) of exposure at medium¶ or high levels
  None (0) 168 (24.2) 201 (32.2)  -----  174 (43.9) 253 (49.0)  -----
  Ever: Maximum at low level∆ 62 (08.9) 67 (10.7)  1.07 0.71 1.61  28 (07.1) 40 (07.8)  1.06 0.62 1.80
  Short (0.1–2.7) 124 (17.9) 92 (14.7)  1.62 1.14 2.29  79 (19.9) 102 (19.8)  1.16 0.81 1.68
  Moderate (2.8–9.0) 143 (20.6) 123 (19.7)  1.38 0.99 1.91  60 (15.2) 73 (14.1)  1.31 0.87 1.97
  Long (9.1–80.8) 197 (28.4) 142 (22.7)  1.49 1.09 2.02  55 (13.9) 48 (09.3)  1.53 0.97 2.41
  ptrend = 0.02  ptrend = 0.04  pint-trend > 0.6
Duration (years) of exposure at the high level
  None (0) 168 (24.2) 201 (32.2)  -----  174 (43.9) 253 (49.0)  -----
  Ever: Maximum at low* or medium¶ levels◊ 186 (26.8) 177 (28.3)  1.25 0.92 1.68  79 (19.9) 108 (20.9)  1.10 0.77 1.58
  Short (0.1–2.3) 102 (14.7) 66 (10.6)  1.90 1.30 2.79  54 (13.6) 68 (13.2)  1.22 0.80 1.87
  Moderate (2.4–7.4) 91 (13.1) 80 (12.8)  1.33 0.91 1.94  45 (11.4) 54 (10.5)  1.28 0.81 2.02
  Long (7.5–74.1) 147 (21.2) 101 (16.1)  1.46 1.04 2.06  44 (11.1) 33 (06.4)  1.78 1.07 2.98
  ptrend = 0.04  ptrend = 0.02  pint-trend > 0.3
Average Probability – Eq. (3.5)
  None (0) 168 (24.2) 201 (32.2)  -----  174 (43.9) 253 (49.0)  -----
  Low (0.01–0.02) 143 (20.6) 134 (21.4)  1.28 0.93 1.76  75 (18.9) 95 (18.4)  1.23 0.85 1.79
  Medium (0.03–0.07) 177 (25.5) 130 (20.8)  1.63 1.19 2.23  77 (19.5) 99 (19.2)  1.17 0.81 1.69
  High (0.08–0.88) 206 (29.7) 160 (25.6)  1.36 1.01 1.84  70 (17.7) 69 (13.4)  1.42 0.94 2.13
  ptrend = 0.02  ptrend = 0.10  pint-trend > 0.9
Weighted Duration (Years) – Eq. (3.6)
  None (0) 168 (24.2) 201 (32.2)  -----  174 (43.9) 253 (49.0)  -----
  Short (0.1–0.4) 157 (22.6) 131 (20.9)  1.41 1.02 1.93  77 (19.5) 98 (19.0)  1.21 0.84 1.75
  Moderate (0.5–1.7) 158 (22.8) 138 (22.1)  1.32 0.96 1.81  75 (18.9) 91 (17.6)  1.26 0.86 1.83
  Long (1.8–55.1) 211 (30.4) 155 (24.8)  1.50 1.11 2.02  70 (17.7) 74 (14.4)  1.32 0.88 1.97
  ptrend = 0.02  ptrend > 0.1  pint-trend > 0.9
‡ Adjusted for age, centre, education, smoking (pack-years) – centre is excluded for Asian strata. All ptrend values are calculated by treating ordinal categories as continuous values (ptrend values are calculated by treating ordinal categories as continuous values for the exposed categories only)
ø Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher. Jobs that have a probability of low become grouped with the unexposed jobs using this approach.
† Maximum level classification, regardless of duration, is the maximum exposure level to which the participant was exposed across all occupations.
* Analysis for exposure at low level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = (0.1 – 2.9%) in at least one job
¶ Analysis for exposure at medium level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = (3.0 – 8.9%) in at least one job
Analysis for exposure at high level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ ≥ 9% in at least one job
◊ To ensure the referent group is truly unexposed, a nuisance variable was created for the low/medium-exposed group where value = 1, if highest duration at low or medium level exposure, else 0

Supplementary Table A14: Exposure assessment to polycyclic aromatic hydrocarbons and breast cancer risk stratified by body mass index‡

JEM Metric | BMI (Normal or less: < 25): Cases (%) | Controls (%) | OR (95% CI) | BMI (Overweight or above: ≥ 25): Cases (%) | Controls (%) | OR (95% CI)
DOM: Cumulative Exposure – Eq. (3.1)
  None | 451 (75.7) | 505 (74.7) | ----- | 356 (73.1) | 356 (77.9) | -----
  Low (0.1–1.8) | 38 (06.4) | 55 (08.1) | 0.90 (0.57–1.42) | 40 (08.2) | 34 (07.5) | 1.30 (0.79–2.14)
  Medium (1.9–6.8) | 50 (08.4) | 63 (09.3) | 0.98 (0.65–1.48) | 45 (09.2) | 28 (06.1) | 1.73 (1.03–2.90)
  High (6.9–90.0) | 57 (09.5) | 53 (07.9) | 1.09 (0.72–1.65) | 46 (09.5) | 39 (08.5) | 1.10 (0.68–1.76)
  Trend: ptrend > 0.8 (BMI < 25); ptrend > 0.1 (BMI ≥ 25); pint-trend > 0.4
NCI ø: Cumulative Exposure – Eq. (3.4)
  None | 475 (79.7) | 537 (79.4) | ----- | 378 (77.6) | 371 (81.2) | -----
  Low (0.1–1.8) | 31 (05.2) | 52 (07.7) | 0.77 (0.48–1.25) | 28 (05.7) | 24 (05.3) | 1.25 (0.70–2.25)
  Medium (1.9–7.0) | 40 (06.7) | 47 (07.0) | 1.05 (0.66–1.66) | 50 (10.3) | 28 (06.1) | 1.90 (1.15–3.15)
  High (7.1–79.0) | 50 (08.4) | 40 (05.9) | 1.25 (0.79–1.97) | 31 (06.4) | 34 (07.4) | 0.83 (0.49–1.41)
  Trend: ptrend > 0.4 (BMI < 25); ptrend > 0.3 (BMI ≥ 25); pint-trend > 0.8
PPM: Ever-Never: Any level
  Never | 188 (31.5) | 280 (41.4) | ----- | 151 (31.0) | 169 (37.0) | -----
  Ever | 408 (68.5) | 396 (58.6) | 1.38 (1.07–1.77) | 336 (69.0) | 288 (63.0) | 1.22 (0.92–1.62)
  p-interaction: p > 0.5
PPM: Ever-Never: At maximum level†
  Never | 188 (31.5) | 280 (41.4) | ----- | 151 (31.0) | 169 (37.0) | -----
  Maximum level at low* | 46 (07.7) | 62 (09.2) | 0.99 (0.63–1.55) | 43 (08.8) | 45 (09.8) | 1.01 (0.62–1.64)
  Maximum level at medium¶ | 101 (17.0) | 97 (14.3) | 1.46 (1.02–2.08) | 74 (15.2) | 81 (17.7) | 1.03 (0.69–1.54)
  Maximum level at high | 261 (43.8) | 237 (35.1) | 1.45 (1.10–1.91) | 219 (45.0) | 162 (35.4) | 1.38 (1.01–1.90)
  Trend: ptrend < 0.01 (BMI < 25); ptrend = 0.05 (BMI ≥ 25); pint-trend > 0.6
PPM: Duration (years) of exposure at any level
  None (0) | 188 (31.5) | 280 (41.4) | ----- | 151 (31.0) | 169 (37.0) | -----
  Short (0.1–4.2) | 123 (20.6) | 141 (20.9) | 1.33 (0.97–1.84) | 112 (23.0) | 86 (18.8) | 1.54 (1.06–2.24)
  Moderate (4.3–13.0) | 135 (22.7) | 129 (19.1) | 1.42 (1.03–1.97) | 118 (24.2) | 101 (22.1) | 1.18 (0.83–1.70)
  Long (13.1–82.2) | 150 (25.2) | 126 (18.6) | 1.39 (1.01–1.93) | 106 (21.8) | 101 (22.1) | 0.98 (0.68–1.43)
  Trend: ptrend = 0.03 (BMI < 25); ptrend > 0.9 (BMI ≥ 25); pint-trend > 0.1
PPM: Duration (years) of exposure at medium¶ or high levels
  None (0) | 188 (31.5) | 280 (41.4) | ----- | 151 (31.0) | 169 (37.0) | -----
  Ever: Maximum at low level∆ | 46 (07.8) | 62 (09.2) | 0.99 (0.64–1.55) | 43 (08.8) | 45 (09.8) | 1.01 (0.62–1.64)
  Short (0.1–2.7) | 114 (19.1) | 114 (16.9) | 1.49 (1.06–2.08) | 88 (18.1) | 78 (17.1) | 1.27 (0.86–1.87)
  Moderate (2.8–9.0) | 105 (17.6) | 116 (17.1) | 1.26 (0.89–1.78) | 97 (19.9) | 79 (17.3) | 1.36 (0.93–2.00)
  Long (9.1–80.8) | 143 (24.0) | 104 (15.4) | 1.63 (1.16–2.29) | 108 (22.2) | 86 (18.8) | 1.17 (0.80–1.71)
  Trend: ptrend < 0.01 (BMI < 25); ptrend > 0.2 (BMI ≥ 25); pint-trend > 0.2
PPM: Duration (years) of exposure at the high level
  None (0) | 188 (31.5) | 280 (41.4) | ----- | 151 (31.0) | 169 (37.0) | -----
  Ever: Maximum at low* or medium¶ levels◊ | 147 (24.7) | 159 (23.5) | 1.27 (0.94–1.73) | 117 (24.0) | 126 (27.6) | 1.02 (0.72–1.45)
  Short (0.1–2.3) | 78 (13.1) | 78 (11.6) | 1.49 (1.01–2.18) | 76 (15.6) | 53 (11.6) | 1.66 (1.08–2.55)
  Moderate (2.4–7.4) | 76 (12.7) | 86 (12.7) | 1.25 (0.85–1.83) | 59 (12.1) | 48 (10.5) | 1.32 (0.83–2.10)
  Long (7.5–74.1) | 107 (18.0) | 73 (10.8) | 1.64 (1.12–2.40) | 84 (17.3) | 61 (13.3) | 1.20 (0.79–1.83)
  Trend: ptrend = 0.01 (BMI < 25); ptrend > 0.3 (BMI ≥ 25); pint-trend > 0.6
PPM: Average Probability – Eq. (3.5)
  None (0) | 188 (31.5) | 280 (41.4) | ----- | 151 (31.0) | 169 (37.0) | -----
  Low (0.01–0.02) | 112 (18.8) | 126 (18.6) | 1.30 (0.94–1.81) | 104 (21.4) | 102 (22.3) | 1.13 (0.79–1.63)
  Medium (0.03–0.07) | 142 (23.8) | 135 (20.0) | 1.41 (1.03–1.94) | 113 (23.2) | 92 (20.1) | 1.39 (0.96–2.00)
  High (0.08–0.88) | 154 (25.9) | 135 (20.0) | 1.42 (1.03–1.96) | 119 (24.4) | 94 (20.6) | 1.16 (0.79–1.68)
  Trend: ptrend = 0.02 (BMI < 25); ptrend > 0.2 (BMI ≥ 25); pint-trend > 0.5
PPM: Weighted Duration (Years) – Eq. (3.6)
  None (0) | 188 (31.5) | 280 (41.4) | ----- | 151 (31.0) | 169 (37.0) | -----
  Short (0.1–0.4) | 125 (21.0) | 124 (18.3) | 1.42 (1.02–1.96) | 107 (22.0) | 104 (22.7) | 1.14 (0.79–1.63)
  Moderate (0.5–1.7) | 132 (22.2) | 143 (21.2) | 1.25 (0.91–1.72) | 100 (20.5) | 84 (18.4) | 1.29 (0.89–1.89)
  Long (1.8–55.1) | 151 (25.3) | 129 (19.1) | 1.49 (1.08–2.06) | 129 (26.5) | 100 (21.9) | 1.25 (0.87–1.80)
  Trend: ptrend = 0.02 (BMI < 25); ptrend > 0.1 (BMI ≥ 25); pint-trend > 0.6

‡ Adjusted for age, centre, education, smoking (pack-years) – centre is excluded for Asian strata. All ptrend values are calculated by treating ordinal categories as continuous values (ptrend values are calculated by treating ordinal categories as continuous values for the exposed categories only)
ø Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher. Jobs that have a probability of low become grouped with the unexposed jobs using this approach.
† Maximum level classification, regardless of duration, is the maximum exposure level to which the participant was exposed across all occupations.
* Analysis for exposure at low level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = (0.1 – 2.9%) in at least one job
¶ Analysis for exposure at medium level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = (3.0 – 8.9%) in at least one job
Analysis for exposure at high level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ ≥ 9% in at least one job
◊ To ensure the referent group is truly unexposed, a nuisance variable was created for the low/medium-exposed group where value = 1, if highest duration at low or medium level exposure, else 0

Supplementary Table A15: Exposure assessment to polycyclic aromatic hydrocarbons and breast cancer risk stratified by (first degree) family history‡

JEM Metric | No Family History: Cases (%) | Controls (%) | OR (95% CI) | (First degree) Family History: Cases (%) | Controls (%) | OR (95% CI)
DOM: Cumulative Exposure – Eq. (3.1)
  None | 657 (75.2) | 737 (75.3) | ----- | 155 (71.4) | 129 (79.2) | -----
  Low (0.1–1.8) | 57 (06.5) | 77 (07.9) | 0.98 (0.67–1.42) | 22 (10.2) | 14 (08.6) | 1.28 (0.61–2.73)
  Medium (1.9–6.8) | 70 (08.0) | 81 (08.3) | 1.02 (0.72–1.44) | 25 (11.5) | 10 (06.1) | 2.78 (1.19–6.50)
  High (6.9–90.0) | 90 (10.3) | 83 (08.5) | 1.10 (0.79–1.54) | 15 (06.9) | 10 (06.1) | 1.04 (0.42–2.55)
  Trend: ptrend > 0.6 (no family history); ptrend > 0.1 (family history); pint-trend > 0.1
NCI ø: Cumulative Exposure – Eq. (3.4)
  None | 693 (79.3) | 778 (79.6) | ----- | 165 (76.1) | 136 (83.4) | -----
  Low (0.1–1.8) | 42 (04.8) | 66 (06.7) | 0.84 (0.55–1.27) | 18 (08.3) | 11 (06.8) | 1.30 (0.56–3.03)
  Medium (1.9–7.0) | 68 (07.8) | 64 (06.5) | 1.30 (0.90–1.88) | 22 (10.1) | 11 (06.8) | 2.02 (0.89–4.60)
  High (7.1–79.0) | 71 (08.1) | 70 (07.2) | 1.05 (0.73–1.51) | 12 (05.5) | 5 (03.0) | 1.35 (0.42–4.29)
  Trend: ptrend > 0.4 (no family history); ptrend > 0.1 (family history); pint-trend > 0.2
PPM: Ever-Never: Any level
  Never | 280 (32.0) | 381 (39.0) | ----- | 62 (28.6) | 73 (44.8) | -----
  Ever | 594 (68.0) | 597 (61.0) | 1.23 (1.01–1.51) | 155 (71.4) | 90 (55.2) | 1.84 (1.16–2.92)
  p-interaction: p = 0.08
PPM: Ever-Never: At maximum level†
  Never | 280 (32.0) | 381 (39.0) | ----- | 62 (28.6) | 73 (44.8) | -----
  Maximum level at low* | 76 (08.7) | 92 (09.4) | 1.04 (0.73–1.48) | 14 (06.4) | 15 (09.2) | 0.95 (0.40–2.24)
  Maximum level at medium¶ | 131 (15.0) | 150 (15.3) | 1.16 (0.87–1.56) | 44 (20.3) | 28 (17.2) | 1.68 (0.90–3.13)
  Maximum level at high | 387 (44.3) | 355 (36.3) | 1.31 (1.04–1.64) | 97 (44.7) | 47 (28.8) | 2.27 (1.34–3.86)
  Trend: ptrend = 0.02 (no family history); ptrend < 0.01 (family history); pint-trend = 0.03
PPM: Duration (years) of exposure at any level
  None (0) | 280 (32.0) | 381 (39.0) | ----- | 62 (28.6) | 73 (44.8) | -----
  Short (0.1–4.2) | 178 (20.4) | 194 (19.8) | 1.31 (1.01–1.71) | 57 (26.2) | 35 (21.5) | 1.95 (1.09–3.48)
  Moderate (4.3–13.0) | 202 (23.1) | 202 (20.7) | 1.22 (0.94–1.59) | 54 (24.9) | 28 (17.2) | 2.16 (1.17–3.97)
  Long (13.1–82.2) | 214 (24.5) | 201 (20.5) | 1.15 (0.88–1.50) | 44 (20.3) | 27 (16.5) | 1.38 (0.72–2.65)
  Trend: ptrend > 0.2 (no family history); ptrend > 0.1 (family history); pint-trend > 0.2
PPM: Duration (years) of exposure at medium¶ or high levels
  None (0) | 280 (32.0) | 381 (39.0) | ----- | 62 (28.6) | 73 (44.8) | -----
  Ever: Maximum at low level∆ | 76 (08.7) | 92 (09.4) | 1.04 (0.73–1.47) | 14 (06.4) | 15 (09.2) | 0.97 (0.41–2.28)
  Short (0.1–2.7) | 159 (18.2) | 163 (16.6) | 1.36 (1.03–1.80) | 44 (20.3) | 31 (19.0) | 1.64 (0.90–3.00)
  Moderate (2.8–9.0) | 152 (17.4) | 169 (17.3) | 1.14 (0.86–1.51) | 51 (23.5) | 27 (16.6) | 2.24 (1.20–4.18)
  Long (9.1–80.8) | 207 (23.7) | 173 (17.7) | 1.29 (0.98–1.70) | 46 (21.2) | 17 (10.4) | 2.53 (1.24–5.16)
  Trend: ptrend = 0.09 (no family history); ptrend < 0.01 (family history); pint-trend = 0.01
PPM: Duration (years) of exposure at the high level
  None (0) | 280 (32.0) | 381 (39.0) | ----- | 62 (28.6) | 73 (44.8) | -----
  Ever: Maximum at low* or medium¶ levels◊ | 207 (23.7) | 242 (24.7) | 1.11 (0.86–1.43) | 58 (26.7) | 43 (26.4) | 1.43 (0.82–2.49)
  Short (0.1–2.3) | 121 (13.8) | 114 (11.6) | 1.50 (1.10–2.05) | 35 (16.1) | 20 (12.3) | 1.94 (0.98–3.82)
  Moderate (2.4–7.4) | 109 (12.5) | 118 (12.1) | 1.15 (0.83–1.58) | 27 (12.5) | 16 (09.8) | 2.31 (1.07–5.01)
  Long (7.5–74.1) | 157 (18.0) | 123 (12.6) | 1.29 (0.95–1.75) | 35 (16.1) | 11 (06.7) | 2.79 (1.25–6.24)
  Trend: ptrend > 0.1 (no family history); ptrend < 0.01 (family history); pint-trend = 0.03
PPM: Average Probability – Eq. (3.5)
  None (0) | 280 (32.0) | 381 (39.0) | ----- | 62 (28.6) | 73 (44.8) | -----
  Low (0.01–0.02) | 165 (18.9) | 191 (19.5) | 1.18 (0.90–1.55) | 53 (24.4) | 38 (23.3) | 1.57 (0.89–2.76)
  Medium (0.03–0.07) | 212 (24.3) | 202 (20.6) | 1.36 (1.05–1.76) | 43 (19.8) | 27 (16.6) | 1.71 (0.91–3.20)
  High (0.08–0.88) | 217 (24.8) | 204 (20.9) | 1.14 (0.87–1.49) | 59 (27.2) | 25 (15.3) | 2.55 (1.34–4.84)
  Trend: ptrend > 0.1 (no family history); ptrend < 0.01 (family history); pint-trend = 0.03
PPM: Weighted Duration (Years) – Eq. (3.6)
  None (0) | 280 (32.0) | 381 (39.0) | ----- | 62 (28.6) | 73 (44.8) | -----
  Short (0.1–0.4) | 181 (20.7) | 193 (19.7) | 1.25 (0.96–1.63) | 53 (24.4) | 36 (22.1) | 1.66 (0.94–2.94)
  Moderate (0.5–1.7) | 187 (21.4) | 201 (20.5) | 1.18 (0.90–1.53) | 46 (21.2) | 28 (17.2) | 1.75 (0.95–3.25)
  Long (1.8–55.1) | 226 (25.9) | 203 (20.8) | 1.25 (0.97–1.63) | 56 (25.8) | 26 (15.9) | 2.26 (1.19–4.28)
  Trend: ptrend > 0.1 (no family history); ptrend < 0.01 (family history); pint-trend = 0.06

‡ Adjusted for age, centre, education, smoking (pack-years) – centre is excluded for Asian strata. All ptrend values are calculated by treating ordinal categories as continuous values (ptrend values are calculated by treating ordinal categories as continuous values for the exposed categories only)
ø Exposed for the NCI-JEM is defined as non-zero intensity score conditional on the probability score being medium or higher. Jobs that have a probability of low become grouped with the unexposed jobs using this approach.
† Maximum level classification, regardless of duration, is the maximum exposure level to which the participant was exposed across all occupations.
* Analysis for exposure at low level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = (0.1 – 2.9%) in at least one job
¶ Analysis for exposure at medium level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ = (3.0 – 8.9%) in at least one job
Analysis for exposure at high level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles) is θ ≥ 9% in at least one job
◊ To ensure the referent group is truly unexposed, a nuisance variable was created for the low/medium-exposed group where value = 1, if highest duration at low or medium level exposure, else 0

Supplementary Table A16: Top 10 most common at risk industries (NAICS 3-digit code) based on Yes-No JEM-specific exposure classification using the 3,514 unique occupations observed across all participants∆.
DOM-JEM (Marginal % of Exposed Industries) | NCI-JEM (Marginal % of Exposed Industries) | PPM-JEM (Marginal % of Exposed Industries)
722: Food Service & Drinking Places (61.8) | 311: Food Manufacturing (6.4) | 722: Food Service & Drinking Places (16.4)
311: Food Manufacturing (4.5) | 339: Miscellaneous Manufacturing (4.5) | 311: Food Manufacturing (8.0)
339: Miscellaneous Manufacturing (3.4) | 327: Non-Metallic Mineral Product Manufacturing (4.0) | 315: Clothing Manufacturing (7.5)
445: Food & Beverage Stores (3.4) | 911: Federal Government Public Administration (4.0) | 334: Computer & Electronic Product Manufacturing (6.2)
211: Oil and Gas Extraction (2.2) | 315: Clothing Manufacturing (3.5) | 325: Chemical Manufacturing (5.4)
332: Fabricated Metal Product Manufacturing (2.2) | 112: Animal Production (3.0) | 322: Paper Manufacturing (4.7)
335: Electrical Equip & Comp. Manufacturing (2.2) | 325: Chemical Manufacturing (3.0) | 524: Insurance Carriers and Related Activities (4.5)
622: Hospitals (2.2) | 326: Plastics & Rubber Products Manufacturing (3.0) | 335: Electrical Equip & Comp. Manufacturing (3.5)
721: Accommodation Services (2.2) | 611: Educational Services (3.0) | 339: Miscellaneous Manufacturing (3.4)
813: Religious, Civic, & Prof. Organizations (2.2) | 334: Computer & Electronic Product Manufacturing (2.5) | 236: Construction of Buildings (2.8)

Exposure at high level (estimated probability of exposure above 0.2 mg·m⁻³ of coal tar pitch volatiles is θ ≥ 9% in at least one job)
∆ To obtain these marginal percentages we used the 3,514 unique occupations, which were unique identifiers that combined industry (NAICS) and occupation (SOC) codes. Only unique occupations with non-zero scores in their respective JEM were retained, and the industry code of each retained at-risk occupation was then truncated to the NAICS 3-digit code.
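The truncate-and-tally procedure described in this note can be sketched in Python. This is only an illustration under stated assumptions, not the code used in the thesis: the function name, the input layout (a mapping from combined NAICS-SOC code to a JEM score), the score values, and the reading of "marginal percentage" as each 3-digit industry's share of the exposed unique occupations are all assumptions. The codes 72211-353031 and 72211-352012 are the restaurant examples given in this note; the other two codes are hypothetical.

```python
from collections import Counter


def marginal_industry_percentages(jem_scores):
    """Share of exposed unique occupations falling in each 3-digit NAICS industry.

    jem_scores: dict mapping a combined 'NAICS-SOC' code to a JEM score
    (hypothetical layout; a score of 0 means unexposed under that JEM).
    """
    # Keep only unique occupations with a non-zero score under this JEM.
    exposed = [code for code, score in jem_scores.items() if score > 0]
    if not exposed:
        return {}
    # Truncate the industry part (before the hyphen) to its first three digits.
    industries = Counter(code.split("-")[0][:3] for code in exposed)
    total = sum(industries.values())
    # Percentage of exposed unique occupations per truncated industry code.
    return {ind: round(100.0 * n / total, 1) for ind, n in industries.most_common()}


scores = {
    "72211-353031": 0.12,  # Restaurant / Waitress (codes from the note; score assumed)
    "72211-352012": 0.15,  # Restaurant / Cook (codes from the note; score assumed)
    "31181-513011": 0.10,  # hypothetical exposed occupation
    "52411-434051": 0.0,   # hypothetical unexposed occupation, dropped
}
result = marginal_industry_percentages(scores)
print(result)
```

Running the same function once per JEM, with that JEM's scores, would give one column of percentages per JEM under these assumptions.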
For example, 72211-353031 is the combined code for Restaurant (Industry) and Waitress (Occupation), and 72211-352012 is the combined code for Restaurant (Industry) and Cook (Occupation); the PPM-JEM identifies both as exposed, and the truncated industry code for both is 722. Using the truncated codes for exposed industries we calculated their respective marginal percentage within each JEM. Based on the 3-digit code, the DOM, NCI, and PPM-JEM identified 21, 44, and 71 industries at risk of PAH exposure, respectively.

Supplementary Table A17: Breast cancer odds ratios by genotype-menopausal stratum under inheritance-specific models among European women

Gene | SNP | Model | Postmenopausal (Cases: 425, Controls: 509): OR (95% CI) | Premenopausal (Cases: 216, Controls: 294): OR (95% CI) | Interaction p-valueⱡ | padj-value₸
Regulates PAH and xenobiotic metabolism
  AHR | rs3757824 | Dominant | 0.99 (0.69 - 1.42) | 0.81 (0.62 - 1.06) | 0.39 | 0.80
  AHRR | rs349583 | Dominant | 1.04 (0.72 - 1.51) | 1.25 (0.95 - 1.65) | 0.44 | 0.82
  AIP | rs4084113 | Dominant | 0.90 (0.62 - 1.29) | 0.96 (0.73 - 1.26) | 0.77 | 0.93
  ARNT | rs11204735 | Recessive | 1.52 (0.98 - 2.35) | 1.11 (0.82 - 1.51) | 0.27 | 0.72
Production of carcinogenic intermediates
  AKR1A1 | rs2088102 | Recessive | 0.81 (0.52 - 1.28) | 1.34 (0.98 - 1.81) | 0.07 | 0.48
  AKR1C1 | rs6650153 | Recessive | 0.85 (0.14 - 5.20) | 0.54 (0.19 - 1.56) | 0.63 | 0.93
  AKR1C2 | rs11252867 | Additive‡ | 1.07 (0.79 - 1.45) | 1.14 (0.92 - 1.40) | 0.77 | 0.93
  AKR1C3 | rs12387 | Recessive | 1.19 (0.47 - 2.99) | 6.35 (2.15 - 18.8) | 0.02 | 0.26
  AKR1C4 | rs3812617 | Recessive | 1.31 (0.48 - 3.57) | 4.75 (1.56 - 14.5) | 0.09 | 0.50
  DHDH | rs2270939 | Dominant | 1.10 (0.76 - 1.61) | 1.07 (0.81 - 1.41) | 0.86 | 0.97
  EPHX1 | rs2854461 | Dominant | 1.41 (0.99 - 2.02) | 1.14 (0.88 - 1.48) | 0.33 | 0.73
  NAT1 | rs7845127 | Dominant | 1.11 (0.78 - 1.57) | 1.43 (1.10 - 1.85) | 0.23 | 0.72
  NAT2 | rs4646243 | Recessive | 1.71 (0.45 - 6.48) | 6.39 (1.39 - 29.4) | 0.20 | 0.72
  PTGS2 | rs5275 | Additive‡ | 0.84 (0.64 - 1.10) | 0.88 (0.72 - 1.07) | 0.79 | 0.93
CYP450 superfamily
  CYP19A1 | rs10046 | Additive‡ | 1.16 (0.89 - 1.50) | 1.16 (0.97 - 1.40) | 0.98 | 1.00
  CYP1A1 | rs2470893 | Recessive | 1.79 (1.01 - 3.18) | 1.14 (0.75 - 1.74) | 0.24 | 0.72
  CYP1A2 | rs2470890 | Recessive | 1.07 (0.64 - 1.79) | 1.18 (0.80 - 1.75) | 0.76 | 0.93
  CYP1B1 | rs162558 | Recessive | 0.46 (0.18 - 1.18) | 1.88 (1.07 - 3.32) | 0.01 | 0.26
  CYP2C19 | rs12248560 | Dominant | 1.26 (0.88 - 1.80) | 1.40 (1.08 - 1.83) | 0.62 | 0.93
  CYP2E1 | rs2070673 | Additive‡ | 0.73 (0.51 - 1.05) | 0.93 (0.72 - 1.19) | 0.29 | 0.72
Detoxification of reactive intermediates during xenobiotic metabolism
  COMT | rs5993882 | Dominant | 1.16 (0.81 - 1.67) | 1.23 (0.95 - 1.60) | 0.74 | 0.93
  GSTP1 | rs1695 | Recessive | 1.16 (0.66 - 2.03) | 1.19 (0.78 - 1.82) | 0.97 | 1.00
  NFE2L2 | rs1806649 | Dominant | 1.05 (0.74 - 1.51) | 1.14 (0.88 - 1.48) | 0.67 | 0.93
  NQO1 | rs1800566 | Dominant | 0.64 (0.44 - 0.94) | 1.09 (0.83 - 1.42) | 0.03 | 0.26
  PON1 | rs854551 | Additive‡ | 1.11 (0.81 - 1.53) | 1.29 (1.03 - 1.61) | 0.46 | 0.82
Estradiol metabolism
  ESR1 | rs2813543 | Recessive | 0.29 (0.10 - 0.86) | 0.68 (0.35 - 1.30) | 0.19 | 0.72
  ESR2 | rs1271572 | Recessive | 0.81 (0.51 - 1.31) | 0.84 (0.61 - 1.17) | 1.00 | 1.00

ⱡ Adjusted for age and centre
‡ Additive model shows OR for each additional minor allele
₸ Adjusted p-value for the false discovery rate

Supplementary Table A18: Genetic analysis for Asian women using gene-based permutations under inheritance-specific models

Asians (Cases: 250, Controls: 131)
Gene | SNP | MAF | Model | OR (95% CI) | p-valueⱡ | padj-value₸
Regulates PAH and xenobiotic metabolism
  AHR | rs3757824 | 0.28 | Dominant | 0.98 (0.64 - 1.49) | 0.91 | 0.99
  AHRR | rs349583 | 0.51 | Dominant | 0.99 (0.60 - 1.63) | 0.98 | 0.99
  AIP | rs4084113 | 0.17 | Dominant | 0.93 (0.59 - 1.47) | 0.76 | 0.99
  ARNT | rs11204735 | 0.58 | Recessive | 1.06 (0.68 - 1.65) | 0.81 | 0.99
Production of carcinogenic intermediates
  AKR1A1 | rs2088102 | 0.64 | Recessive | 0.99 (0.64 - 1.53) | 0.96 | 0.99
  AKR1C1 | rs6650153∆ | 0.05 | Recessive | ------ | ------ | ------
  AKR1C2 | rs11252867∆ | 0.00 | Additive‡ | ------ | ------ | ------
  AKR1C3 | rs12387 | 0.16 | Recessive | 0.99 (0.18 - 5.49) | 0.99 | 0.99
  AKR1C4 | rs3812617 | 0.14 | Recessive | 0.99 (0.18 - 5.49) | 0.99 | 0.99
  DHDH | rs2270939 | 0.59 | Dominant | 1.13 (0.62 - 2.09) | 0.69 | 0.99
  EPHX1 | rs2854461 | 0.57 | Dominant | 1.42 (0.83 - 2.44) | 0.20 | 0.89
  NAT1 | rs7845127 | 0.36 | Dominant | 1.04 (0.67 - 1.60) | 0.88 | 0.99
  NAT2 | rs4646243 | 0.48 | Recessive | 1.09 (0.65 - 1.84) | 0.74 | 0.99
  PTGS2 | rs5275 | 0.21 | Additive‡ | 1.15 (0.79 - 1.69) | 0.47 | 0.99
CYP450 superfamily
  CYP19A1 | rs10046 | 0.44 | Additive‡ | 0.80 (0.59 - 1.09) | 0.16 | 0.89
  CYP1A1 | rs2470893∆ | 0.00 | Recessive | ------ | ------ | ------
  CYP1A2 | rs2470890 | 0.84 | Recessive | 1.37 (0.87 - 2.17) | 0.18 | 0.89
  CYP1B1 | rs162558 | 0.03 | Recessive | 1.01 (0.99 - 1.03) | 0.30 | 0.99
  CYP2C19 | rs12248560 | 0.01 | Dominant | 1.05 (0.19 - 5.85) | 0.95 | 0.99
  CYP2E1 | rs2070673 | 0.41 | Additive‡ | 0.95 (0.71 - 1.28) | 0.75 | 0.99
Detoxification of reactive intermediates during xenobiotic metabolism
  COMT | rs5993882 | 0.11 | Dominant | 0.71 (0.43 - 1.17) | 0.18 | 0.89
  GSTP1 | rs1695 | 0.19 | Recessive | 0.53 (0.19 - 1.46) | 0.22 | 0.89
  NFE2L2 | rs1806649 | 0.10 | Dominant | 1.60 (0.89 - 2.89) | 0.12 | 0.89
  NQO1 | rs1800566 | 0.48 | Dominant | 1.19 (0.74 - 1.90) | 0.47 | 0.99
  PON1 | rs854551 | 0.07 | Additive‡ | 1.23 (0.65 - 2.31) | 0.53 | 0.99
Estradiol metabolism
  ESR1 | rs2813543 | 0.14 | Recessive | 0.49 (0.10 - 2.47) | 0.39 | 0.99
  ESR2 | rs1271572 | 0.31 | Recessive | 1.14 (0.56 - 2.35) | 0.72 | 0.99

ⱡ Adjusted for age and centre
‡ Additive model shows OR for each additional minor allele
₸ Adjusted p-value for the false discovery rate
∆ Insufficient sample size to calculate OR

Supplementary Table A19: Breast cancer odds ratios by genotype-exposure (duration at high PAH exposure) stratum based on co-dominant model for remaining gene-representative SNPs among European women‡
Exposure: 0 Years   Exposure: 0.1 - 3.6 Years   Exposure: 3.6 - 74.1 Years   Interaction
Gene SNP Geno Case Control OR 95% CI   Case Control OR 95% CI   Case Control OR 95% CI   ptrend (pFDR)
Production of carcinogenic intermediates
EPHX1 rs2854461 CC 149 246 1.00 ----------  58 57 1.97 (1.26 - 3.08)  51 66 1.36
(0.87 - 2.14)       AC 166 216 1.35 (1.01 - 1.81)  58 59 1.98 (1.27 - 3.09)  67 49 2.48 (1.58 - 3.89)     AA 41 65 1.08 (0.69 - 1.70)  19 14 2.85 (1.36 - 5.97)  18 16 2.05 (0.99 - 4.27)   0.37 (0.74) NAT1 rs7845127 GG 136 258 1.00 ----------  67 67 2.13 (1.40 - 3.26)  67 65 2.07 (1.35 - 3.16)       AG 180 218 1.52 (1.14 - 2.04)  59 44 2.85 (1.79 - 4.52)  54 56 1.86 (1.18 - 2.92)     AA 40 51 1.43 (0.89 - 2.28)  9 19 1.13 (0.49 - 2.62)  15 10 2.66 (1.13 - 6.27)   0.18 (0.53) NAT2◊ rs4646243 AA 258 387 NA NA  104 103 NA NA  93 91 NA NA       GA 88 136 NA NA  31 25 NA NA  38 40 NA NA     GG 10 4 NA NA  0 2 NA NA  5 0 NA NA   0.79 (0.89) CYP450 superfamily                  CYP19A1 rs10046 AA 94 165 1.00 ----------  36 42 1.73 (1.01 - 2.96)  30 32 1.72 (0.96 - 3.10)       GA 180 253 1.27 (0.92 - 1.76)  61 67 1.90 (1.21 - 3.01)  77 69 2.08 (1.34 - 3.24)     GG 82 108 1.33 (0.90 - 1.97)  38 21 3.79 (2.04 - 7.04)  29 30 1.83 (1.01 - 3.30)   0.82 (0.89) CYP1A1 rs2470893 GG 181 266 1.00 ----------  60 66 1.56 (1.02 - 2.37)  58 53 1.70 (1.09 - 2.65)       AG 135 210 0.92 (0.69 - 1.23)  54 54 1.67 (1.07 - 2.61)  61 65 1.43 (0.93 - 2.19)     AA 39 51 1.14 (0.71 - 1.81)  20 10 3.67 (1.60 - 8.42)  17 12 2.14 (0.98 - 4.67)   0.76 (0.89) CYP2C19 rs12248560 GG 192 329 1.00 ----------  77 79 1.90 (1.29 - 2.81)  79 87 1.59 (1.08 - 2.33)       AG 140 167 1.39 (1.04 - 1.86)  54 47 2.23 (1.42 - 3.50)  47 36 2.31 (1.41 - 3.79)     AA 24 31 1.21 (0.68 - 2.15)  4 4 2.03 (0.49 - 8.43)  10 8 2.21 (0.84 - 5.81)   0.90 (0.90) Detoxification of reactive intermediates during xenobiotic metabolism             COMT rs5993882 AA 204 333 1.00 ----------  72 70 1.96 (1.31 - 2.93)  73 75 1.67 (1.12 - 2.49)       CA 126 165 1.26 (0.94 - 1.70)  56 50 2.16 (1.39 - 3.36)  56 50 1.93 (1.23 - 3.02)     CC 26 29 1.45 (0.82 - 2.55)  7 10 1.25 (0.46 - 3.41)  7 6 1.98 (0.65 - 6.04)   0.51 (0.88) PON1 rs854551 GG 217 337 1.00 ----------  76 83 1.69 (1.15 - 2.49)  82 88 1.58 (1.09 - 2.29)       AG 121 171 1.15 
(0.85 - 1.54)  46 40 2.03 (1.25 - 3.27)  49 39 2.01 (1.24 - 3.25)     AA 18 19 1.47 (0.75 - 2.89)  13 7 3.47 (1.35 - 8.96)  5 4 1.60 (0.41 - 6.22)   0.79 (0.89) Estradiol metabolism                  ESR1 rs2813543 GG 226 298 1.00 ----------  85 76 1.68 (1.14 - 2.46)  76 81 1.28 (0.87 - 1.89)       AG 120 201 0.81 (0.60 - 1.08)  45 46 1.53 (0.96 - 2.46)  57 40 2.01 (1.26 - 3.21)     AA 10 27 0.52 (0.25 - 1.11)  5 8 1.04 (0.32 - 3.37)  3 10 0.43 (0.11 - 1.60)   0.23 (0.56) ‡ Adjusted for age, centre, education, and smoking (pack-years) ◊ Sample size insufficient to create stable estimates   190 Supplementary Table A20: Breast cancer odds ratios by genotype-exposure (average probability of PAH exposure) stratum based on co-dominant model for gene-representative SNPs among European women‡       Exposure: 0%   Exposure: 0.1 - 2.9%   Exposure: 3.0 - 87.9%   Interaction Gene SNP Geno Case Control OR 95% CI   Case Control OR 95% CI   Case Control OR 95% CI   ptrend (pFDR) Production of carcinogenic intermediates                AKR1C3 rs12387 AA 121 230 1.00 ----------  130 156 1.52 (1.09 - 2.10)  158 155 1.78 (1.29 - 2.47)     GA 62 92 1.30 (0.88 - 1.94)  69 70 1.74 (1.16 - 2.60)  58 67 1.47 (0.96 - 2.25)     GG 6 6 2.02 (0.63 - 6.45)  11 2 10.5 (2.27 - 48.8)  12 6 3.93 (1.42 - 10.9)  0.32 (0.62) AKR1C4 rs3812617 GG 121 241 1.00 ----------  138 156 1.68 (1.22 - 2.32)  163 164 1.82 (1.32 - 2.50)      AG 64 83 1.54 (1.04 - 2.30)  63 72 1.60 (1.06 - 2.40)  55 58 1.66 (1.07 - 2.59)     AA 4 4 1.98 (0.48 - 8.13)  9 1 18.2 (2.25 - 147.5)  10 7 2.98 (1.10 - 8.12)  0.16 (0.48) EPHX1 rs2854461 CC 80 141 1.00 ----------  85 112 1.27 (0.85 - 1.90)  93 115 1.30 (0.87 - 1.95)       AC 89 141 1.17 (0.80 - 1.73)  97 95 1.74 (1.16 - 2.60)  105 87 2.03 (1.35 - 3.05)     AA 20 46 0.77 (0.42 - 1.40)  28 22 2.29 (1.22 - 4.30)  30 27 1.86 (1.02 - 3.38)   0.11 (0.45) NAT1 rs7845127 GG 81 168 1.00 ----------  78 103 1.48 (0.99 - 2.21)  111 119 1.80 (1.23 - 2.64)       AG 89 135 1.35 (0.92 - 
1.97)  107 90 2.32 (1.57 - 3.43)  97 91 1.98 (1.33 - 2.96)     AA 19 25 1.61 (0.83 - 3.11)  25 36 1.37 (0.77 - 2.46)  20 19 1.96 (0.97 - 3.97)   0.36 (0.62) NAT2◊ rs4646243 AA 132 238 NA NA  155 172 NA NA  168 169 NA NA       GA 52 88 NA NA  50 53 NA NA  55 60 NA NA     GG 5 2 NA NA  5 4 NA NA  5 0 NA NA   0.61 (0.81) PTGS2 rs5275 AA 75 149 1.00 ----------  83 79 2.03 (1.34 - 3.10)  111 83 2.55 (1.70 - 3.83)       GA 93 141 1.39 (0.94 - 2.04)  104 121 1.67 (1.13 - 2.46)  97 110 1.63 (1.10 - 2.44)       GG 21 38 1.15 (0.62 - 2.11)   23 29 1.51 (0.81 - 2.81)   20 36 1.02 (0.55 - 1.89)   <0.01 (0.03) CYP450 superfamily                  CYP19A1 rs10046 AA 48 100 1.00 ----------  56 70 1.55 (0.94 - 2.56)  56 69 1.56 (0.94 - 2.58)       GA 94 158 1.28 (0.83 - 1.97)  105 115 1.81 (1.16 - 2.81)  119 114 1.99 (1.28 - 3.09)     GG 47 69 1.36 (0.82 - 2.27)  49 44 2.25 (1.31 - 3.86)  53 46 2.22 (1.30 - 3.80)   0.89 (0.89) CYP1A1 rs2470893 GG 96 170 1.00 ----------  102 113 1.51 (1.04 - 2.19)  101 100 1.62 (1.10 - 2.37)       AG 69 131 0.90 (0.61 - 1.33)  83 87 1.58 (1.06 - 2.36)  98 111 1.40 (0.95 - 2.05)     AA 23 27 1.60 (0.86 - 2.99)  25 28 1.43 (0.78 - 2.62)  28 18 2.64 (1.36 - 5.11)   0.88 (0.89) CYP2C19 rs12248560 GG 109 211 1.00 ----------  112 127 1.60 (1.13 - 2.27)  127 157 1.44 (1.02 - 2.02)       AG 69 99 1.33 (0.90 - 1.96)  84 89 1.74 (1.18 - 2.55)  88 60 2.58 (1.70 - 3.90)     AA 11 18 1.16 (0.52 - 2.59)  14 13 1.90 (0.85 - 4.23)  13 12 1.76 (0.77 - 4.05)   0.45 (0.67) Detoxification of reactive intermediates during xenobiotic metabolism             COMT rs5993882 AA 104 203 1.00 ----------  126 144 1.60 (1.13 - 2.25)  119 129 1.62 (1.13 - 2.32)       CA 69 109 1.23 (0.83 - 1.81)  71 73 1.78 (1.18 - 2.68)  98 83 2.10 (1.43 - 3.09)     CC 16 16 1.76 (0.84 - 3.70)  13 12 2.03 (0.88 - 4.65)  11 17 1.17 (0.53 - 2.62)   0.36 (0.62) PON1 rs854551 GG 122 207 1.00 ----------  124 147 1.35 (0.97 - 1.88)  129 152 1.35 (0.97 - 1.89)       AG 59 109 0.95 (0.64 - 1.41)  74 71 
1.72 (1.15 - 2.56)  83 70 1.82 (1.22 - 2.73)     AA 8 12 1.09 (0.43 - 2.77)  12 11 1.86 (0.79 - 4.38)  16 7 3.40 (1.35 - 8.57)   0.11 (0.45)                                       191       Exposure: 0%   Exposure: 0.1 - 2.9%   Exposure: 3.0 - 87.9%   Interaction Gene SNP Geno Case Control OR 95% CI   Case Control OR 95% CI   Case Control OR 95% CI   ptrend (pFDR) Estradiol metabolism                  ESR1 rs2813543 GG 111 187 1.00 ----------  144 138 1.64 (1.17 - 2.30)  132 128 1.58 (1.11 - 2.24)     AG 71 122 0.99 (0.68 - 1.45)  63 84 1.23 (0.82 - 1.86)  88 81 1.70 (1.15 - 2.52)     AA 7 18 0.67 (0.27 - 1.65)  3 7 0.76 (0.19 - 3.06)  8 20 0.65 (0.27 - 1.54)   0.71 (0.86) ‡ Adjusted for age, centre, education, and smoking (pack-years) ◊ Sample size insufficient to create stable estimates   192 Supplementary Table A21: Breast cancer odds ratios by genotype-exposure (weighted duration of PAH exposure) stratum based on co-dominant model for gene-representative SNPs among European women‡       Exposure: 0 Years   Exposure: 0.1 - 0.8 Years   Exposure: 0.9 - 55.1 Years   Interaction Gene SNP Geno Case Control OR 95% CI   Case Control OR 95% CI   Case Control OR 95% CI   ptrend (pFDR) Production of carcinogenic intermediates                AKR1C3 rs12387 AA 121 230 1.00 ----------  138 160 1.57 (1.13 - 2.16)  150 151 1.73 (1.25 - 2.41)       GA 62 93 1.29 (0.87 - 1.91)  67 66 1.80 (1.19 - 2.72)  60 71 1.42 (0.94 - 2.16)     GG 6 6 2.02 (0.63 - 6.45)  10 2 9.74 (2.07 - 45.8)  13 6 4.22 (1.55 - 11.5)   0.37 (0.65) AKR1C4 rs3812617 GG 121 242 1.00 ----------  147 162 1.73 (1.26 - 2.38)  154 158 1.79 (1.29 - 2.47)       AG 64 83 1.55 (1.04 - 2.31)  60 66 1.68 (1.10 - 2.55)  58 64 1.59 (1.03 - 2.44)     AA 4 4 1.98 (0.48 - 8.15)  8 1 16.7 (2.03 - 137.1)  11 7 3.25 (1.22 - 8.69)   0.15 (0.46) EPHX1 rs2854461 CC 80 142 1.00 ----------  89 113 1.32 (0.88 - 1.96)  89 114 1.28 (0.86 - 1.91)       AC 89 141 1.18 (0.80 - 1.74)  99 90 1.92 (1.28 - 2.87)  103 92 1.87 (1.24 - 2.80)     
AA 20 46 0.77 (0.42 - 1.41)  27 26 1.89 (1.02 - 3.48)  31 23 2.26 (1.22 - 4.18)   0.06 (0.39) NAT1 rs7845127 GG 81 168 1.00 ----------  85 104 1.58 (1.06 - 2.36)  104 117 1.71 (1.17 - 2.52)       AG 89 136 1.34 (0.91 - 1.96)  107 91 2.30 (1.56 - 3.40)  97 91 1.97 (1.32 - 2.95)     AA 19 25 1.61 (0.83 - 3.11)  23 34 1.36 (0.75 - 2.47)  22 21 1.92 (0.97 - 3.79)   0.42 (0.65) NAT2 rs4646243 AA 132 239 1.00 ----------  155 172 1.57 (1.15 - 2.14)  168 169 1.66 (1.22 - 2.27)       GA 52 88 1.09 (0.72 - 1.64)  55 54 1.73 (1.12 - 2.68)  50 59 1.42 (0.91 - 2.23)     GG 5 2 4.57 (0.87 - 24.1)  5 3 2.44 (0.56 - 10.6)  5 1 6.37 (0.70 - 58.1)   0.41 (0.65) PTGS2 rs5275 AA 75 150 1.00 ----------  87 76 2.25 (1.48 - 3.43)  107 86 2.37 (1.58 - 3.56)       GA 93 141 1.40 (0.95 - 2.06)  102 122 1.63 (1.10 - 2.40)  99 108 1.72 (1.15 - 2.56)       GG 21 38 1.16 (0.63 - 2.12)   26 31 1.63 (0.90 - 2.97)   17 35 0.88 (0.46 - 1.70)   <0.01 (0.04) CYP450 superfamily                  CYP19A1 rs10046 AA 48 100 1.00 ----------  59 73 1.58 (0.96 - 2.58)  53 66 1.53 (0.92 - 2.54)       GA 94 159 1.27 (0.82 - 1.95)  110 114 1.91 (1.23 - 2.96)  114 115 1.88 (1.21 - 2.93)     GG 47 69 1.36 (0.81 - 2.27)  46 42 2.20 (1.27 - 3.82)  56 48 2.25 (1.32 - 3.81)   0.85 (0.90) CYP1A1 rs2470893 GG 96 171 1.00 ----------  109 110 1.66 (1.15 - 2.40)  94 104 1.47 (1.00 - 2.16)       AG 69 131 0.91 (0.61 - 1.34)  80 92 1.46 (0.98 - 2.17)  101 105 1.52 (1.04 - 2.23)     AA 23 27 1.61 (0.87 - 3.01)  25 26 1.57 (0.85 - 2.90)  28 20 2.34 (1.23 - 4.45)   0.90 (0.90) CYP2C19 rs12248560 GG 109 211 1.00 ----------  114 128 1.63 (1.15 - 2.31)  125 156 1.41 (1.00 - 1.98)       AG 69 100 1.31 (0.89 - 1.93)  87 88 1.81 (1.23 - 2.65)  85 61 2.46 (1.62 - 3.72)     AA 11 18 1.16 (0.52 - 2.58)  14 13 1.83 (0.82 - 4.07)  13 12 1.82 (0.79 - 4.19)   0.43 (0.65) Detoxification of reactive intermediates during xenobiotic metabolism             COMT rs5993882 AA 104 204 1.00 ----------  126 143 1.61 (1.14 - 2.27)  119 131 1.60 (1.12 
- 2.29)
  CA 69 109 1.24 (0.84 - 1.82)  77 69 2.07 (1.38 - 3.11)  92 86 1.89 (1.29 - 2.79)
  CC 16 16 1.76 (0.84 - 3.71)  12 17 1.32 (0.60 - 2.90)  12 12 1.82 (0.78 - 4.23)  0.52 (0.70)
PON1 rs854551 GG 122 208 1.00 ----------  128 147 1.39 (1.00 - 1.94)  125 153 1.31 (0.94 - 1.84)
  AG 59 109 0.96 (0.65 - 1.42)  74 73 1.69 (1.13 - 2.52)  83 67 1.89 (1.26 - 2.83)
  AA 8 12 1.09 (0.43 - 2.77)  13 9 2.56 (1.06 - 6.20)  15 9 2.42 (1.02 - 5.76)  0.14 (0.46)

Gene SNP Geno  [Exposure: 0 Years] Case Control OR (95% CI)  [Exposure: 0.1 - 0.8 Years] Case Control OR (95% CI)  [Exposure: 0.9 - 55.1 Years] Case Control OR (95% CI)  Interaction ptrend (pFDR)
Estradiol metabolism
ESR1 rs2813543 GG 111 188 1.00 ----------  144 136 1.68 (1.20 - 2.36)  132 131 1.54 (1.09 - 2.19)
  AG 71 122 1.00 (0.68 - 1.46)  67 84 1.31 (0.88 - 1.97)  84 80 1.65 (1.11 - 2.46)
  AA 7 18 0.67 (0.27 - 1.66)  4 9 0.77 (0.23 - 2.59)  7 18 0.63 (0.25 - 1.59)  0.74 (0.89)
‡ Adjusted for age, centre, education, and smoking (pack-years)

Supplementary Table A22: Breast cancer odds ratios by genotype-exposure (ever-never for duration at high PAH exposure) stratum based on co-dominant model for remaining SNPs with potential modifying effects among European women‡
Gene SNP Genotype  [Exposure: Never] Case Control OR (95% CI)  [Exposure: Ever] Case Control OR (95% CI)  Interaction ptrend (p-FDR)
Production of carcinogenic intermediates
EPHX1 rs2854461 CC 149 246 1.00 ----------  109 123 1.64 (1.14 - 2.36)
  AC 166 216 1.35 (1.01 - 1.81)  125 108 2.21 (1.54 - 3.18)
  AA 41 65 1.09 (0.69 - 1.70)  37 30 2.42 (1.40 - 4.18)  0.52 (0.78)
NAT1 rs7845127 GG 136 258 1.00 ----------  134 132 2.09 (1.48 - 2.96)
  AG 180 218 1.52 (1.14 - 2.03)  113 100 2.28 (1.58 - 3.28)
  AA 40 51 1.42 (0.89 - 2.28)  24 29 1.70 (0.93 - 3.13)  0.09 (0.27)
NAT2 rs4646243 AA 258 387 1.00 ----------  197 194 1.71 (1.28 - 2.29)
  GA 88 136 1.00 (0.73 - 1.37)  69 65 1.76 (1.17 - 2.64)
  GG 10 4 3.52 (1.07 - 11.6)  5 2 3.04 (0.54 - 17.0)  0.77 (0.84)
CYP450 superfamily
CYP19A1 rs10046 AA 94 165 1.00 ----------  66 74 1.72 (1.10 - 2.69)
  GA 180 253 1.27 (0.92 - 1.76)  138 136 1.99 (1.36 - 2.91)
  GG 82 108 1.33 (0.90 - 1.96)  67 51 2.60 (1.62 - 4.16)  0.74 (0.84)
CYP1A1 rs2470893 GG 181 266 1.00 ----------  118 119 1.62 (1.15 - 2.29)
  AG 135 210 0.92 (0.69 - 1.23)  115 119 1.54 (1.08 - 2.18)
  AA 39 51 1.14 (0.71 - 1.81)  37 22 2.78 (1.54 - 5.01)  0.39 (0.67)
CYP2C19 rs12248560 GG 192 329 1.00 ----------  156 166 1.73 (1.27 - 2.37)
  AG 140 167 1.39 (1.04 - 1.86)  101 83 2.26 (1.57 - 3.27)
  AA 24 31 1.21 (0.68 - 2.15)  14 12 2.15 (0.95 - 4.85)  0.95 (0.95)
Detoxification of reactive intermediates during xenobiotic metabolism
COMT rs5993882 AA 204 333 1.00 ----------  145 145 1.81 (1.31 - 2.51)
  CA 126 165 1.26 (0.94 - 1.70)  112 100 2.04 (1.44 - 2.90)
  CC 26 29 1.45 (0.82 - 2.56)  14 16 1.53 (0.72 - 3.27)  0.32 (0.64)
PON1 rs854551 GG 217 337 1.00 ----------  158 171 1.63 (1.20 - 2.23)
  AG 121 171 1.14 (0.85 - 1.54)  95 79 2.02 (1.39 - 2.94)
  AA 18 19 1.47 (0.75 - 2.89)  18 11 2.74 (1.25 - 6.01)  0.70 (0.84)
Estradiol metabolism
ESR1 rs2813543 GG 226 298 1.00 ----------  161 157 1.47 (1.07 - 2.02)
  AG 120 201 0.81 (0.60 - 1.08)  102 86 1.76 (1.22 - 2.54)
  AA 10 27 0.52 (0.25 - 1.12)  8 18 0.68 (0.28 - 1.63)  0.31 (0.64)
‡ Adjusted for age, centre, education, and smoking (pack-years)

Supplementary Table A23: Breast cancer odds ratios by genotype-exposure (ever-never for average probability of PAH exposure∆) stratum based on co-dominant model for remaining SNPs with potential modifying effects among European women‡
Gene SNP Genotype  [Exposure: Never] Case Control OR (95% CI)  [Exposure: Ever] Case Control OR (95% CI)  Interaction ptrend (p-FDR)
Production of carcinogenic intermediates
AKR1C3 rs12387 AA 121 230 1.00 ----------  288 312 1.64 (1.24 - 2.18)
  GA 62 93 1.29 (0.87 - 1.91)  127 137 1.60 (1.15 - 2.24)
  GG 6 6 2.02 (0.63 - 6.46)  23 8 5.59 (2.41 - 13.0)  0.69 (0.83)
AKR1C4 rs3812617 GG 121 242 1.00 ----------  301 321 1.75 (1.33 - 2.32)
  AG 64 83 1.55 (1.04 - 2.31)  118 130 1.63 (1.16 - 2.29)
  AA 4 4 1.99 (0.48 - 8.17)  19 8 4.91 (2.07 - 11.7)  0.19 (0.58)
EPHX1 rs2854461 CC 80 142 1.00 ----------  178 227 1.30 (0.92 - 1.83)
  AC 89 141 1.18 (0.80 - 1.74)  202 183 1.88 (1.33 - 2.66)
  AA 20 46 0.77 (0.42 - 1.41)  58 49 2.06 (1.28 - 3.32)  0.06 (0.37)
NAT1 rs7845127 GG 81 168 1.00 ----------  189 222 1.64 (1.17 - 2.30)
  AG 89 136 1.34 (0.91 - 1.96)  204 182 2.13 (1.52 - 3.00)
  AA 19 25 1.61 (0.83 - 3.11)  45 55 1.57 (0.96 - 2.55)  0.29 (0.71)
NAT2 rs4646243 AA 132 239 1.00 ----------  323 342 1.61 (1.23 - 2.11)
  GA 52 88 1.09 (0.72 - 1.64)  105 113 1.57 (1.10 - 2.23)
  GG 5 2 4.58 (0.87 - 24.1)  10 4 3.41 (1.01 - 11.5)  0.49 (0.76)
PTGS2 rs5275 AA 75 150 1.00 ----------  194 162 2.31 (1.62 - 3.29)
  GA 93 141 1.40 (0.95 - 2.06)  201 231 1.66 (1.18 - 2.34)
  GG 21 38 1.16 (0.63 - 2.13)  43 66 1.23 (0.76 - 1.98)  <0.01 (0.07)
CYP450 superfamily
CYP19A1 rs10046 AA 48 100 1.00 ----------  112 139 1.55 (1.01 - 2.40)
  GA 94 159 1.27 (0.82 - 1.95)  224 230 1.89 (1.27 - 2.81)
  GG 47 69 1.36 (0.81 - 2.27)  102 90 2.23 (1.41 - 3.51)  0.88 (0.88)
CYP1A1 rs2470893 GG 96 171 1.00 ----------  203 214 1.57 (1.13 - 2.16)
  AG 69 131 0.91 (0.61 - 1.34)  181 198 1.49 (1.07 - 2.07)
  AA 23 27 1.61 (0.87 - 3.01)  53 46 1.90 (1.18 - 3.06)  0.68 (0.83)
CYP2C19 rs12248560 GG 109 211 1.00 ----------  239 284 1.50 (1.12 - 2.03)
  AG 69 100 1.31 (0.89 - 1.93)  172 150 2.06 (1.48 - 2.85)
  AA 11 18 1.16 (0.52 - 2.58)  27 25 1.82 (1.00 - 3.33)  0.86 (0.88)
Detoxification of reactive intermediates during xenobiotic metabolism
COMT rs5993882 AA 104 204 1.00 ----------  245 274 1.61 (1.19 - 2.17)
  CA 69 109 1.24 (0.84 - 1.82)  169 156 1.96 (1.41 - 2.72)
  CC 16 16 1.76 (0.84 - 3.71)  24 29 1.53 (0.84 - 2.78)  0.39 (0.76)
PON1 rs854551 GG 122 208 1.00 ----------  253 300 1.35 (1.01 - 1.80)
  AG 59 109 0.96 (0.65 - 1.42)  157 141 1.77 (1.28 - 2.46)
  AA 8 12 1.09 (0.43 - 2.77)  28 18 2.49 (1.31 - 4.71)  0.14 (0.56)
Estradiol metabolism
ESR1 rs2813543 GG 111 188 1.00 ----------  276 267 1.61 (1.19 - 2.17)
  AG 71 122 1.00 (0.68 - 1.46)  151 165 1.47 (1.06 - 2.05)
  AA 7 18 0.67 (0.27 - 1.66)  11 27 0.68 (0.32 - 1.43)  0.50 (0.76)
‡ Adjusted for age, centre, education, and smoking (pack-years)
∆ The results for average probability will produce the same results as weighted duration because average probability of exposure is based on the quotient between the summation of the product of probability and duration (for each job) and the total duration while weighted duration is the numerator involved in the calculation of the quotient.
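The equivalence stated in footnote ∆ can be written out as a short derivation. The symbols are introduced here for illustration only and are not the thesis's notation: p_j is the probability of exposure for job j, d_j is the duration of job j, \bar{p} is the average probability of exposure, and W is the weighted duration. The sketch assumes that the total duration is positive and that "ever" exposed means a value above zero.

```latex
\bar{p} \;=\; \frac{\sum_j p_j d_j}{\sum_j d_j},
\qquad
W \;=\; \sum_j p_j d_j
\quad\Longrightarrow\quad
\bar{p} \;=\; \frac{W}{\sum_j d_j}.

\text{Since } \sum_j d_j > 0:\qquad \bar{p} > 0 \iff W > 0 .
```

Under these assumptions, a participant is classified as ever exposed on one metric exactly when she is on the other, so the ever-never strata coincide.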
Supplementary Table A24: Breast cancer odds ratios by genotype-exposure (ever-never for smoking) stratum based on co-dominant model for remaining SNPs with potential modifying effects among European women‡
Gene SNP Genotype  [Smoking: Never] Case Control OR (95% CI)  [Smoking: Ever] Case Control OR (95% CI)  Interaction ptrend (p-FDR)
Production of carcinogenic intermediates
AKR1C3 rs12387 AA 174 269 1.00 ----------  232 270 1.30 (1.00 - 1.69)
  GA 91 114 1.22 (0.87 - 1.72)  98 114 1.25 (0.89 - 1.74)
  GG 15 9 2.73 (1.15 - 6.45)  14 5 4.45 (1.56 - 12.7)  0.41 (0.58)
AKR1C4 rs3812617 GG 182 279 1.00 ----------  238 280 1.27 (0.98 - 1.65)
  AG 87 104 1.26 (0.89 - 1.78)  94 108 1.26 (0.90 - 1.76)
  AA 11 9 2.01 (0.81 - 5.01)  12 3 6.03 (1.67 - 21.8)  0.65 (0.73)
EPHX1 rs2854461 CC 110 184 1.00 ----------  147 185 1.35 (0.98 - 1.87)
  AC 134 160 1.50 (1.07 - 2.09)  156 159 1.64 (1.18 - 2.28)
  AA 36 48 1.36 (0.83 - 2.24)  41 47 1.48 (0.91 - 2.40)  0.40 (0.58)
NAT1 rs7845127 GG 121 201 1.00 ----------  147 186 1.25 (0.91 - 1.72)
  AG 127 153 1.33 (0.96 - 1.85)  166 164 1.62 (1.18 - 2.22)
  AA 32 38 1.39 (0.82 - 2.36)  31 41 1.18 (0.70 - 2.00)  0.43 (0.58)
NAT2 rs4646243 AA 217 286 1.00 ----------  236 293 1.04 (0.81 - 1.33)
  GA 61 102 0.79 (0.55 - 1.14)  96 96 1.26 (0.90 - 1.77)
  GG 2 4 0.58 (0.10 - 3.23)  12 2 7.77 (1.71 - 35.3)  <0.01 (0.11)
PTGS2 rs5275 AA 117 148 1.00 ----------  151 163 1.15 (0.82 - 1.60)
  GA 137 182 0.96 (0.69 - 1.34)  155 186 1.03 (0.74 - 1.43)
  GG 26 62 0.54 (0.32 - 0.90)  38 42 1.08 (0.65 - 1.80)  0.30 (0.58)
CYP450 superfamily
CYP19A1 rs10046 AA 67 133 1.00 ----------  93 105 1.67 (1.11 - 2.52)
  GA 146 183 1.59 (1.10 - 2.29)  170 203 1.62 (1.13 - 2.33)
  GG 67 76 1.73 (1.11 - 2.71)  81 82 1.90 (1.24 - 2.92)  0.14 (0.57)
CYP1A1 rs2470893 GG 141 194 1.00 ----------  156 189 1.10 (0.81 - 1.50)
  AG 108 153 0.94 (0.68 - 1.31)  141 175 1.06 (0.77 - 1.45)
  AA 29 45 0.89 (0.53 - 1.49)  47 26 2.33 (1.37 - 3.95)  0.09 (0.54)
CYP2C19 rs12248560 GG 167 259 1.00 ----------  179 232 1.15 (0.87 - 1.52)
  AG 100 113 1.35 (0.97 - 1.89)  140 137 1.56 (1.15 - 2.13)
  AA 13 20 0.98 (0.47 - 2.03)  25 22 1.61 (0.87 - 2.97)  0.67 (0.73)
Detoxification of reactive intermediates during xenobiotic metabolism
COMT rs5993882 AA 147 242 1.00 ----------  200 232 1.37 (1.03 - 1.81)
  CA 113 127 1.46 (1.05 - 2.04)  124 137 1.46 (1.06 - 2.01)
  CC 20 23 1.38 (0.73 - 2.62)  20 22 1.43 (0.75 - 2.74)  0.21 (0.58)
PON1 rs854551 GG 167 250 1.00 ----------  207 254 1.19 (0.91 - 1.56)
  AG 98 127 1.18 (0.85 - 1.65)  116 122 1.39 (1.01 - 1.93)
  AA 15 15 1.46 (0.69 - 3.09)  21 15 2.08 (1.03 - 4.17)  0.85 (0.85)
Estradiol metabolism
ESR1 rs2813543 GG 161 221 1.00 ----------  223 231 1.27 (0.96 - 1.68)
  AG 111 152 1.00 (0.72 - 1.38)  111 134 1.12 (0.81 - 1.56)
  AA 8 18 0.62 (0.26 - 1.48)  10 26 0.52 (0.24 - 1.11)  0.35 (0.58)
‡ Adjusted for age, centre, and education

Supplementary Table A25: Breast cancer odds ratios by genotype-exposure (ever-never for duration at high PAH exposure) stratum based on co-dominant model for remaining SNPs with potential modifying effects among European women stratified by smoking status‡
Smoking: Never
Gene SNP Geno  [Exposure: Never] Case Control OR (95% CI)  [Exposure: Ever] Case Control OR (95% CI)  Interaction ptrend (pFDR)  GxExS ptrend (p-FDR)  ExS ptrend (p-FDR)
Production of carcinogenic intermediates
AKR1C3 rs12387 AA 102 195 1.00 ----------  72 74 2.25 (1.44 - 3.53)
  GA 59 77 1.44 (0.94 - 2.20)  32 37 1.89 (1.07 - 3.33)
  GG 7 5 2.78 (0.84 - 9.19)  8 4 4.51 (1.28 - 15.9)  0.25 (0.62)  0.48 (0.72)  0.66 (0.82)
AKR1C4 rs3812617 GG 106 203 1.00 ----------  76 76 2.29 (1.48 - 3.56)
  AG 58 69 1.53 (0.99 - 2.37)  29 35 1.83 (1.02 - 3.28)
  AA 4 5 1.70 (0.44 - 6.57)  7 4 3.98 (1.10 - 14.5)  0.22 (0.62)  0.31 (0.52)  0.48 (0.82)
EPHX1 rs2854461 CC 66 126 1.00 ----------  44 58 1.79 (1.04 - 3.08)
  AC 81 119 1.46 (0.96 - 2.23)  53 41 3.27 (1.89 - 5.66)
  AA 21 32 1.39 (0.73 - 2.65)  15 16 2.43 (1.09 - 5.43)  0.86 (0.94)  0.84 (0.89)  0.69 (0.82)
NAT1 rs7845127 GG 70 145 1.00 ----------  51 56 2.26 (1.36 - 3.77)
  AG 79 109 1.46 (0.96 - 2.21)  48 44 2.53 (1.48 - 4.32)
  AA 19 23 1.61 (0.81 - 3.22)  13 15 2.05 (0.90 - 4.66)  0.24 (0.62)  0.89 (0.89)  0.95 (0.98)
NAT2 rs4646243 AA 132 199 1.00 ----------  85 87 1.77 (1.16 - 2.70)
  GA 34 74 0.72 (0.45 - 1.16)  27 28 1.76 (0.96 - 3.23)
  GG 2 4 0.65 (0.11 - 3.84)  0 0 1.44 (0.96 - 2.16)  0.44 (0.66)  0.14 (0.41)  0.22 (0.53)
PTGS2 rs5275 AA 70 106 1.00 ----------  47 42 2.12 (1.22 - 3.68)
  GA 79 128 0.94 (0.62 - 1.44)  58 54 1.90 (1.14 - 3.17)
  GG 19 43 0.68 (0.36 - 1.27)  7 19 0.70 (0.27 - 1.81)  0.36 (0.62)  0.24 (0.49)  0.39 (0.78)
CYP450 superfamily
CYP19A1 rs10046 AA 41 99 1.00 ----------  26 34 2.34 (1.20 - 4.59)
  GA 88 129 1.70 (1.07 - 2.70)  58 54 3.32 (1.88 - 5.86)
  GG 39 49 2.08 (1.17 - 3.68)  28 27 3.09 (1.55 - 6.15)  0.32 (0.62)  0.08 (0.33)  0.07 (0.29)
CYP1A1 rs2470893 GG 86 142 1.00 ----------  55 52 2.04 (1.24 - 3.37)
  AG 64 105 0.91 (0.59 - 1.38)  44 48 1.76 (1.04 - 2.97)
  AA 17 30 0.92 (0.47 - 1.79)  12 15 1.51 (0.65 - 3.48)  0.72 (0.86)  0.18 (0.43)  0.14 (0.43)
CYP2C19 rs12248560 GG 97 181 1.00 ----------  70 78 2.01 (1.28 - 3.15)
  AG 63 80 1.43 (0.94 - 2.18)  37 33 2.28 (1.30 - 4.02)
  AA 8 16 0.82 (0.33 - 2.03)  5 4 3.07 (0.78 - 12.0)  0.94 (0.94)  0.81 (0.89)  0.98 (0.98)
Detoxification of reactive intermediates during xenobiotic metabolism
COMT rs5993882 AA 93 169 1.00 ----------  54 73 1.56 (0.97 - 2.50)
  CA 63 92 1.17 (0.77 - 1.77)  50 35 3.08 (1.81 - 5.25)
  CC 12 16 1.36 (0.61 - 3.04)  8 7 2.27 (0.77 - 6.65)  0.32 (0.62)  0.02 (0.24)  0.06 (0.29)
PON1 rs854551 GG 96 179 1.00 ----------  71 71 2.28 (1.45 - 3.61)
  AG 63 89 1.41 (0.93 - 2.13)  35 38 2.08 (1.19 - 3.65)
  AA 9 9 1.66 (0.62 - 4.41)  6 6 2.12 (0.65 - 6.95)  0.20 (0.62)  0.04 (0.24)  0.03 (0.29)
Estradiol metabolism
ESR1 rs2813543 GG 101 155 1.00 ----------  60 66 1.74 (1.08 - 2.81)
  AG 63 110 0.92 (0.61 - 1.39)  48 42 2.12 (1.25 - 3.59)
  AA 4 11 0.64 (0.19 - 2.09)  4 7 0.94 (0.26 - 3.42)  0.70 (0.86)  0.80 (0.89)  0.66 (0.82)
‡ Adjusted for age, centre, and education

Supplementary Table A25 (Continued)
Smoking: Ever
Gene SNP Geno  [Exposure: Never] Case Control OR (95% CI)  [Exposure: Ever] Case Control OR (95% CI)  Interaction ptrend (pFDR)  GxExS ptrend (p-FDR)  ExS ptrend (p-FDR)
Production of carcinogenic intermediates
AKR1C3 rs12387 AA 125 183 1.00 ----------  107 87 2.00 (1.31 - 3.05)
  GA 56 61 1.35 (0.87 - 2.08)  42 53 1.20 (0.71 - 2.00)
  GG 6 2 4.91 (0.96 - 25.2)  8 3 4.44 (1.13 - 17.5)  0.04 (0.12)  0.48 (0.72)  0.66 (0.82)
AKR1C4 rs3812617 GG 126 189 1.00 ----------  112 91 2.04 (1.34 - 3.09)
  AG 56 57 1.50 (0.97 - 2.33)  38 51 1.16 (0.68 - 1.96)
  AA 5 0 NA (NA - NA)  7 3 3.98 (0.99 - 16.0)  <0.01 (0.05)  0.31 (0.52)  0.48 (0.82)
EPHX1 rs2854461 CC 82 120 1.00 ----------  65 65 1.55 (0.94 - 2.54)
  AC 85 93 1.33 (0.88 - 2.00)  71 66 1.68 (1.03 - 2.74)
  AA 20 33 0.90 (0.48 - 1.70)  21 14 2.40 (1.12 - 5.15)  0.59 (0.65)  0.84 (0.89)  0.69 (0.82)
NAT1 rs7845127 GG 65 110 1.00 ----------  82 76 1.95 (1.20 - 3.16)
  AG 101 108 1.60 (1.06 - 2.42)  65 56 2.09 (1.26 - 3.48)
  AA 21 28 1.24 (0.65 - 2.37)  10 13 1.31 (0.53 - 3.26)  0.21 (0.27)  0.89 (0.89)  0.95 (0.98)
NAT2 rs4646243 AA 125 186 1.00 ----------  111 107 1.69 (1.13 - 2.54)
  GA 54 60 1.38 (0.89 - 2.14)  42 36 1.81 (1.05 - 3.13)
  GG 8 0 NA (NA - NA)  4 2 3.03 (0.54 - 17.1)  0.16 (0.24)  0.14 (0.41)  0.22 (0.53)
PTGS2 rs5275 AA 71 108 1.00 ----------  80 55 2.29 (1.40 - 3.75)
  GA 93 117 1.17 (0.78 - 1.77)  62 69 1.43 (0.88 - 2.34)
  GG 23 21 1.65 (0.84 - 3.23)  15 21 1.10 (0.52 - 2.33)  <0.01 (0.05)  0.24 (0.49)  0.39 (0.78)
CYP450 superfamily
CYP19A1 rs10046 AA 53 65 1.00 ----------  40 40 1.29 (0.70 - 2.35)
  GA 92 121 0.94 (0.60 - 1.49)  78 82 1.24 (0.74 - 2.08)
  GG 42 59 0.86 (0.50 - 1.49)  39 23 2.25 (1.16 - 4.36)  0.14 (0.23)  0.08 (0.33)  0.07 (0.29)
CYP1A1 rs2470893 GG 94 122 1.00 ----------  62 67 1.29 (0.79 - 2.10)
  AG 71 104 0.90 (0.60 - 1.35)  70 71 1.37 (0.85 - 2.21)
  AA 22 20 1.37 (0.70 - 2.69)  25 6 5.49 (2.10 - 14.3)  0.11 (0.22)  0.18 (0.43)  0.14 (0.43)
CYP2C19 rs12248560 GG 94 145 1.00 ----------  85 87 1.55 (0.99 - 2.42)
  AG 77 87 1.36 (0.90 - 2.04)  63 50 2.16 (1.31 - 3.56)
  AA 16 14 1.64 (0.75 - 3.56)  9 8 1.75 (0.64 - 4.82)  0.79 (0.79)  0.81 (0.89)  0.98 (0.98)
Detoxification of reactive intermediates during xenobiotic metabolism
COMT rs5993882 AA 110 161 1.00 ----------  90 71 2.04 (1.29 - 3.23)
  CA 63 72 1.35 (0.88 - 2.06)  61 65 1.56 (0.97 - 2.51)
  CC 14 13 1.70 (0.76 - 3.80)  6 9 1.03 (0.35 - 3.07)  0.02 (0.09)  0.02 (0.24)  0.06 (0.29)
PON1 rs854551 GG 120 155 1.00 ----------  87 99 1.24 (0.81 - 1.90)
  AG 58 81 0.92 (0.61 - 1.40)  58 41 1.91 (1.15 - 3.18)
  AA 9 10 1.24 (0.48 - 3.19)  12 5 3.17 (1.06 - 9.47)  0.09 (0.22)  0.04 (0.24)  0.03 (0.29)
Estradiol metabolism
ESR1 rs2813543 GG 124 140 1.00 ----------  99 91 1.29 (0.84 - 1.98)
  AG 57 90 0.73 (0.48 - 1.11)  54 44 1.51 (0.91 - 2.52)
  AA 6 16 0.42 (0.16 - 1.13)  4 10 0.49 (0.15 - 1.62)  0.35 (0.41)  0.80 (0.89)  0.66 (0.82)
‡ Adjusted for age, centre, and education

Supplementary Table A26: Breast cancer odds ratios by genotype-exposure (ever-never for average probability of PAH exposure∆) stratum based on co-dominant model for remaining SNPs with potential modifying effects among European women stratified by smoking status‡
Smoking: Never
Gene SNP Geno  [Exposure: Never] Case Control OR (95% CI)  [Exposure: Ever] Case Control OR (95% CI)  Interaction ptrend (pFDR)  GxExS ptrend (p-FDR)  ExS ptrend (p-FDR)
Production of carcinogenic intermediates
AKR1C3 rs12387 AA 55 121 1.00 ----------  119 148 1.95 (1.25 - 3.05)
  GA 33 52 1.35 (0.78 - 2.34)  58 62 2.14 (1.28 - 3.59)
  GG 2 4 1.18 (0.21 - 6.75)  13 5 6.25 (2.06 - 18.9)  0.86 (0.89)  0.45 (0.70)  0.65 (0.78)
AKR1C4 rs3812617 GG 55 128 1.00 ----------  127 151 2.13 (1.37 - 3.32)
  AG 34 45 1.66 (0.95 - 2.89)  53 59 2.15 (1.27 - 3.63)
  AA 1 4 0.60 (0.06 - 5.62)  10 5 5.19 (1.64 - 16.4)  0.62 (0.89)  0.50 (0.70)  0.71 (0.78)
EPHX1 rs2854461 CC 34 78 1.00 ----------  76 106 1.84 (1.07 - 3.16)
  AC 47 76 1.56 (0.89 - 2.71)  87 84 2.91 (1.68 - 5.04)
  AA 9 23 1.01 (0.42 - 2.43)  27 25 2.99 (1.47 - 6.11)  0.50 (0.89)  0.45 (0.70)  0.34 (0.78)
NAT1 rs7845127 GG 43 98 1.00 ----------  78 103 2.03 (1.23 - 3.35)
  AG 38 70 1.27 (0.74 - 2.18)  89 83 2.75 (1.64 - 4.61)
  AA 9 9 2.65 (0.97 - 7.22)  23 29 2.00 (1.01 - 4.00)  0.26 (0.89)  0.62 (0.70)  0.52 (0.78)
NAT2 rs4646243 AA 68 127 1.00 ----------  149 159 1.94 (1.27 - 2.95)
  GA 21 48 0.83 (0.46 - 1.51)  40 54 1.55 (0.91 - 2.67)
  GG 1 2 0.93 (0.08 - 10.7)  1 2 0.95 (0.08 - 11.8)  0.83 (0.89)  0.55 (0.70)  0.72 (0.78)
PTGS2 rs5275 AA 38 73 1.00 ----------  79 75 2.28 (1.33 - 3.92)
  GA 42 77 1.09 (0.63 - 1.88)  95 105 1.91 (1.14 - 3.23)
  GG 10 27 0.72 (0.31 - 1.66)  16 35 1.03 (0.49 - 2.18)  0.34 (0.89)  0.12 (0.50)  0.25 (0.78)
CYP450 superfamily
CYP19A1 rs10046 AA 17 61 1.00 ----------  50 72 2.76 (1.38 - 5.52)
  GA 46 83 2.01 (1.04 - 3.87)  100 100 3.94 (2.08 - 7.48)
  GG 27 33 2.86 (1.35 - 6.06)  40 43 3.61 (1.76 - 7.41)  0.12 (0.89)  0.02 (0.23)  0.01 (0.16)
CYP1A1 rs2470893 GG 50 95 1.00 ----------  91 99 1.91 (1.17 - 3.10)
  AG 29 63 0.81 (0.46 - 1.43)  79 90 1.78 (1.08 - 2.94)
  AA 10 19 1.00 (0.43 - 2.34)  19 26 1.47 (0.72 - 3.01)  0.87 (0.89)  0.85 (0.85)  0.92 (0.92)
CYP2C19 rs12248560 GG 54 121 1.00 ----------  113 138 2.04 (1.31 - 3.19)
  AG 32 46 1.55 (0.88 - 2.72)  68 67 2.47 (1.47 - 4.13)
  AA 4 10 0.79 (0.23 - 2.71)  9 10 2.30 (0.85 - 6.20)  0.89 (0.89)  0.64 (0.70)  0.48 (0.78)
Detoxification of reactive intermediates during xenobiotic metabolism
COMT rs5993882 AA 53 111 1.00 ----------  94 131 1.67 (1.05 - 2.65)
  CA 29 59 1.04 (0.59 - 1.82)  84 68 2.79 (1.69 - 4.59)
  CC 8 7 2.14 (0.72 - 6.36)  12 16 1.87 (0.79 - 4.40)  0.83 (0.89)  0.24 (0.70)  0.40 (0.78)
PON1 rs854551 GG 53 115 1.00 ----------  114 135 2.04 (1.30 - 3.22)
  AG 34 57 1.36 (0.79 - 2.35)  64 70 2.30 (1.37 - 3.85)
  AA 3 5 1.17 (0.26 - 5.24)  12 10 2.80 (1.10 - 7.11)  0.74 (0.89)  0.07 (0.40)  0.05 (0.31)
Estradiol metabolism
ESR1 rs2813543 GG 46 100 1.00 ----------  115 121 2.37 (1.46 - 3.84)
  AG 42 67 1.38 (0.82 - 2.35)  69 85 1.99 (1.19 - 3.33)
  AA 2 9 0.53 (0.11 - 2.57)  6 9 1.44 (0.47 - 4.40)  0.29 (0.89)  0.51 (0.70)  0.39 (0.78)
‡ Adjusted for age, centre, and education
∆ The results for average probability will produce the same results as weighted duration because average probability of exposure is based on the quotient between the summation of the product of probability and duration (for each job) and the total duration while weighted duration is the numerator involved in the calculation of the quotient.

Supplementary Table A26 (Continued)
Smoking: Ever
Gene SNP Geno  [Exposure: Never] Case Control OR (95% CI)  [Exposure: Ever] Case Control OR (95% CI)  Interaction ptrend (pFDR)  GxExS ptrend (p-FDR)  ExS ptrend (p-FDR)
Production of carcinogenic intermediates
AKR1C3 rs12387 AA 65 107 1.00 ----------  167 163 1.67 (1.09 - 2.56)
  GA 29 39 1.17 (0.65 - 2.08)  69 75 1.44 (0.89 - 2.34)
  GG 4 2 3.44 (0.60 - 19.6)  10 3 5.32 (1.40 - 20.3)  0.44 (0.66)  0.45 (0.70)  0.65 (0.78)
AKR1C4 rs3812617 GG 65 111 1.00 ----------  173 169 1.73 (1.14 - 2.64)
  AG 30 37 1.34 (0.75 - 2.39)  64 71 1.48 (0.90 - 2.42)
  AA 3 0 NA (NA - NA)  9 3 5.00 (1.29 - 19.3)  0.19 (0.37)  0.50 (0.70)  0.71 (0.78)
EPHX1 rs2854461 CC 45 64 1.00 ----------  102 121 1.17 (0.70 - 1.93)
  AC 42 61 0.96 (0.55 - 1.66)  114 98 1.58 (0.96 - 2.62)
  AA 11 23 0.65 (0.29 - 1.49)  30 24 1.77 (0.89 - 3.48)  0.08 (0.25)  0.45 (0.70)  0.34 (0.78)
NAT1 rs7845127 GG 37 67 1.00 ----------  110 119 1.67 (1.01 - 2.76)
  AG 51 65 1.45 (0.84 - 2.50)  115 99 2.17 (1.29 - 3.65)
  AA 10 16 1.17 (0.48 - 2.87)  21 25 1.49 (0.70 - 3.13)  0.65 (0.71)  0.62 (0.70)  0.52 (0.78)
NAT2 rs4646243 AA 63 110 1.00 ----------  173 183 1.68 (1.11 - 2.54)
  GA 31 38 1.47 (0.83 - 2.61)  65 58 1.91 (1.15 - 3.17)
  GG 4 0 NA (NA - NA)  8 2 6.70 (1.35 - 33.3)  0.29 (0.50)  0.55 (0.70)  0.72 (0.78)
PTGS2 rs5275 AA 36 76 1.00 ----------  115 87 2.79 (1.68 - 4.65)
  GA 51 61 1.82 (1.05 - 3.15)  104 125 1.75 (1.05 - 2.92)
  GG 11 11 2.40 (0.93 - 6.15)  27 31 1.78 (0.91 - 3.49)  <0.01 (0.02)  0.12 (0.50)  0.25 (0.78)
CYP450 superfamily
CYP19A1 rs10046 AA 31 38 1.00 ----------  62 67 1.15 (0.62 - 2.13)
  GA 48 73 0.83 (0.45 - 1.52)  122 130 1.17 (0.66 - 2.06)
  GG 19 36 0.66 (0.32 - 1.37)  62 46 1.71 (0.90 - 3.26)  0.08 (0.25)  0.02 (0.23)  0.01 (0.16)
CYP1A1 rs2470893 GG 45 74 1.00 ----------  111 115 1.60 (0.98 - 2.62)
  AG 40 67 0.97 (0.56 - 1.67)  101 108 1.52 (0.93 - 2.48)
  AA 13 7 3.06 (1.13 - 8.32)  34 19 2.84 (1.40 - 5.74)  0.57 (0.71)  0.85 (0.85)  0.92 (0.92)
CYP2C19 rs12248560 GG 54 87 1.00 ----------  125 145 1.36 (0.87 - 2.15)
  AG 37 54 1.11 (0.64 - 1.91)  103 83 2.08 (1.28 - 3.38)
  AA 7 7 1.52 (0.50 - 4.61)  18 15 1.91 (0.86 - 4.26)  0.61 (0.71)  0.64 (0.70)  0.48 (0.78)
Detoxification of reactive intermediates during xenobiotic metabolism
COMT rs5993882 AA 50 90 1.00 ----------  150 142 1.90 (1.20 - 3.03)
  CA 40 49 1.45 (0.84 - 2.50)  84 88 1.70 (1.06 - 2.75)
  CC 8 9 1.59 (0.57 - 4.41)  12 13 1.66 (0.69 - 4.01)  0.14 (0.34)  0.24 (0.70)  0.40 (0.78)
PON1 rs854551 GG 68 90 1.00 ----------  139 164 1.14 (0.74 - 1.74)
  AG 25 51 0.66 (0.37 - 1.18)  91 71 1.67 (1.03 - 2.70)
  AA 5 7 0.96 (0.29 - 3.19)  16 8 2.61 (1.04 - 6.57)  0.02 (0.15)  0.07 (0.40)  0.05 (0.31)
Estradiol metabolism
ESR1 rs2813543 GG 64 85 1.00 ----------  159 146 1.41 (0.91 - 2.19)
  AG 29
54 0.72 (0.41 - 1.25)  82 80 1.39 (0.86 - 2.27)
  AA 5 9 0.72 (0.23 - 2.29)  5 17 0.38 (0.13 - 1.11)  0.98 (0.98)  0.51 (0.70)  0.39 (0.78)
‡ Adjusted for age, centre, and education
∆ The results for average probability will produce the same results as weighted duration because average probability of exposure is based on the quotient between the summation of the product of probability and duration (for each job) and the total duration while weighted duration is the numerator involved in the calculation of the quotient.

Supplementary Material B1: Occupational History – Coding and use of crosswalks

All occupational histories were initially classified to the 2010 US Standard Occupational Classification (SOC2010) scheme using a two-step process. First, histories were coded using SOCEye,(228) an automated approach for clustering job descriptions, which assigned each free-text job description its corresponding SOC2010 code. After the initial assignment, all job descriptions were manually reviewed to ensure the accuracy of the code selected.

To classify the occupational histories to coding schemes compatible with the DOM-JEM and NCI-JEM, we used readily available crosswalks to translate each SOC2010 code to its corresponding International Standard Classification of Occupations 1968 (ISCO68) code and 1980 US Bureau of Census Industry and Occupation classification (OCC1980) code, respectively. Because the coding schemes in use differed according to when each JEM was developed, multiple coding schemes were required; the crosswalks used are listed below:

DOM-JEM:
1) SOC2010 to ISCO08
2) ISCO08 to ISCO88
3) ISCO88 to ISCO68

NCI-JEM:
1) SOC2010 to SOC2000
2) SOC2000 to OCC2000
3) OCC2000 to OCC1990
4) OCC1990 to OCC1980

Within a coding scheme, codes remained generally concordant between editions; e.g., the SOC2000 code for Waiters and Waitresses is 35-3031 and the SOC2010 code is also 35-3031.
However, converting between coding schemes can produce situations where a single SOC2010 code corresponds to multiple ISCO08 or OCC1980 codes.

Where a single SOC2010 code mapped to multiple ISCO68 or OCC1980 codes, we performed the following steps. Without loss of generality, assume that the job title of waitress was assigned the SOC2010 code A1. To use the NCI-JEM, we need to determine the code for waitress in the OCC1980 scheme. An automated approach linked the appropriate crosswalks as follows:

SOC2010 to SOC2000: A1 converts to B1
There is a one-to-one conversion between versions of the SOC scheme.

SOC2000 to OCC2000: B1 converts to C1 and C2
There is a one-to-two conversion between the two different schemes, where occupation B1 (SOC2000) corresponds to C1 and C2 (OCC2000).

OCC2000 to OCC1990: C1 converts to D1; C2 converts to D2 and D3
There is a one-to-one conversion between C1 and D1, and a one-to-two conversion for C2 (OCC2000), which corresponds to D2 and D3 (OCC1990).

OCC1990 to OCC1980: D1 converts to E1 and E2; D2 converts to E3; D3 converts to E4
There is a one-to-two conversion between the 1990 and 1980 editions, where D1 (OCC1990) corresponds to E1 and E2 (OCC1980), and one-to-one conversions for D2 and D3, which correspond to E3 and E4, respectively.

Overall: A1 (SOC2010) converts to E1, E2, E3, or E4 (OCC1980)

In this example, our occupation of waitress corresponds to A1 under SOC2010 and to E1, E2, E3, or E4 under OCC1980. To apply the NCI-JEM to the occupational history, the automated program looked up the intensity and probability scores for each of E1 through E4. If the NCI-JEM intensity and probability scores were the same for E1 through E4, the program accepted those values.
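The chained lookup in this example can be sketched in Python. This is only an illustration, not the program used in the study: the dictionaries hold the placeholder codes A1 through E4 from the example, the scores in `example_jem` are made-up values rather than NCI-JEM entries, and the function names `chain_crosswalks` and `concordant_scores` are hypothetical.

```python
# Hypothetical crosswalk tables using the placeholder codes from the example.
# Each dictionary maps one source code to the list of target codes it converts to.
SOC2010_TO_SOC2000 = {"A1": ["B1"]}
SOC2000_TO_OCC2000 = {"B1": ["C1", "C2"]}
OCC2000_TO_OCC1990 = {"C1": ["D1"], "C2": ["D2", "D3"]}
OCC1990_TO_OCC1980 = {"D1": ["E1", "E2"], "D2": ["E3"], "D3": ["E4"]}

NCI_CHAIN = [SOC2010_TO_SOC2000, SOC2000_TO_OCC2000,
             OCC2000_TO_OCC1990, OCC1990_TO_OCC1980]


def chain_crosswalks(code, crosswalks):
    """Follow a code through successive crosswalks, keeping every branch."""
    codes = [code]
    for crosswalk in crosswalks:
        next_codes = []
        for current in codes:
            for target in crosswalk.get(current, []):
                if target not in next_codes:  # keep order, drop duplicates
                    next_codes.append(target)
        codes = next_codes
    return codes


def concordant_scores(codes, jem):
    """Return the shared (intensity, probability) pair, or None if codes disagree."""
    distinct = {jem[c] for c in codes}
    return distinct.pop() if len(distinct) == 1 else None


# Illustrative (intensity, probability) scores only; not values from the NCI-JEM.
example_jem = {"E1": (0, 1), "E2": (0, 1), "E3": (0, 1), "E4": (0, 1)}

occ1980_codes = chain_crosswalks("A1", NCI_CHAIN)
print(occ1980_codes)                                  # ['E1', 'E2', 'E3', 'E4']
print(concordant_scores(occ1980_codes, example_jem))  # (0, 1)
```

Keeping every branch at each step, rather than picking one target code, preserves the full set of candidate OCC1980 codes so that their scores can be compared afterwards.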
If any of the intensity and probability scores were discordant, the original job title and description were manually reviewed against the titles and descriptions of the OCC1980 codes to determine which of E1 through E4 was closest to the listed occupation, and the scores for that code were assigned. For example, if the intensity and probability scores for E1 were 0 and 1, respectively, and the same scores were observed for E2 through E4, then the intensity and probability scores for the listed occupation remained 0 and 1, respectively. However, if the intensity and probability scores were 0 and 1 for E1, E2, and E4, but 1 and 1 for E3, then a manual review was undertaken to compare the occupational history with the titles and descriptions of E1 through E4 and select the appropriate scores.

Supplementary Material B2: Algorithm for updating employee exposure based on materials handled and tasks performed on the job

When a JEM indicates that a particular job carries no exposure or risk of exposure, but this conflicts with the answers to the supplemental questions on materials handled or tasks performed on the job, the assigned score is updated accordingly. The general steps are outlined below:
1. The JEM assigns a score for job A.
2. If the score for job A indicates no exposure or risk of exposure, then check supplemental questions 1 (materials handled) and 2 (tasks performed).
3. If supplemental question 1 or 2 is YES for exposure, then update the score for job A from 0 (None) to the next (lowest) appropriate level based on the JEM.

If the condition in step 3 is met, the score moves from None to the lowest exposure category. If the score for job A is non-zero, it is not updated using the supplemental questions. For the DOM-JEM and NCI-JEM, which use ordinal scores, the algorithm updates the ordinal value to 1 (Low) in step 3.
For the PPM-JEM, which uses a probability score, the algorithm sets the probability to 2.9% in step 3. This value is the upper limit of the “low” exposure group described in Chapter 3, which corresponds to the 25th percentile of non-zero probabilities assigned to all occupations among controls.
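The update rule in Supplementary Material B2 can be sketched as a small Python function. This is a minimal illustration under stated assumptions: the dictionary keys "DOM", "NCI", and "PPM", the function name `update_score`, and the representation of 2.9% as the proportion 0.029 are choices made here, not details given in the thesis.

```python
# Lowest non-zero level for each JEM, as described in the text:
# ordinal 1 (Low) for the DOM-JEM and NCI-JEM, probability 2.9% for the PPM-JEM.
# The keys and the use of a proportion (0.029) are assumptions for this sketch.
LOWEST_EXPOSED_SCORE = {"DOM": 1, "NCI": 1, "PPM": 0.029}


def update_score(jem_name, jem_score, materials_handled, tasks_performed):
    """Apply steps 1-3: raise a zero JEM score if a supplemental answer is YES."""
    if jem_score != 0:
        # Non-zero scores are never changed by the supplemental questions.
        return jem_score
    if materials_handled or tasks_performed:
        # Either supplemental question answered YES: move to the lowest category.
        return LOWEST_EXPOSED_SCORE[jem_name]
    return jem_score


print(update_score("NCI", 0, True, False))   # 1
print(update_score("PPM", 0, False, True))   # 0.029
print(update_score("DOM", 2, True, True))    # 2
print(update_score("PPM", 0, False, False))  # 0
```

The early return for non-zero scores mirrors the statement that the supplemental questions can only raise a score from None; they never lower or otherwise change an existing exposure score.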
