UBC Theses and Dissertations

Variation in ATM and genetic susceptibility to non-Hodgkin lymphoma Sipahimalani, Payal 2006

Variation in ATM and genetic susceptibility to non-Hodgkin lymphoma

by

Payal Sipahimalani
B.Sc. Hons, Memorial University, 2003

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in THE FACULTY OF GRADUATE STUDIES (Medical Genetics)

The University of British Columbia
May 2006
© Payal Sipahimalani, 2006

Abstract

The ataxia telangiectasia mutated (ATM) gene is critical for the detection and repair of double stranded breaks. Mutations in this gene cause the autosomal recessive syndrome ataxia telangiectasia (AT), a feature of which is a high risk of cancer, particularly lymphoma. We have undertaken a population-based case/control study to assess the role of genetic variation in ATM on the risk of non-Hodgkin lymphoma (NHL) in the general population. The term NHL encompasses several subtypes, many of which have in common the occurrence of specific somatic translocations that contribute to lymphomagenesis. We hypothesize that variants that result in slightly decreased function of ATM could reduce DNA double-stranded break repair capacity, contributing to the occurrence of translocations and subsequent lymphomas. The study population consists of 798 NHL cases and 793 controls that are frequency matched by region, age, sex and ethnicity.
Genetic variation in the promoter and all exons of ATM was determined by bi-directional sequencing of the germline (blood) DNA of 86 NHL patients, both T and B-cell. Sequencing revealed 79 variants, 18 of which correspond to amino acid differences. Six of these variants are predicted to be deleterious to protein function; these variants were all rare (0.5-1.1%). Seven of the 86 (8.1%) NHL patients were heterozygous at these loci. Eleven variants were present at a frequency of 5% or greater; these make up 10 haplotypes. Seven tagSNPs were predicted to specify these 10 haplotypes. Linkage disequilibrium across the ATM gene is high but not complete.

Six tagSNPs (1 failed in assay design) and the 6 putatively deleterious variants were genotyped in the entire case/control set. Direct association tests based on the tagSNPs and haplotype-based indirect association tests were performed. The six rare variants were also assessed. The results of the association tests indicate that common variants of ATM do not significantly contribute to the overall risk of NHL in the general population. Our results, however, point to the possibility of a rare variant-rare disease model where some rare, functionally deleterious variants may contribute to an increased risk of development of rare subtypes of the disease.
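The counts quoted in the abstract map onto a few standard quantities. The sketch below is illustrative only and is not the analysis code used in this study: the function names are hypothetical, the confidence interval uses the common Woolf (log-scale) approximation, and the carrier counts passed to the odds ratio function are invented for demonstration. Only the figures 7 of 86 patients, 798 cases and 793 controls come from the text.

```python
import math


def carrier_fraction(carriers, individuals):
    """Fraction of individuals carrying at least one copy of a variant."""
    return carriers / individuals


def allele_frequency(heterozygotes, homozygotes, individuals):
    """Variant allele frequency at a diploid locus (two alleles per person)."""
    return (heterozygotes + 2 * homozygotes) / (2 * individuals)


def odds_ratio_ci(case_car, case_non, ctrl_car, ctrl_non, z=1.96):
    """Odds ratio with an approximate 95% CI (Woolf log-scale method)."""
    odds_ratio = (case_car * ctrl_non) / (case_non * ctrl_car)
    se_log = math.sqrt(1 / case_car + 1 / case_non + 1 / ctrl_car + 1 / ctrl_non)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# From the abstract: 7 of the 86 sequenced patients were heterozygous.
print(f"carrier fraction: {carrier_fraction(7, 86):.1%}")

# One heterozygote among 86 people is 1 allele out of 172 chromosomes.
print(f"allele frequency: {allele_frequency(1, 0, 86):.4f}")

# HYPOTHETICAL carrier counts (20 cases, 10 controls) applied to the
# study totals of 798 cases and 793 controls, purely for illustration.
or_, lo, hi = odds_ratio_ci(20, 798 - 20, 10, 793 - 10)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

With rare variants the carrier counts are small, so the interval is wide; this is one reason the rare variants are assessed separately from the common tagSNPs.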
Contents

Abstract
Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
1 Introduction
1.1 Non-Hodgkin lymphoma
1.2 Double stranded break repair
1.2.1 Non-homologous end joining
1.2.2 Homologous recombination
1.3 ATM
1.3.1 Ataxia telangiectasia
1.3.2 Functions of ATM
1.3.3 Genomic structure of ATM
1.4 Association studies
1.4.1 Linkage disequilibrium and tagSNPs
1.4.2 Strengths and weaknesses of association studies
2 Materials and Methods
2.1 Cases and controls
2.2 DNA extraction
2.3 Variant detection sequencing
2.4 Prediction of haplotypes and choice of tagSNPs
2.5 Genotyping
2.6 Statistical analysis
3 Results and Discussion
3.1 Variant detection sequencing
3.2 Germline variation in ATM and comparison with literature
3.3 tagSNP selection for genotyping
3.4 Genotyping results
3.4.1 Quality control of genotyping data
3.4.2 Linkage disequilibrium
3.5 Association tests with common variants
3.5.1 Overall and subtype analyses
3.5.2 Analysis of different ethnicities
3.5.3 Analysis of haplotypes
3.6 Association study with combined rare variants
3.6.1 Mantle cell lymphoma
3.6.2 Marginal zone lymphoma
4 Conclusions
5 Future Work
Bibliography
Appendix A Primers and probes
A.1 PCR primers
A.2 TaqMan assays
Appendix B Correction for multiple testing
Appendix C Ethics approval

List of Tables

Table 2.1 List of NHL subtypes
Table 2.2 Case and control samples
Table 3.1 Variant detection sequencing set
Table 3.2 Variants identified by sequencing
Table 3.3 Study power
Table 3.4 Haplotypes predicted using common variants
Table 3.5 Putatively deleterious variants
Table 3.6 ORs for common variants
Table 3.7 ORs for common variants for subtypes of NHL
Table 3.8 ORs for common variants in different ethnicities
Table 3.9 ORs for haplotypes
Table 3.10 ORs for 6 rare variants
Table A.1 PCR primers and conditions
Table A.2 TaqMan primers and probes
Table B.1 Correction for multiple testing

List of Figures

Figure 1.1 Double stranded break repair
Figure 1.2 Genomic structure of ATM
Figure 1.3 Linkage disequilibrium
Figure 3.1 Verification of the 3' UTR
Figure 3.2 ATM variant types
Figure 3.3 Visual genotypes
Figure 3.4 Example TaqMan genotyping plot
Figure 3.5 Linkage disequilibrium plots

List of Abbreviations

ACD: acid-citrate-dextrose
AT: ataxia telangiectasia
ATM: ataxia telangiectasia mutated
ATR: ATM and Rad-3-related
BASC: BRCA1-associated genome surveillance complex
BRCA1: breast cancer associated 1
BLM: Bloom syndrome
CI: confidence intervals
DLBCL: diffuse large B cell lymphoma
DNA-PKcs: DNA-dependent protein kinase catalytic subunit
DSB: double stranded breaks
EDTA: ethylene diamine tetra-acetic acid
EM: expectation maximization
FCL: follicular small cell cleaved lymphoma
FDR: false discovery rate
FL: follicular large cell lymphoma
FM: follicular mixed lymphoma
HR: homologous recombination
HWE: Hardy-Weinberg equilibrium
LD: linkage disequilibrium
LPL: lymphoplasmacytic lymphoma
MAF: minor allele frequency
MALT: low grade B cell of mucosa associated lymphoid tissue
MCL: mantle cell lymphoma
MDC1: mediator of DNA damage checkpoint 1
MDOR: minimum detectable odds ratio
MF: mycosis fungoides
MiscBCL: miscellaneous B cell lymphoma
MiscTCL: miscellaneous T cell lymphoma
MZL: marginal zone lymphoma
NHEJ: non-homologous end joining
NHL: non-Hodgkin lymphoma
OR: odds ratio
ORF: open reading frame
PCR: polymerase chain reactions
PIKK: phosphatidylinositol 3-OH kinase-like kinase
PTCL: peripheral T cell lymphoma
SLL: small lymphocytic lymphoma
SNP: single nucleotide polymorphism
tagSNP: haplotype tagging SNP
TE: tris-EDTA
TERT: telomerase reverse transcriptase holoenzyme
UTR: untranslated region
WRN: Werner syndrome gene
53BP1: p53-binding protein

Acknowledgements

I'd like to acknowledge the invaluable contribution of my supervisor, Dr. Angela Brooks-Wilson. Your patience and unbridled enthusiasm were a constant source of inspiration. This study could not have happened without Dr. John Spinelli and Amy MacArthur.
I'd also like to thank Stephen Leach, for teaching me everything that I needed to know in the lab, and Karen Novik for tirelessly answering all my endless questions and for being a friend. I'd like to thank my thesis advisory committee, Dr. Carolyn Brown, Dr. Elizabeth Simpson and Dr. Dixie Mager, for scaring me just enough to keep me on track and everyone in the Cancer Genetics group at the Genome Sciences Centre, for making it such a fun place to be.

On a personal note, I could not have done this without the support of my parents, who find a way to make me feel loved, even from thousands of miles away. And Richard, who gets the lion's share of the credit. Thank you for your patience and encouragement, and for finding ways to make me laugh, when I need it most.

PAYAL SIPAHIMALANI
The University of British Columbia
May 2006

1 Introduction

1.1 Non-Hodgkin lymphoma

Non-Hodgkin lymphoma (NHL) is a solid tumour of lymphoid origin. It is now the fourth most common cause of cancer death in the United States [64] and the seventh most common cause of death worldwide [52]. The incidence and mortality rates of NHL have been increasing over the last three decades, and NHL is currently the fourth most common form of malignancy in Canada [50]. The reasons for this are poorly understood and while changes in lifestyle or environmental factors are likely to affect such change, cases are likely to arise in the most genetically susceptible fraction of our population.
It is therefore increasingly important to determine the genetic and environmental factors that contribute to NHL.

The term non-Hodgkin lymphoma encompasses several subtypes of lymphoproliferative malignant disease. These subtypes have different clinical and histological presentation. NHL can occur during childhood and throughout adulthood but its incidence increases with age.

NHL, like most cancers, is a complex genetic disease. Though NHL families have been described (see [79] for example), most cases are sporadic. Within the NHL families, affected relatives generally have different types of lymphoproliferative malignancy, including Hodgkin and non-Hodgkin lymphomas and lymphoid leukaemias [10, 23, 41, 63]. The occurrence of different types of lymphoproliferative disease within these rare families rather than the same NHL subtype may indicate the existence of common underlying genetic susceptibility factors for lymphoid cancers. Lymphoma families, in addition to being rare, are often small, making them unsuitable for positional cloning studies and to date, no lymphoma genes have been found by mapping in lymphoid cancer families. In contrast, candidate gene based association studies using NHL cases and controls may identify susceptibility factors that are important in the development of NHL. Association studies also have the advantage that they are robust over a range of penetrances and in the presence of genetic heterogeneity, both likely in NHL.
Cytogenetically, several NHL tumours are known to have characteristic chromosomal translocations. One example of this is the t(14;18)(q32;q21) translocation often seen in follicular lymphoma. Similarly, mantle cell lymphomas often harbour a t(11;14)(q13;q32) translocation. Many NHL subtypes are now characterized by the existence of specific translocations [31].

The occurrence of translocations in many different clinical subtypes of NHL seems to point to a common mechanistic theme that may be reflected in common genetic susceptibility factors that act across different subtypes of NHL. We hypothesize that it is not the presence of a specific translocation, but the propensity to undergo translocations that may be shared by different clinical subtypes of lymphoma. This may imply an underlying defect in the cellular mechanisms that protect against the occurrence of translocations. Thus, genes involved in DNA repair or surveillance for damaged DNA may play a role in NHL susceptibility. This could result in different clinical subtypes of NHL arising from the same germline defect that results in different specific translocations and therefore heterogeneous phenotypes. To test this hypothesis, we will combine all clinical subtypes into a single group, to be compared to the controls in a combined genetic analysis in addition to analyzing the subtypes separately when sample numbers permit.
1.2 Double stranded break repair

Sequence changes in somatic DNA are often deleterious and a single alteration may result in a change in the amount of protein produced, or change a single amino acid, which may, in turn, lead to the onset of malignancy. To protect against this, cells have safeguards that minimize the risk of mutation and genomic instability [37].

Sequence changes in the DNA may be caused by a number of different factors, including spontaneous chemical changes, replication errors or damage inflicted by either endogenous or exogenous DNA damaging agents such as radiation [32]. These DNA damaging agents may cause single or double stranded breaks in the DNA. Double stranded breaks (DSBs) are highly cytotoxic and have the potential to cause mutations in DNA as a result of rearrangements or due to introduction of errors during repair. In response to these breaks, the cell either tries to repair the damage or initiates programmed cell death (apoptosis) if the damage is extensive.

DSBs do, however, occur naturally in the cell as part of the normal course of meiosis, to facilitate strand exchange between homologous chromosomes. They are also formed and sealed in lymphocytes during V(D)J recombination to generate mature T-cell receptor and B-cell immunoglobulin genes [62].

DSBs are particularly dangerous if they occur during the replication of the cell.
If broken chromosomes are carried through mitosis, the acentric chromosome fragments will not divide evenly between the daughter cells. To prevent this, the response of the cell to DSBs is characterized by the activation of cell cycle checkpoints, which are regulatory mechanisms that do not allow the initiation of a new phase of the cell cycle before the previous one is completed, or temporarily arrest cell-cycle progression in response to stress. This cell-cycle arrest is usually accompanied by counteractive measures to balance cellular metabolism. Thus, the occurrence of DSBs in the cell triggers a network of signaling pathways in a cascade that is orchestrated primarily by a single critical protein kinase, ATM [62].

When a DSB occurs in a cell, an elaborate process is set in motion that recruits and activates several proteins involved in DSB repair. The MRN (MRE11-RAD50-NBS1) complex plays a central role in sensing DNA DSBs. ATM, as part of the BRCA1-associated genome surveillance complex (BASC), also participates in the detection of DSBs. ATM is thought to phosphorylate the histone H2AX, the phosphorylated form of which is referred to as γH2AX. In addition, other proteins, such as the mediator of DNA damage checkpoint 1 (MDC1), associate with γH2AX, triggering conformational changes in the higher-order chromatin structure, which, together with ATM, causes the p53-binding protein (53BP1) to localize to the DSB.
Other proteins that co-localize to the site of the DSB include BRCA1 and RAD54. The DNA ends are also able to stimulate activation of ATM, which in turn triggers a cascade of signaling pathways that phosphorylate a number of downstream targets. ATM and some of its targets are important not only for sensing the break, but also for cell cycle regulation and in the repair process itself. Some of these proteins, such as the MRN complex, have additional roles whereby they facilitate the ability of ATM to phosphorylate its substrates [36, 74].

There are two major pathways for the repair of DSBs in the cell: non-homologous end joining (NHEJ) and homologous recombination (HR). These processes are summarized in Figure 1.1. The two pathways, however, are intricately linked with enough flexibility to allow for redundancy and backups should one factor or pathway fail.

1.2.1 Non-homologous end joining

Non-homologous end joining (NHEJ) is a pathway for the repair of DSBs that functions at all stages of the cell cycle but is of particular importance in G0/G1. It appears to be a rather imprecise pathway and often allows the loss of nucleotides at the site of the DSB. This is likely due to the processing of the ends of the DNA that cannot be ligated directly.
The process of NHEJ requires four steps [78]:
• Detection of the DSB
• Formation of a molecular bridge to physically hold the two ends together
• End processing to make non-matching or damaged ends compatible
• Ligation

The core NHEJ machinery consists of the Ku70/80 heterodimer, the DNA-dependent protein kinase catalytic subunit (DNA-PKcs), XRCC4 and DNA ligase IV [78]. NHEJ is initiated by the binding of Ku to a double stranded DNA end. Ku is a heterodimer consisting of two subunits, Ku70 and Ku80, which encircle the DNA. The binding of Ku aids in the recruitment and activation of DNA-PKcs.

Figure 1.1: Two pathways of double stranded break repair are illustrated here. The panel on the right depicts non-homologous end joining, while that on the left shows homologous recombination. As seen in this figure, ATM is at the apex of the signaling cascade triggered in response to double stranded breaks in the DNA. This figure has been adapted from Hoeijmakers et al. [32].

It is hypothesized that DNA-PKcs maintains the two DNA ends in close physical proximity until they are rejoined.
It is also thought to recruit the DNA ligase IV/XRCC4 complex, which is the functional enzyme machinery that is responsible for the rejoining or ligation of the two DNA ends. The ligase activity of this complex is greatly enhanced by its interaction with Ku [40, 78].

Until recently, it was thought that ATM was dispensable for NHEJ. This was because cells deficient for ATM repair a majority of their DSBs normally. Recent advances in techniques for the detection of DSB repair have led to the discovery that these cells fail to repair approximately 10% of the breaks induced by X and γ rays. This subset of breaks remains unrepaired even after extended periods of time. It has been suggested that ATM and its substrates (Artemis, H2AX, 53BP1 and the MRN complex) are specifically required for the repair of these breaks [40]. Artemis is a recently discovered protein thought to be one of the enzymes that processes DNA ends prior to ligation and is responsible for resection of single-stranded overhangs via its endonuclease activity [78].

Thus, following exposure to ionizing radiation, 80-90% of the DSBs are rejoined using the NHEJ core components: Ku, XRCC4 and DNA ligase IV. This end-processing is fast and ATM-independent. Approximately 10% of the DSBs, normally repaired with slow kinetics, require ATM activity (summarized by Lobrich and Jeggo [40]).
1.2.2 Homologous recombination

Homologous recombination (HR) is the exchange of DNA sequence between homologous DNA molecules. HR can repair DSBs by using the undamaged sister chromatid as a template. Therefore, HR operates primarily in late S and G2 because of the availability of sister chromatids [33]. The use of a template means that HR generally results in the accurate repair of the DSB.

The repair of DSBs using HR requires three steps [81]:
• pre-synapsis, which is the preparation of the DNA end
• synapsis, or the formation of a joint molecule between the prepared DNA end and a double stranded homologous template
• post-synapsis, where the DNA strand is repaired and consequently the recombined molecules are separated.

HR is a slow process that requires the RAD52 group of proteins consisting of RAD51, RAD52, RAD54 and the MRN complex consisting of MRE11, RAD50 and NBS1. The initial response to DSBs involves the MRN complex, which has affinity for DNA ends in addition to nuclease activities and the ability to migrate along DNA [75, 81]. This complex is thought to play a part in the unwinding and processing of the broken DNA strand to expose a single-strand overhang onto which a recombinase can be loaded. NBS1, in particular, appears to be important for transmitting signals from the DNA damage sensors to the MRN complex.
The MRN complex also functions as a bridge, keeping the two DNA ends in close proximity to facilitate rejoining. BRCA1 and BRCA2 are also involved at an early point of DSB repair. BRCA1 interacts with RAD51.

RAD51 is a recombinase, which is the key player in HR, and it mediates homology recognition and exchange of DNA strands. Synapsis requires that RAD51 be assembled into a nucleoprotein filament on the invading single-stranded DNA. This involves the cooperation of several recombination mediators including RAD52, RAD54 and the RAD51 paralogs, RAD51B, RAD51C, RAD51D, XRCC2 and XRCC3. The role of these mediators is not completely understood yet. They may, however, contribute to several aspects of synapsis, such as facilitating the location of homologous sequences, and extending heteroduplex DNA in the joint molecule [81].

Other proteins thought to be important to HR are those associated with Bloom Syndrome (BLM) and Werner Syndrome (WRN). BLM is phosphorylated by ATM and forms foci with RAD51 after exposure to ionizing radiation. BLM is also part of the BRCA1-associated genome surveillance complex (BASC), which acts as a sensor of DNA damage (summarized by Valerie and Povirk [74]).

Completion of the repair of a DSB via HR requires separation of the product DNA molecules.
This occurs via structure specific endonucleases that cleave Holliday junctions, the DNA structures where the four strands of two duplex DNA molecules are crossed. This may yield either a crossover or a non-crossover event. The gaps are then filled in by a DNA polymerase and the breaks sealed by ligase (reviewed in [74, 81]).

1.3 ATM

1.3.1 Ataxia telangiectasia

Ataxia telangiectasia (AT) is a progressive neurodegenerative disease that occurs early in childhood [24]. It is an autosomal recessive disorder caused by mutations in a single gene, ataxia telangiectasia mutated (ATM) [56]. The characteristics of AT include loss of cerebellar function, often presenting as progressive speech defects and ataxic or uncoordinated movements [5, 24]. Another symptom of the disease is the presence of telangiectases, often ocular, i.e. dilated blood vessels of the eye. Other symptoms include immune defects and sterility, resulting from defects in early meiosis.

At the cellular level, AT is a chromosomal instability syndrome, and is characterized by a defective DNA damage response. The cells of AT patients are sensitive to ionizing radiation and other agents that induce double stranded breaks. These cells show an elevated number of chromosomal aberrations in response to such agents. They show decreased ability to activate the DNA damage response network and defective cell cycle regulation in response to DNA damage [35].

AT patients are also predisposed to malignancies, including lymphomas and leukaemia [30].
Cancer occurs in approximately 30-40% of AT individuals and 10-15% of AT patients develop lymphoid malignancies, often involving rearrangements at T-cell receptor loci [30, 44]. In addition, somatic mutations in ATM have been seen in some sporadic cancers, including leukaemia [9, 30].

Mouse models that have both copies of ATM knocked out show a high incidence of lymphoid tumours, which frequently contain translocations at T-cell receptor loci. This indicates that the DSBs produced during V(D)J recombination might be responsible for these chromosomal aberrations [3, 82].

Heterozygous carriers of ATM mutations may have an elevated risk of cancer, largely of the breast, although there are conflicting data in the literature (see [11, 76] for example). These conflicting data may, in part, be explained by the different types of mutations that affect ATM. AT individuals usually have nonsense mutations in the ATM gene, which result in truncated, unusable forms of the protein. Thus, carriers of these mutations have decreased levels of functional ATM in their cells. Some missense mutations, however, result in cells that contain both functional and full-length non-functional versions of the ATM molecule.
The incidence of cancer in these cases may be higher because of a dominant negative effect exerted by the inactive protein, which may result in even lower levels of functional ATM than in the case of nonsense mutations [26, 60]. Although the role of ATM in lymphoma has been established in the context of AT, the frequency of ATM germline variants in sporadic lymphoma patients has not been determined. Our hypothesis is that inherited variation in ATM is at least partly responsible for susceptibility to NHL at the population level.

1.3.2 Functions of ATM

The product of the ataxia telangiectasia mutated (ATM) gene is a member of the phosphatidylinositol 3-OH kinase-like kinase (PIKK) family. The ATM protein resides predominantly in the nucleus of dividing cells. It is a serine-threonine protein kinase and plays a central role in both the detection of and response to DNA damage, as well as cellular recovery and survival. This involves the mediation of cell cycle checkpoints, apoptosis and DNA repair. The PIKK family of proteins also includes ATM and Rad-3-related (ATR) and DNA-PK, which along with ATM, are believed to act in large complexes that monitor the genome for DNA damage. Upon encountering such damage, they signal to other proteins and coordinate the cellular response. One such damage sensing complex is BASC, which includes ATM, BRCA1 and the MRN complex, in addition to others.
ATM is generally thought to be important for sensing damage caused by ionizing radiation and that induced by radiomimetic drugs.

Cell cycle checkpoints

Cell cycle checkpoints prevent the cell from proceeding through the cell cycle in the presence of un-repaired DSBs that could potentially be dangerous. The purpose of these checkpoints is to maintain genomic integrity and they are therefore involved in all stages of the cell cycle. ATM regulates the G1/S, intra-S and G2/M checkpoints of the cell cycle. Thus, it inhibits the cell cycle through the activation of cell-cycle checkpoints in response to DNA damage, which may allow time to repair such lesions, but may also lead to permanent cell cycle arrest.

For instance, ATM-dependent phosphorylation of p53 is required for the G1 checkpoint of the cell cycle. ATM also phosphorylates CHK2, which in turn phosphorylates p53. This phosphorylation of p53 stabilizes it, enabling the activation of a number of p53-responsive proteins, which result in cell cycle arrest in G1/S or, alternatively, apoptosis [62]. ATM phosphorylates CHK2 in response to irradiation during the G2 phase of the cell cycle. CHK2, in turn, affects a multitude of downstream targets and consequently prevents the G2/M transition. These are just examples of the complexity of the control that ATM exerts over the cell cycle checkpoints.
Significant crosstalk and backup mechanisms exist between ATM and ATR in the G1 and G2 checkpoint control [62, 74]. Proteins such as CHK2, BRCA1 and NBS1, which are also substrates of ATM, are examples of proteins that are important for the control of the intra-S phase checkpoint. Cells derived from AT individuals have defective cell-cycle checkpoints [51, 62].

Double stranded break repair

ATM is particularly important for the detection and repair of double stranded DNA breaks. In undamaged cells, the inactive ATM protein exists in the form of dimers or higher-order multimers. The protein phosphatase 2A is thought to be involved in maintaining the protein in multimeric form. In response to ionizing radiation and DSBs in the DNA, ATM undergoes autophosphorylation at Ser-1981. This disrupts the interaction between the protein molecules and protein phosphatase 2A, resulting in the formation of active monomers that subsequently initiate a signaling cascade that phosphorylates several downstream substrates [27, 40, 62]. When activated, ATM can directly associate with the MRN complex, so called because of its three principal component proteins: MRE11, RAD50 and NBS1 [73]. There is recent evidence for independent functions of the MRN complex as compared to a complex formed of just MRE11 and RAD50 (MR). The MR complex activates ATM, which in turn activates p53.
In response to the MRN complex, however, ATM activates the checkpoint kinase CHK2. This interaction can, therefore, control signaling by affecting this ATM substrate [36]. Other key substrates of ATM include histone H2AX, mediator of damage checkpoint 1 (MDC1), p53-binding protein 1 (53BP1) and breast-cancer associated 1 (BRCA1). After DNA damage, these factors are recruited to the site of DSBs and initiate an ATM-dependent signaling cascade that leads to the resolution of the break through DNA repair, or, in the case of excessive DNA damage, cell death, often through p53-mediated apoptosis [44, 62]. In response to high doses of ionizing radiation, ATM is cleaved by a protease, generating a kinase-inactive molecule that retains its DNA binding activity. This cleaved product could act as an inhibitor of DNA damage signaling and repair and thus direct the cell toward apoptosis instead of survival [74].

Damage independent genomic stability

Telomeres are nucleoprotein complexes that protect chromosome ends from being identified and processed as DSBs. Mammalian telomeric DNA consists of (TTAGGG)n repeats. Incomplete end replication by DNA polymerases results in the progressive shortening of telomeres with each cell division. Once telomeres become critically short, these ends elicit cell-cycle arrest, senescence, or apoptosis, thereby limiting the replicative life span of cells.
Chromosome end-to-end fusions can also occur, leading to genomic instability [43]. Telomere length can be stabilized or increased by telomerase, an RNA-dependent DNA polymerase that contains the telomerase reverse transcriptase holoenzyme (TERT) and an essential RNA template (TERC).

In addition to its role in DNA DSB repair, ATM is functionally linked to the maintenance of telomere length and integrity. This is a process that is critical to aging and cancer. AT cells show defective telomere maintenance, resulting in accelerated telomere shortening, the formation of chromosomal end-to-end fusions and reduced life span [45]. In addition, mice lacking both ATM and Terc display telomere shortening, increased genomic instability, rapid aging and premature death [80]. These findings strongly suggest a role for ATM in telomere maintenance although the precise mechanism for this has not been elucidated.

1.3.3 Genomic structure of ATM

ATM is a large gene, mapped to chromosome 11q22-23 [25]; the genomic DNA spans 150kb and contains 62 coding exons resulting in an mRNA of approximately 13kb. The 9.2kb open reading frame encodes a 370kDa protein with 3056 amino acids. The ATM transcript exhibits a variety of 5' untranslated regions (UTRs) formed by alternative splicing and a single exon spanning 3.5kb constitutes the 3' UTR [57]. The structure of ATM is illustrated in Figure 1.2.
The ATM gene shares a bidirectional promoter with E14/NPAT, a house-keeping gene that is expressed in all tissues. The 5' ends of the 2 genes are within 700bp of one another [14].

1.4 Association studies

Cancer is a complex disease that is the result of complex interactions between many genes and environmental factors. Its inheritance is likely to be, in a vast majority of cases, polygenic, i.e. resulting from the combined effects of many genes, each of which could have a modest individual effect [53]. Family-based linkage studies have identified disease genes with rare, highly penetrant alleles. Here, most carriers of a particular allele display the phenotype. These studies, however, lack the ability to detect alleles that confer moderate risks, which are likely to be the norm in common, multi-factorial diseases such as most cancers.

Association studies are the main alternative to family-based studies. Here, the frequency of a genetic variant in individuals with disease (cases) is compared to that in individuals without the disease (controls). Allelic association is present when the genotype frequency is significantly different in the cases and controls [16, 53]. Most association studies to date have been based on candidate genes that are selected using existing knowledge of their function and putative role in disease.
Examples of such genes are those involved in apoptosis, cell-cycle control, carcinogen metabolism, DNA repair, or those known to be somatically altered in cancer.

Figure 1.2: The genomic structure of the ataxia telangiectasia mutated (ATM) gene is illustrated in this screen capture from the UCSC genome browser [13]. It consists of 62 coding exons, two non-coding exons in the 5' UTR and one non-coding exon (3.5kb) in the 3' UTR.

Before the completion of the human genome project, association studies were limited by the small number of genes with known polymorphisms. In recent years, the identification of a large number of single-nucleotide polymorphisms (SNPs) across the genome has greatly increased the scope of association studies. This, in combination with the development of cheaper, high-throughput genotyping methods, has led to an explosion in the number of association studies being carried out [53]. This has also allowed the study of specific candidate polymorphisms to be replaced by more comprehensive examination of candidate genes to try to determine if any allelic variants of that gene are associated with the disease.
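As an illustration of the case/control comparison described in this section, the following minimal sketch tests whether allele counts differ between cases and controls with a Pearson chi-square test on a 2x2 table, and reports the corresponding odds ratio. The function name and the counts are hypothetical and are not data from this study; the sketch also assumes that every cell of the table is non-zero.

```python
import math


def allelic_association(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square (1 degree of freedom), p-value and odds ratio
    for a 2x2 table of allele counts (hypothetical helper)."""
    table = [[case_minor, case_major], [ctrl_minor, ctrl_major]]
    n = case_minor + case_major + ctrl_minor + ctrl_major
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    # Odds of carrying the minor allele in cases relative to controls.
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    return chi2, p_value, odds_ratio


# Made-up allele counts, for illustration only.
chi2, p_value, odds_ratio = allelic_association(30, 70, 20, 80)
print(f"chi2={chi2:.3f}, p={p_value:.3f}, OR={odds_ratio:.3f}")
```

A test on raw allele counts is only one of several possible association tests; genotype-based tests and regression models that adjust for covariates follow the same logic of comparing frequencies between the two groups.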
1.4.1 Linkage disequilibrium and tagSNPs

Fortunately, the alleles of SNPs that are physically close to each other in the genome often tend to be correlated with each other. This phenomenon is called linkage disequilibrium (LD). It is formally defined as the co-occurrence of particular alleles at nearby sites, on the same haplotype, more often than is expected by chance. When a new variant occurs on a chromosome, each generation allows an opportunity for recombination to occur. In the absence of recombination, the new variant will be transmitted along with the particular haplotype on which it occurred. After many generations of recombination, this variant will be separated from some parts of its ancestral haplotype, but blocks of linkage disequilibrium still persist, separated by "recombination hotspots". Thus, the variant will remain in association with marker alleles at nearby loci [53]. This phenomenon is illustrated in Figure 1.3.

The genetic variation across a region can therefore be captured by a limited number of tagging SNPs. The ability of one SNP to reflect or act as a proxy for the genotype of the other depends on the strength of LD between them. Two pairwise measures of LD are D' and r². Both measures range from 0 (no LD) to 1 ('complete' LD).
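The two pairwise measures can be made concrete with a short sketch that computes them from the frequencies of the four possible haplotypes at two biallelic loci. The formulae used are the standard textbook definitions (D = p_AB - p_A p_B, D' = |D| / D_max, r² = D² / (p_A(1 - p_A) p_B(1 - p_B))), which is an assumption here because the text does not state them explicitly; the function name and the example frequencies are hypothetical.

```python
def ld_measures(p_AB, p_Ab, p_aB, p_ab):
    """Return (D', r^2) for two biallelic loci, given the frequencies of
    the four haplotypes AB, Ab, aB and ab (hypothetical helper)."""
    total = p_AB + p_Ab + p_aB + p_ab
    # Normalise so that counts or frequencies can both be passed in.
    p_AB, p_Ab, p_aB, p_ab = (x / total for x in (p_AB, p_Ab, p_aB, p_ab))

    p_A = p_AB + p_Ab  # frequency of allele A at the first locus
    p_B = p_AB + p_aB  # frequency of allele B at the second locus
    if not (0.0 < p_A < 1.0 and 0.0 < p_B < 1.0):
        raise ValueError("both loci must be polymorphic")

    d = p_AB - p_A * p_B
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))

    d_prime = abs(d) / d_max
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d_prime, r2


# Three of four haplotypes observed: D' is 1 but r^2 is low.
print(ld_measures(0.1, 0.0, 0.4, 0.5))
# Only two haplotypes observed: both measures are 1.
print(ld_measures(0.3, 0.0, 0.0, 0.7))
```

The first call shows why D' alone can be misleading when choosing proxies: the loci are in 'complete LD' by D', yet their allele frequencies differ so much that r² is small.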
D' is a pairwise measure of LD defined such that the value of D' is 1 if the number of haplotypes observed is less than the theoretical number of haplotypes. This is often described as 'complete LD' between the loci. For any two biallelic loci the theoretical number of haplotypes is 4. If three or fewer haplotypes are observed, these loci would be in 'complete LD' [53]. One disadvantage of D' is that it does not take into account the frequency of the variants. Thus, two loci may be in 'complete LD' but may occur at substantially different frequencies. Under these circumstances, neither SNP will serve as a surrogate marker for the other.

Figure 1.3: Linkage disequilibrium (LD) provides the genetic basis for most association studies. This figure shows a) Two copies of an ancestral chromosome. b) After a few generations of recombination, chromosomes contain blocks of LD where recombination has not occurred, separated by "recombination hotspots." Arrows represent areas where recombination has occurred. c) Even after many generations, this pattern of random recombination is not usually observed, and blocks of LD persist. Here, a functional mutation, "*" in the figure, occurs on a specific haplotype. Over many generations of recombination, it remains in association with nearby loci.

r² is a pairwise measure of LD that takes into account both recombination and allele frequencies.
It represents the statistical correlation between two sites, and takes the value of 1 if only 2 of the 4 possible haplotypes are present [53]. Thus, an r² value of 1 would indicate that knowing the allele at one locus would allow the inference of the allele at the other locus. The most efficient set of markers for an association study can therefore be selected based on the pairwise LD between loci and the haplotype diversity observed in the study population.

1.4.2 Strengths and weaknesses of association studies

One of the pitfalls of association studies is that many of the published reports in the literature have not been confirmed in subsequent studies. This may be due to either false positives in the initial report or false negatives in subsequent studies. Some of the errors commonly seen in association studies are discussed here.

Random error

The most common reason for false positives is random chance. The probability of a type I error is quantified in terms of the level of statistical significance. For instance, a significance threshold of p < 0.05 means that about one in 20 tests of a variant with no true effect will yield a positive association by chance. The multiple testing implicit in a case/control study enhances the risk of association due to chance. Thus, while repeated analyses using different subgroups of a study population are a valid route to generate hypotheses, these hypotheses must subsequently be tested in additional patient populations.
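The effect of multiple testing on the risk of a chance association can be illustrated with a small calculation. The sketch below assumes that the tests are independent, which is a simplification (SNPs in LD with each other are correlated), and the function name is hypothetical.

```python
def family_wise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among n_tests
    independent tests, each carried out at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# How quickly the chance of at least one spurious association grows.
for n_tests in (1, 5, 20, 100):
    rate = family_wise_error_rate(0.05, n_tests)
    print(f"{n_tests:>3} tests at alpha = 0.05: {rate:.3f}")
```

Under this assumption, the chance of at least one false positive rises from 5% for a single test to well over half once 20 tests are performed, which is why subgroup findings are treated as hypotheses to be tested in further populations.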
It is important to realise that inappropriate correction for multiple testing inevitably leads to either increased false-positive results due to a weak correction, or decreased statistical power to detect effects owing to an overly stringent correction [16]. The adoption of more stringent significance levels (for example p < 10⁻⁴) has been recommended [6], in addition to novel ways of calculating significance. One group [77] introduced the idea of assigning a prior probability that the association between the genetic variant and the disease is real, which in addition to the p-value would form the "false-positive report probability". Other commonly used corrections include the Bonferroni correction and the False Discovery Rate (FDR) [6]. The power of all these methods decreases as the number of tests increases. However, the loss of power is less with the FDR method than when using the Bonferroni correction, which may help to avoid an overly conservative correction for type I errors [6].

Population stratification

Other sources of false positives include population stratification. If controls are not selected from the same population as the cases, a false positive may occur as a result of differences in allele frequencies between populations. This can, to some extent, be avoided by matching controls to the cases on place of residence and ethnicity. Other methods, such as genomic control, have also been suggested as a means to avoid this particular source of error.
In this method, multiple polymorphisms throughout the genome are tested to estimate the effect of confounding [17].

Small sample size

False negatives may occur due to a lack of statistical power. It is important that case/control studies are sufficiently large to detect associations of realistic size [53]. Other factors that are likely to contribute to the inconsistency of association studies include over-interpretation of marginal findings in small sample sizes and the publication bias toward positive results. Multiple testing, multi-locus association, background LD levels and large-scale study design are other factors that may affect the design of a case/control based association study.

In this study, we determined the extent of sequence variation in the promoter, coding and untranslated regions of ATM in 86 NHL patients. We then used a subset of the SNPs thus identified, and multi-SNP haplotypes, in genetic association tests involving 798 cases and 793 controls to determine if germline variants in this gene confer an elevated risk of NHL.

2 Materials and Methods

This study was approved by the joint Clinical Research and Ethics Board of the British Columbia Cancer Agency and the University of British Columbia (See Appendix C). All subjects gave written informed consent.

2.1 Cases and controls

The samples were collected as part of a study by Drs. John Spinelli and Rick Gallagher.
All NHL cases aged 20-79 diagnosed in British Columbia during the period between March 2000 and February 2004 and residing in the greater Vancouver or greater Victoria metropolitan areas were ascertained from the BC Cancer Registry and invited to participate. HIV positive cases and those who were unable to give informed consent were excluded. Controls were obtained from the Client Registry of the B.C. Ministry of Health and were frequency matched to cases by age, sex and residence within the same areas. Detailed self-reported information on the ethnicity of each of the four grandparents for each individual was collected. Subjects provided a blood, saliva or mouthwash sample. The characteristics of all cases and controls are listed in Table 2.2. Table 2.1 lists the subtypes of NHL and the abbreviations used here.

Table 2.1: The subtypes of NHL present in the case group and the abbreviations used for them.

NHL subtype                                        Abbreviation
B cell NHL
  Diffuse large B cell lymphoma                    DLBCL
  Follicular small cleaved                         FSCL
  Follicular mixed and Follicular large cell       FM and FL
  Marginal zone lymphoma and low grade B cell
  of mucosa associated lymphoid tissue             MZL and MALT
  Mantle cell lymphoma                             MCL
  Small lymphocytic lymphoma                       SLL
  Lymphoplasmacytic lymphoma                       LPL
  Miscellaneous B cell lymphomas                   Misc BCL
T cell NHL
  Mycosis fungoides                                MF
  Peripheral T cell lymphoma                       PTCL
  Miscellaneous T cell lymphomas                   Misc TCL
2.2 DNA extraction

Subjects' peripheral blood samples were collected in 4 tubes, 2 with ethylene diamine tetra-acetic acid (EDTA) as an anticoagulant and 2 with acid-citrate-dextrose (ACD). The ACD tubes were used to isolate lymphocytes using Ficoll-Hypaque. These cells were then frozen. The buffy coat layers of whole blood in EDTA tubes were treated with Puregene RBC lysis solution (Gentra Systems, MN, USA) and cell wash. Genomic DNA was extracted from samples using the Puregene DNA isolation kit according to the manufacturer's instructions (Gentra Systems, MN, USA). DNA was extracted by other members of the laboratory, including Stephen Leach, Rozmin Janoo-Gilani, Johanna Schinas, Jennifer Roger and Jennifer Jeyes.

Table 2.2: Characteristics of the case and control samples. See Table 2.1 for a list of abbreviations.

                      Cases   Controls   Total
Male                    464        423     887
Female                  334        370     704
Total                   798        793    1591

Age range (years)
  20-49                 150        208     358
  50-59                 193        169     362
  60-69                 215        206     421
  70+                   240        210     450

Ethnicity
  Caucasian             626        613    1239
  Asian                  80         90     170
  South Asian            29         37      66
  Mixed/Other            36         34      70
  Unknown                27         19      46

NHL subtype
  B cell NHL            722
    DLBCL               193
    FSCL                138
    FM/FL                78
    MZL/MALT             93
    MCL                  47
    SLL/CLL              42
    LPL                  42
    MISC BCL             89
  T cell NHL             75
    MF                   39
    PTCL                 27
    MISC TCL              9
  MISC                    1
2.3 Variant detection sequencing

All 62 coding exons (9170 bp) of ATM, 11095 bp of intron sequence adjacent to coding exons, 1741 bp of 5' sequence, and 3663 bp of 3' untranslated region (UTR) were PCR amplified with a total of 78 primer pairs. The sequences of all primers used in this study, and their annealing temperatures, are shown in Table A.1. Exons were numbered according to established convention [73]. Coding exons were amplified using primers designed in the intronic sequences near the exon boundaries to allow re-sequencing across splice sites. The 3' UTR was amplified in overlapping segments following confirmation of the size of this region.

Primers were selected using the program Primer3 [55] from the ATM genomic sequence (accession number BC061584), retrieved from the UCSC genome browser [13]. Forward and reverse primers incorporated the -21M13F (TGTAAAACGACGGCCAGT) or M13R (CAGGAAACAGCTATGAC) extensions, respectively, at their 5' ends. Polymerase chain reactions (PCR) were carried out in a volume of 20 µl containing 10 ng genomic DNA template, 1 mM MgSO4, 0.5 µM of each PCR primer, 2 mM dNTPs, 1x Pfx amplification buffer and 0.25 U Platinum Pfx DNA polymerase (Invitrogen, ON, Canada). Thirty cycles of 30s at 94°C, 30s at a primer pair specific annealing temperature of 50-65°C and 1 min at 68°C were performed in programmable thermocyclers (MJ Research, MA, USA).
A 5 µl aliquot of each PCR reaction was run on a 2% agarose gel to confirm the size of the PCR product. The remaining 15 µl of PCR product was purified using AMPure magnetic beads (Agencourt Bioscience, MA, USA) and eluted in a volume of 30 µl of TE (Tris-EDTA, pH 8.0) according to the manufacturer's instructions. An aliquot of purified PCR product was then cycle sequenced using BigDye Terminator Mix V3.1 at 1/24th chemistry in 4 µl reactions (Applied Biosystems, CA, USA). Both forward (-21M13F primer) and reverse (M13R primer) directions were sequenced. Cycle sequencing reactions consisted of 50 cycles of 10s at 96°C, 5s at 52°C (-21M13F forward primer) or 43°C (M13R reverse primer), and 3 min at 60°C. Reaction products were precipitated with isopropyl alcohol and resuspended in 10 µl of double distilled water before loading on ABI 3700 or ABI 3730xl capillary sequencers.

Sequence reads were base-called using Phred [18] and assembled with reference sequences using Phrap [19]. Sequence reads were assembled into Consed by Diana Palmquist at the Genome Sciences Centre. Contigs of sequence traces corresponding to each exon were examined using PolyPhred [49] for detection of heterozygotes and visualised in Consed [28] to facilitate verification of sequence variants by examination of individual traces.
2.4 Prediction of haplotypes and choice of tagSNPs

To estimate haplotypes from the sequence data, I used PHASE v2.0 [66, 67]. Linkage disequilibrium across the region was determined using Haploview [4]. Haplotype tagging SNPs or tagSNPs were selected using 4 publicly available programs: TagSNPs [68], SNPtagger [34], BEST (Best Enumeration of SNP Tags) [61] and Tag'n'Tell [12].

2.5 Genotyping

TaqMan allelic discrimination assays were designed using Assays-by-Design (Applied Biosystems, CA, USA). The sequences of the primers and probes are shown in Table A.2. Genotyping reactions were carried out in 5 µl volumes on 384-well plates. Each well contained 10 ng of dried down genomic DNA. The reaction contained 2.5 µl of TaqMan Universal PCR Master Mix, 0.125 µl of 40x Assay Mix and 2.375 µl of distilled water. TaqMan allelic discrimination assays [39] were performed using an ABI 7900HT. This involved thermocycling for 10 min at 95°C, followed by 40 cycles of 15s at 92°C to denature and 1 min at 60°C to anneal and extend.

2.6 Statistical analysis

Statistical analyses were carried out by an epidemiologist, Amy MacArthur, under the direction of Dr. John Spinelli, in collaboration with us. As a first step, the genotyping data was subjected to tests for Hardy-Weinberg equilibrium.
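A test for Hardy-Weinberg equilibrium of this kind can be sketched as a chi-square goodness-of-fit test that compares the observed genotype counts with those expected from the allele frequencies. This is a generic illustration and an assumption about the form of the test, not the procedure actually run for this study; the function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square goodness-of-fit test (1 degree of freedom) of observed
    genotype counts against Hardy-Weinberg expectations."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele a
    if not 0.0 < p < 1.0:
        raise ValueError("the variant must be polymorphic in the sample")

    observed = [n_AA, n_Aa, n_aa]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up genotype counts, for illustration only.
print(hwe_chi_square(25, 50, 25))  # counts that match the expectation exactly
print(hwe_chi_square(50, 0, 50))   # no heterozygotes: a strong departure
```

The chi-square approximation becomes unreliable when an expected genotype count is very small, which is one reason such a test is more suitable for common variants than for rare ones.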
The genotype frequencies for each common variant were tested in the control samples to determine whether they differed from the expected frequencies (based on the minor allele frequency). Rare variants were excluded from this analysis.

The primary association analysis used univariate followed by multivariate logistic regression models to estimate the odds ratios for development of NHL for each of the common SNPs. The 6 rare variants were tested as one group. Logistic regression analysis was carried out using the SPSS package [65]. Multivariate analyses were adjusted for age (20-49, 50-59, 60-69 and 70+ year groups), sex, place of residence (Vancouver, Victoria) and ethnicity (Caucasian, Asian, South Asian, Mixed/Other/Unknown). When the number of cases was insufficient (n < 5), the rare homozygotes were combined with the heterozygotes for analysis. Tests for trend were performed when sufficient numbers of rare homozygotes were present.

We used PHASE v2.0 [66, 67] and Haplo.stats (part of the R statistical system [59]) to deduce haplotypes probabilistically from genotype data. PHASE uses a Bayesian method and was used to estimate haplotypes for the selection of tagSNPs. Haplo.stats estimates haplotypes with the expectation-maximization (EM) algorithm. Haplotypes were analyzed as categorical variables, as if each haplotype were a specific allele of a multi-allele marker.
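For a single variant with no covariates, the odds ratio from a logistic regression model reduces to a 2×2 table calculation. The sketch below illustrates that unadjusted case with a Wald confidence interval; it is not the adjusted multivariate model used in the thesis, and the function name `odds_ratio_ci` is hypothetical. The counts are the -5144 A/T heterozygote and T/T counts for all NHL cases and controls from Table 3.6.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_exposed, case_ref, control_exposed, control_ref, z=1.96):
    """Unadjusted odds ratio with a Wald (log-scale) 95% confidence interval."""
    odds_ratio = (case_exposed * control_ref) / (case_ref * control_exposed)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = sqrt(
        1 / case_exposed + 1 / case_ref + 1 / control_exposed + 1 / control_ref
    )
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# -5144 A/T heterozygotes versus the T/T reference group (counts from Table 3.6).
or_het, lo_het, hi_het = odds_ratio_ci(
    case_exposed=391, case_ref=229, control_exposed=365, control_ref=237
)
print(f"OR = {or_het:.2f} (95% CI {lo_het:.2f}-{hi_het:.2f})")
```

Because this estimate is not adjusted for age, sex, region or ethnicity, it is expected to be close to, but not identical to, the adjusted odds ratio reported for the same genotype.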
To correct for multiple testing, the False Discovery Rate (FDR) method proposed by Benjamini and Hochberg [6] was used. This compares the p-values from the tests for trend to a corrected range of significance values.

3 Results and Discussion

3.1 Variant detection sequencing

For variant detection, I used 86 case samples to represent individuals with different NHL subtypes and ethnicities. Sporadic lymphomas of B-cell origin occur more frequently than those of T-cell origin. This, however, is in contrast with AT, where T-cell malignancies are more frequent [71]. Thus, the variant detection group was enriched for T-cell cases. The 86 samples consisted of 30 cases with T-cell NHL (all the cases available at the time) and 56 cases who had B-cell tumours. For the latter, the youngest individuals were selected since their disease may be more likely to have a genetic rather than environmental basis. The samples used for variant detection sequencing are listed in Table 3.1. Control samples were not included in variant discovery. This selection process is likely to identify common variants, or those occurring at a low frequency that are strongly associated with NHL.

Table 3.1: Characteristics of the variant detection sequencing set. See Table 2.1 for a list of abbreviations.

| | T-cell | B-cell | Total |
|---|---|---|---|
| Male | 19 | 31 | 50 |
| Female | 11 | 25 | 36 |
| Total | 30 | 56 | 86 |
| Age range (years) | | | |
| 20-49 | 10 | 56 | 66 |
| 50-59 | 8 | 0 | 8 |
| 60-69 | 5 | 0 | 5 |
| 70+ | 7 | 0 | 7 |
| Ethnicity | | | |
| Caucasian | 20 | 35 | 55 |
| Asian | 4 | 8 | 12 |
| South Asian | 3 | 2 | 5 |
| Mixed/Other | 3 | 10 | 13 |
| Unknown | 0 | 1 | 1 |

| NHL subtype | Number of cases |
|---|---|
| All B cell NHL | 56 |
| DLBCL | 18 |
| FSCL | 16 |
| FM/FL | 7 |
| MZL/MALT | 4 |
| MCL | 3 |
| SLL/CLL | 1 |
| MISC BCL | 7 |
| All T cell NHL | 30 |
| MF | 18 |
| PTCL | 8 |
| MISC TCL | 4 |

There are conflicting data in the literature about the length of the 3'UTR of ATM [57]. The longest reported 3'UTR spans 3.5 kb. To verify this, I used random primed human cDNA previously generated by Dr. Karen Novik from a normal blood sample. I used 5 primer pairs designed to span the 3.5 kb region thought to constitute the 3'UTR. As controls to check for genomic contamination, I also used one pair of primers designed across exons, and one pair designed within introns. I then looked for PCR product using each of these 7 primer pairs in both cDNA and genomic DNA. The results from this experiment are shown in Figure 3.1 and confirm that the 3'UTR is at least 3.5 kb long.

The overall average success rate for good quality sequence reads in the variant detection sequencing was 97.87%. Good quality sequences were defined as sequences that could be unambiguously scored for variants across the region of interest using Consed. The success rate was calculated per locus and ranged between 88.37% (for an amplicon with a mono-nucleotide repeat preventing good reads in one direction) and 100%.
This was calculated as the number of samples for which there were good quality sequence reads over the region of interest in at least one direction. Eleven amplicons had mono-nucleotide stretches that prevented good quality sequence reads in one direction. Five amplicons contained small insertions or deletions that occurred in more than 5% of individuals, causing the sequence reads to be superimposed, even though they were of good quality. Good quality reads were obtained in both forward and reverse directions for most sample-amplicon combinations.

Variant discovery sequencing revealed 79 variants. Forty-five of these (57%) were transitions, 25 (31.6%) were transversions, and 9 (11.4%) were small insertions or deletions. Fifty-two of the 79 variants were non-coding and the other 27 were in the coding region. Of the coding changes, 18 result in non-synonymous changes. One of these results in a premature stop codon and five were classified by PolyPhen [69, 70], a program for prediction of the severity of amino acid substitutions, as "possibly or probably damaging"; these variants were all rare (minor allele frequency = 0.54-1.1%). Seven of the 86 (8.1%) NHL patients were heterozygous at these loci. The results of the variant detection phase are summarized in Table 3.2 and Figure 3.2. A visual representation of the genotypes is presented in Figure 3.3. Of the 79 variants listed in Table 3.2, 49 (61.25%) were observed only once. The frequency of singletons, variants observed only once in the data set, is similar to that previously identified in ATM in samples from unaffected individuals [72]. Eleven variants had a minor allele frequency of 5% or greater.

Figure 3.1: Verification of the length of the 3'UTR of ATM. (i) This schematic shows the positions of the primer pairs used. Primer pairs 6 and 7 are within the open reading frame (ORF) and are not shown here. The 3.5 kb thought to constitute the 3'UTR of ATM has 2 poly-A stretches and two Alu insertions, which are illustrated in this figure. (ii) In this agarose gel, the panel on the left contains PCR products from cDNA. Lanes a contain cDNA, and lanes b are negative controls. The panel on the right contains positive controls, i.e. PCR products from genomic DNA. In both panels, lanes 1 through 5 are PCR products generated using primer pairs 1 through 5 as shown above. Lane 6 contains PCR product generated using primers designed across the exons, to confirm that the product is from cDNA, while lane 7 contains PCR product from primers designed in the introns, to rule out genomic contamination.

Table 3.2: Variants identified by variant detection sequencing of ATM in the germline DNA of 86 non-Hodgkin lymphoma patients. Variants indicated by an * or ** indicate the deletion or insertion of 1 or 2 nucleotides, respectively. A superscript 1 indicates changes that were predicted to be deleterious to protein function. Variants that had a markedly different minor allele frequency in our study than in publicly available data are marked by a superscript 2.
| SNP Number | Name of Variant | Flanking sequence | Nucleotide change | Codon change | Amino Acid change | Obs minor allele freq |
|---|---|---|---|---|---|---|
| 1 | -5144 T/A ² | cctcct/atcccg | T/A | N/A | N/A | 45.60% |
| 2 | -4807 G/T | aagagg/tgtggg | G/T | N/A | N/A | 0.54% |
| 3 | -4519 A/G ² | tggcca/gcggga | A/G | N/A | N/A | 47.28% |
| 4 | -4406 C/T | ttctgc/tgctgg | C/T | N/A | N/A | 0.54% |
| 5 | -2541 C/T | cttcc/tgggaa | C/T | N/A | N/A | 0.54% |
| 6 | -2299 A/T | tcaaa/ttaaca | A/T | N/A | N/A | 3.23% |
| 7 | IVS4 (+36) del(AA) | gaaataa/**gtgtg | del(AA) | N/A | N/A | 41.48% |
| 8 | X5 (146) C/G | agattc/gcaaac | C/G | TCC/TGC | Ser 49 Cys | 0.54% |
| 9 | IVS6 (+48) C/T | actgtc/tgcgtg | C/T | N/A | N/A | 0.56% |
| 10 | X7 (378) T/A | atggat/aacagt | T/A | GAT/GAA | Asp 126 Glu | 1.10% |
| 11 | X8 (544) G/C | aagatg/cttcat | G/C | GTT/CTT | Val 182 Leu | 0.56% |
| 12 | X8 (657) T/C | cagtgt/cgcgag | T/C | TGT/TGC | Cys 219 Cys | 1.11% |
| 13 | X9 (735) C/T | gctgtc/taactt | C/T | GTC/GTT | Val 245 Val | 0.57% |
| 14 | IVS9 (+24) T/G | tgtttt/ggaatt | T/G | N/A | N/A | 0.57% |
| 15 | X11 (1176) C/G | ctaggc/gtggga | C/G | GGC/GGG | Gly 392 Gly | 0.56% |
| 16 | X11 (1229) T/C | tcttgt/cgcctt | T/C | GTG/GCG | Val 410 Ala | 0.56% |
| 17 | X12 (1541) G/A | tcaggg/atagtt | G/A | GGT/GAT | Gly 514 Asp ¹ | 0.53% |
| 18 | IVS14 (-55) T/G | acatat/gaaggc | T/G | N/A | N/A | 3.76% |
| 19 | X15 (1986) T/C | gacttt/cttaac | T/C | TTT/TTC | Phe 662 Phe | 0.54% |
| 20 | X15 (2119) T/C | actcat/cctgag | T/C | TCT/CCT | Ser 707 Pro | 1.61% |
| 21 | IVS15 (-67) T/C | tgttct/ctacaa | T/C | N/A | N/A | 3.23% |
| 22 | X16 (2127) T/C | gagatt/cacaaa | T/C | ATT/ATC | Ile 709 Ile | 0.54% |
| 23 | X16 (2220) A/G | gaagca/gtataa | A/G | GCA/GCG | Ala 740 Ala | 0.54% |
| 24 | IVS16 (+77) G/A | tgtttg/agaaga | G/A | N/A | N/A | 1.63% |
| 25 | X19 (2635) A/G | gtacca/gtaggt | A/G | ATA/GTA | Ile 879 Val | 0.54% |
| 26 | IVS19 (-16) G/T | aatgag/ttgctt | G/T | N/A | N/A | 0.54% |
| 27 | IVS20 (+28) ins(A) | ctctt*/aggatt | ins(A) | N/A | N/A | 0.54% |
| 28 | IVS20 (-17) del(T) | tttttt/*ccctc | del(T) | N/A | N/A | 0.56% |
| 29 | IVS21 (-20) T/G | gaactt/gttttt | T/G | N/A | N/A | 0.56% |
| 30 | X23 (3118) A/G | taagaa/gtggcc | A/G | ATG/GTG | Met 1040 Val | 0.56% |
| 31 | X24 (3161) C/G | tgatcc/gttatt | C/G | CCT/CGT | Pro 1054 Arg ¹ | 1.10% |
| 32 | IVS24 (-8) del(T) | ttgctt/*gtttt | del(T) | N/A | N/A | 9.68% |
| 33 | IVS25 (-12) del(A) ² | tttaaa/*tttct | del(A) | N/A | N/A | 44.02% |
| 34 | X26 (3468) G/A | ctgacg/attgat | G/A | ACG/ACA | Thr 1156 Thr | 0.54% |
| 35 | IVS28 (+40) G/A | tgaatg/aatatg | G/A | N/A | N/A | 0.54% |
| 36 | X29 (4009) A/G | tattca/gttagt | A/G | ATT/GTT | Ile 1337 Val | 0.58% |
| 37 | X30 (4138) C/T | cacctc/tatttt | C/T | CAT/TAT | His 1380 Tyr | 0.55% |
| 38 | X31 (4258) C/T | ttcttc/tttgcc | C/T | CTT/TTT | Leu 1420 Phe | 1.08% |
| 39 | X31 (4424) A/G | tcacta/gtatca | A/G | TAT/TGT | Tyr 1475 Cys ¹ | 0.54% |
| 40 | X32 (4578) C/T | ataccc/tcttg | C/T | CCC/CCT | Pro 1526 Pro | 3.23% |
| 41 | IVS33 (-20) A/G | aaagcaa/ggttac | A/G | N/A | N/A | 1.63% |
| 42 | IVS35 (+42) C/G | aactgc/gggatc | C/G | N/A | N/A | 0.54% |
| 43 | IVS35 (+82) ins(A) | actgta*/atgttt | ins(A) | N/A | N/A | 0.54% |
| 44 | IVS38 (-15) G/C | gatttg/ctttgt | G/C | N/A | N/A | 0.54% |
| 45 | IVS38 (-8) T/C | ttgtat/cattct | T/C | N/A | N/A | 2.17% |
| 46 | X39 (5557) G/A | tccaag/aataca | G/A | GAT/AAT | Asp 1853 Asn | 10.33% |
| 47 | X39 (5630) T/C | acactt/cctcgc | T/C | TTC/TCC | Phe 1877 Ser | 0.54% |
| 48 | X40 (5697) C/A | cgatgc/atgttt | C/A | TGC/TGA | Cys 1899 stop ¹ | 0.54% |
| 49 | X40 (5753) G/C | gagaag/cacaaa | G/C | AGA/ACA | Arg 1918 Thr ¹ | 0.54% |
| 50 | X41 (5793) T/C | gatgct/cttctg | T/C | GCT/GCC | Ala 1931 Ala | 0.54% |
| 51 | IVS41 (+71) A/G | taaaga/gtttat | A/G | N/A | N/A | 1.22% |
| 52 | IVS45 (-54) T/C | acatgt/catatc | T/C | N/A | N/A | 1.09% |
| 53 | IVS46 (-36) del(TTCT) | acctcttct/****ttat | del(TTCT) | N/A | N/A | 0.54% |
| 54 | IVS48 (-69) ins(ATT) ² | ctttc***/attattat | ins(ATT) | N/A | N/A | 46.24% |
| 55 | X54 (7775) C/G | aagctc/gtcagc | C/G | TCT/TGT | Ser 2592 Cys ¹ | 0.54% |
| 56 | IVS54 (+30) G/A | cttttag/aaagtg | G/A | N/A | N/A | 0.54% |
| 57 | IVS62 (-55) T/C ² | agatat/cgttga | T/C | N/A | N/A | 39.36% |
| 58 | IVS63 (-43) A/T | gattaa/taatgt | A/T | N/A | N/A | 0.53% |
| 59 | (9200) C/G | tcattc/gagcct | C/G | N/A | N/A | 0.55% |
| 60 | (9443) G/A | aggccg/aaggtg | G/A | N/A | N/A | 0.54% |
| 61 | (9711) C/A | aaaaac/aagaaa | C/A | N/A | N/A | 0.54% |
| 62 | (9718) T/G | gaaact/gtattt | T/G | N/A | N/A | 45.16% |
| 63 | (9721) T/C | acttat/cttgga | T/C | N/A | N/A | 3.76% |
| 64 | (10684) C/T | ttgctc/ttgtca | C/T | N/A | N/A | 0.54% |
| 65 | (10774) T/C ² | cctcct/cgagta | T/C | N/A | N/A | 44.68% |
| 66 | (11052) C/G | gtgttc/gtgttg | C/G | N/A | N/A | 0.53% |
| 67 | (11250) C/T | attaac/taaatg | C/T | N/A | N/A | 0.53% |
| 68 | (11369) C/G | ttgatc/gtcctc | C/G | N/A | N/A | 0.53% |
| 69 | (11390) A/G | ccccta/gaaacc | A/G | N/A | N/A | 0.53% |
| 70 | (11394) C/T | taaaac/tcaatc | C/T | N/A | N/A | 0.53% |
| 71 | (11571) A/G | aggaaa/gtgcag | A/G | N/A | N/A | 0.53% |
| 72 | (11777) T/C | tattct/caatca | T/C | N/A | N/A | 0.56% |
| 73 | (11810) C/T | atttac/tataca | C/T | N/A | N/A | 0.56% |
| 74 | (11884) ins(T) | ttttt*/tgtaat | ins(T) | N/A | N/A | 1.11% |
| 75 | (12242) C/T | agtatc/ttaact | C/T | N/A | N/A | 1.61% |
| 76 | (12306) A/G | ggtcaa/gtgaaa | A/G | N/A | N/A | 1.08% |
| 77 | (12368) T/G | agttgt/ggtcca | T/G | N/A | N/A | 1.05% |
| 78 | (12563) T/G | ggacat/gcgtaa | T/G | N/A | N/A | 44.74% |
| 79 | (12583) T/C | tagtct/ctttaa | T/C | N/A | N/A | 1.05% |

3.2 Germline variation in ATM and comparison with literature

I identified 52 variant sites in 16499 base pairs sequenced in the non-coding regions of ATM (1 in 317) and 27 variant sites of 9170 sequenced sites in the coding region of ATM (1 in 340), which correspond to nucleotide diversity values of 3.34 × 10⁻⁴ in the non-coding regions and 6.45 × 10⁻⁵ in the coding regions of ATM. The nucleotide diversity is defined here as the average number of nucleotide differences or substitutions per site for a group of DNA sequences and is a measure of the degree of polymorphism within a population [46].

Other groups have recently determined the nucleotide variation in ATM in a variety of reference samples. Thorstenson et al. [72] studied 93 human samples using denaturing high-performance liquid chromatography. They identified 88 variant sites in human ATM, 17 of which were frequent (minor allele frequency >4.5%). A large number (12) of these common variants were also observed in our variant detection sequencing. They inferred haplotypes using these 17 sites and found high but incomplete linkage disequilibrium over more than 133 kb of the gene.
Their study population included individuals from seven major human populations. Thorstenson and colleagues calculated nucleotide diversity for coding and non-coding regions. I compared our nucleotide diversity values to those of Thorstenson et al. and found the ratio to be 1:1.92 in the coding regions and 1:1.41 in the non-coding regions. The higher diversity, in both coding and non-coding regions, found by Thorstenson et al. is, in large part, explained by the presence of African populations in their study.

Figure 3.2: ATM variant types. A summary of the ATM variants observed and their functional classes: 79 variants, of which 52 were non-coding (6 in the 5'UTR, 25 intronic and 21 in the 3'UTR) and 27 were coding (9 synonymous and 18 non-synonymous, 6 of which were deleterious).

Figure 3.3: Visual genotypes. Genotype data for the 79 polymorphic sites in ATM, obtained by re-sequencing of germline DNA for 86 individuals. Each variant is represented by a column, and each individual by a row. The variants are arranged in their genomic order. Blue squares represent the common homozygote, red squares represent heterozygotes and yellow squares the rare homozygote. Missing data are indicated by gray squares. The 7 tagSNPs are marked with black arrows. The one variant that failed assay design is marked with an asterisk. The 6 rare putatively deleterious variants that were genotyped are marked by red arrows. This figure was generated using VG2 [47, 48].
Thorstenson and colleagues found four-fold greater sequence diversity in African populations when compared to non-African populations. We did not study any individuals of African ethnicity, since they are uncommon in the population of British Columbia. The results for the Caucasian, Asian and South Asian groups are comparable where the number and frequency of variants could be compared.

Both studies found that the coding regions of ATM have much lower nucleotide diversity than the non-coding regions. Thorstenson and colleagues compared the sequence diversity in the non-coding regions of ATM to that of 14 other genes and found these to be comparable. This indicates that the mutation rate at the chromosomal region containing ATM is not lower than that of other genes, which suggests that the lower sequence diversity in the coding region of ATM is due to selective pressure to maintain the protein sequence.

Bonnen and colleagues [8] sequenced 29 non-coding regions of ATM in 5 individuals and identified 17 non-coding SNPs. They then genotyped 14 of those variants in 295 individuals and inferred 22 haplotypes. Only 6 of these haplotypes occurred at a frequency of 5% or greater. Like Thorstenson et al. [72], Bonnen et al. found high LD over 142 kb in all the populations studied, and perfect disequilibrium in the European American population.
Since Bonnen et al. identified only non-coding variants in different regions than the ones we resequenced, we are unable to compare our specific variants with those identified by them. Other groups [1, 38] have also observed high LD at the ATM locus. Our LD findings, therefore, are consistent with the data in the literature (see Section 3.4.2).

3.3 tagSNP selection for genotyping

During the variant detection sequencing of germline DNA from 86 NHL patients, 79 variants were identified. Of these 79 variants, only 11 had a minor allele frequency of greater than 5%. Since NHL, like most cancers, is a complex disease, it is likely that most genetic variation associated with the disease is weak or incompletely penetrant. Any one such variant is also unlikely to be the only cancer-associated variant in any particular group of patients. Thus, the increased risk due to any one such variant is expected to be low.

The minimum detectable odds ratios for the detection of genetic factors affecting NHL susceptibility depend on the number of cases and controls. With 800 cases and 800 controls, a minor allele frequency of 5% allows the detection of an odds ratio of 1.54 (given 80% power and a two-tailed significance of 5%). A summary of the minimum detectable odds ratios obtainable by our study is presented in Table 3.3.

The 11 variants that had a minor allele frequency of 5% or more were included in the estimation of haplotypes using PHASE 2.0 [66, 67].
These 11 variants spanned 146 kb of genomic sequence and were in high linkage disequilibrium (LD) with each other. Ten haplotypes were identified, as shown in Table 3.4. These haplotypes were used to determine haplotype tagging SNPs (tagSNPs). All the methods used agreed on the selection of a minimal haplotype tagging set consisting of 7 variants. The 7 SNPs selected in this way were -5144 A/T, -4519 G/A, IVS4 (+36) del(AA), IVS24 (-8) del(T), IVS25 (-12) ins(A), X39 (5557) G/A and IVS62 (-55) C/T. A combination of data from all of these sites allows each haplotype (consisting of 11 variants) to be uniquely identified.

In addition, 6 of the 68 rare variants identified were predicted to be deleterious to protein function and were genotyped in the case/control group. The effect on function was predicted using PolyPhen [54, 69, 70]. Predictions made by PolyPhen are based on sequence annotation, sequence prediction, multiple alignment or structure, depending on the information available.

Table 3.3: Study power. This table summarizes the minimum detectable odds ratio (MDOR) for this study of non-Hodgkin lymphoma, given 80% power and a two-tailed significance of 5%. MDORs were calculated by Dr. John Spinelli for sample sizes of 800 cases and 800 controls, with an equal number of controls frequency matched by age and gender.

| Allele frequency in controls | MDOR (800 cases and 800 controls) |
|---|---|
| 0.005 | 2.93 |
| 0.010 | 2.27 |
| 0.020 | 1.86 |
| 0.050 | 1.54 |
| 0.100 | 1.41 |
| 0.200 | 1.33 |
| 0.300 | 1.32 |
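The dependence of the minimum detectable odds ratio on allele frequency in Table 3.3 can be illustrated with a simple power calculation. The thesis does not state the method Dr. Spinelli used, so the sketch below is an assumption: it applies a normal-approximation two-proportion test to allele counts (two chromosomes per person) and searches for the smallest odds ratio that reaches 80% power at a two-tailed significance of 5%. The function names `detectable` and `min_detectable_or` are hypothetical, and the results should land near, but not exactly on, the tabulated values.

```python
from math import sqrt
from statistics import NormalDist


def detectable(odds_ratio, control_freq, n_cases, n_controls, alpha=0.05, power=0.80):
    """True if an allele-based two-proportion z-test reaches the target power.

    Assumption: normal approximation on allele counts; not the thesis's own method.
    """
    p0 = control_freq
    # Allele frequency in cases implied by the odds ratio.
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    m1, m0 = 2 * n_cases, 2 * n_controls  # chromosomes, not people
    p_bar = (m1 * p1 + m0 * p0) / (m1 + m0)
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / m1 + 1 / m0))
    se_alt = sqrt(p1 * (1 - p1) / m1 + p0 * (1 - p0) / m0)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return (p1 - p0) >= z_alpha * se_null + z_beta * se_alt


def min_detectable_or(control_freq, n_cases=800, n_controls=800):
    """Bisection search for the smallest odds ratio above 1 that is detectable."""
    low, high = 1.0, 50.0
    for _ in range(60):
        mid = (low + high) / 2
        if detectable(mid, control_freq, n_cases, n_controls):
            high = mid
        else:
            low = mid
    return high


# Allele frequencies taken from the first column of Table 3.3.
for freq in (0.005, 0.01, 0.02, 0.05, 0.10, 0.20, 0.30):
    print(f"{freq:.3f}  {min_detectable_or(freq):.2f}")
```

The qualitative pattern is the point of the sketch: rarer alleles require a much larger effect before an association becomes detectable with a fixed number of cases and controls.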
Each of the 6 variants selected was seen in at least one patient sample and was predicted by PolyPhen to be either "Possibly" or "Probably Damaging" to protein function. These variants are listed in Table 3.5. Of the 13 variants selected for genotyping, only one, IVS25 (-12) ins(A), failed in assay design. This was due to its location within a repetitive region.

3.4 Genotyping results

Twelve variants were genotyped in 798 cases and 793 controls. These included 6 common tagSNPs and 6 rare variants that were predicted to be deleterious to protein function.

Table 3.4: Haplotypes predicted using the 11 common variants of ATM. The minor allele frequencies (MAF) of the variants are listed under the variant name. The 10 estimated haplotypes are numbered H1-H10. The frequency at which each of these haplotypes was observed in the variant detection sequencing set is listed as the Observed Frequency (Obs Freq).

| Haplotype | Obs Freq | -5144 A/T | -4519 G/A | IVS4(+36) del(AA) | IVS24(-8) del(T) | IVS25(-12) ins(A) | X39(5557) G/A | IVS48(-69) del(ATT) | IVS62(-55) C/T | (9718) T/G | (10774) C/T | (12563) T/G |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MAF | | 0.44 | 0.46 | 0.42 | 0.10 | 0.42 | 0.10 | 0.45 | 0.44 | 0.44 | 0.44 | 0.44 |
| H1 | 0.420 | A | G | wt | wt | wt | G | wt | C | G | C | G |
| H2 | 0.020 | A | G | wt | wt | ins | G | wt | C | G | C | G |
| H3 | 0.005 | T | G | wt | wt | ins | G | wt | T | T | T | T |
| H4 | 0.006 | T | G | del | wt | ins | G | del | T | T | T | T |
| H5 | 0.020 | T | A | wt | wt | ins | G | del | T | T | T | T |
| H6 | 0.010 | T | A | wt | wt | ins | A | del | T | T | T | T |
| H7 | 0.006 | T | A | wt | del | ins | G | del | T | T | T | T |
| H8 | 0.090 | T | A | wt | del | ins | A | del | T | T | T | T |
| H9 | 0.006 | T | A | del | wt | wt | G | del | T | T | T | T |
| H10 | 0.410 | T | A | del | wt | ins | G | del | T | T | T | T |

Table 3.5: Putatively deleterious variants.
The six variants that were selected for genotyping based on their putative functional effect are listed here.

| | SNP name | Amino Acid Change | Frequency in sequencing set |
|---|---|---|---|
| 1 | X12 (1541) G/A | Gly514Asp | 0.53% |
| 2 | X24 (3161) C/G | Pro1054Arg | 1.10% |
| 3 | X31 (4424) A/G | Tyr1475Cys | 0.54% |
| 4 | X40 (5697) C/A | Cys1899Stop | 0.54% |
| 5 | X40 (5753) G/C | Arg1918Thr | 0.54% |
| 6 | X54 (7775) C/G | Ser2592Cys | 0.57% |

3.4.1 Quality control of genotyping data

Genotype calls were made using the default conditions of the SDS 2.2 software (Applied Biosystems, CA, USA) and a 95% quality value cut-off. An example of the output from the software is shown in Figure 3.4. If a sample cannot be cleanly grouped into one of the genotype groups, it is called "Undetermined", and we therefore do not have a genotype call for it. The overall no-call rate, the percentage of samples that had an "Undetermined" call, was 3.7% and was similar for cases and controls.

As a quality check, we compared the genotype and sequence calls at these 12 loci for the 86 individuals sequenced, and these agreed completely. We also checked Mendelian inheritance of each variant in five three-generation CEPH families (Centre d'Etude du Polymorphisme Humain collection, Coriell Cell Repositories, NJ, USA) that were genotyped as part of the quality control process. No discrepancies were found. In addition, for 235 individuals with a low yield of DNA from their first sample, a second sample (blood or saliva) was obtained and both were genotyped. Here, we found an average discrepancy rate of 1.9%.
This error rate is expected to be higher than that for the higher quality DNA samples. To confirm this, we repeated 1600 genotypes for the higher quality DNA samples and found only a 0.03% discrepancy rate. Extrapolating these average error rates to the entire genotyped set gives us an average error rate of 0.5% (based on the proportion of low and high quality DNA samples).

Figure 3.4: Example TaqMan plot generated by the SDS 2.2 software during the analysis of the genotyping assays. The fluorescence of one dye (VIC) is plotted on the X-axis and that of the other dye (FAM) on the Y-axis. Each dye is attached to the probe for one allele of the variant. Thus, each sample is grouped into one of three clusters representing the two homozygotes and the heterozygotes, depending on whether one or both of the dyes are detected for that sample. (Plot labels: Homozygote 1, Heterozygote, Homozygote 2, No call, Negative control.)

3.4.2 Linkage disequilibrium

I used the program Haploview [4] to determine the extent of linkage disequilibrium (LD) at the ATM locus. The LD was calculated using not only the initial variant detection sequencing data (we used the 11 common variants), but also the genotyping data for the 6 common variants. The 11 common variants identified by sequencing span 146 kb. The 6 common genotyped variants span 132 kb. These 6 variants were tagSNPs and selected to be a maximally informative set.
Thus, they were not expected to have high r² values with respect to one another, since any variant that could serve as a proxy for another was deliberately excluded from the genotyping set.

The r² values from the LD calculations are summarized in Figure 3.5. Pairwise D' values are not shown, but ranged between 0.85 and 1 for the 11 common variants from sequencing. The pairwise D' values between the 6 common genotyped variants ranged between 0.98 and 1 (data not shown). Similarly high levels of LD have previously been shown by other groups [1, 8, 38, 72]. Overall, the genomic region at the ATM locus shows high but incomplete LD, which confirms the need for variant detection sequencing. The high LD validates the use of tagSNPs as markers for haplotypes, while the fact that it is incomplete indicates that relying completely on previously identified markers may not have provided the depth that we require for this investigation.

Figure 3.5: Linkage disequilibrium plots showing r² values. These were generated using a) sequence-derived genotype data for 11 common variants in 86 NHL individuals and b) genotype data from 798 NHL cases and 793 controls for 6 tagSNPs. The variants are listed in order of genomic position. Blocks with no value indicate an r² of 1. The ATM gene shows high but incomplete linkage disequilibrium. (These figures were generated using Haploview [4].)

The genotype frequencies observed in control samples for all the common genotyped SNPs were very close to those expected by Hardy-Weinberg equilibrium (HWE) (data not shown). HWE was not tested for the rare variants.

Genotype data were also used to estimate haplotypes using PHASE v2.0. Only the 6 common variants were used here. These haplotypes and the corresponding frequencies were comparable to those estimated using the sequence data (data not shown).

3.5 Association tests with common variants

3.5.1 Overall and subtype analyses

Multivariate logistic regression was used to calculate the odds ratios for each of the 6 common genotyped variants. This was adjusted for age, gender, ethnicity and region of residence. This was done using the Haplo.stats package of the R statistical system [59]. When there were fewer than 5 individuals in the rare homozygote category, this group was combined with the heterozygotes for analysis. For each of these variants, we compared all NHL cases, all B-cell NHL cases, and all T-cell NHL cases with all the controls (see Table 3.6). When analysed separately, none of the 6 common variants appears to confer a significantly increased risk of NHL. We also tested the different NHL subtypes, comparing each set of cases to all the controls. The results from this analysis are presented in Table 3.7.

Table 3.6: Odds ratios for common variants of ATM. Odds ratios were calculated for each common variant that was genotyped in the case/control group. All NHL cases, B-cell cases and T-cell NHL cases were each compared to the control group.
Variant Genotype Controls n (%) n (%) A l l N H L O R (95% CI) p value n (%) B cell N H L O R (95% CI) p value n (%) T cell N H L O R (95% CI) p value - 5 1 4 4 A / T T / T A / T A / A 237 (31.4%) 365 (48.3%) 153 (20.3%) 229 (29.7%) 391 (50.8%) 150 (19.5%) 1.16 1.03 (0.92-1.47) (0.77-1.39) 0.215 0.824 207 (29.8%) 352 (50.7%) 135 (19.5%) 1.15 1.02 (0.91-1.47) (0.76-1.38) 0.249 0.889 22 (29.3%) 38 (50.7%) 15 (20%) 1.25 (0.71-2.2) 1.14 (0.56-2.32) 0.433 0.712 - 4 5 1 9 G / A A / A G / A G / G 230 (30.7%) 359 (47.9%) 160 (21.4%) 227 (29.5%) 388 (50.4%) 155 (20.1%) 1.15 1.00 (0.91-1.46) (0.75-1.34) 0.237 0.996 205 (29.5%) 351 (50.6%) 138 (19.9%) 1.16 0.98 (0.91-1.47) (0.73-1.32) 0.246 0.897 22 (29.3%) 36 (48%) 17 (22.7%) 1.17 (0.66-2.07) 1.18 (0.59-2.33) 0.584 0.642 IVS4(+36) de l (AA) w t / w t w t / d e l del /del 259 (34.4%) 355 (47.1%) 139 (18.5%) 261 (34.2%) 377 (49.3%) 126 (16.5%) 1.08 0.91 (0.86-1.35) (0.67-1.22) 0.533 . 0.517 230 (33.3%) 347 (50.2%) 114 (16.5%) 1.12 0.94 (0.89-1.42) (0.69-1.28) 0.332 0.68 31 (43.1%) 29 (40.3%) 12 (16.7%) 0.65 (0.38-1.13) 0.64 (0.31-1.31) 0.125 0.224 IVS24(-8) del(T) w t / w t w t / d e l del /del 593 (78.1%) 155 (20.4%) 11 (1.4%) 582 (76.3%) 169 (22.1%) 12 (1.6%) 1.05 1.07 (0.81-1.35) (0.46-2.47) 0.735 0.872 530 (76.9%) 148 (21.5%) 11 (1.6%) 1.00 1.06 (0.77-1.30) (0.45-2.49) 0.995 0.900 51 (69.9%) 22 (30.1%) 1.70 (0.97-2.98) combined with w t / d e l 0.064 X39(55S7) G / A G / G G / A A / A 596 (78.4%) 153 (20.1%) 11 (1.4%) 583 (76.4%) 168 (22%) 12 (1.6%) 1.06 1.06 (0.82-1.37) (0.46-2.44) 0.648 0.892 530 (77%) 147 (21.4%) 11 (1.6%) 1.02 1.05 (0.78-1.32) (0.45-2.46) 0.908 0.920 52 (70.3%) 22 (29.7%) 1.74 (0.99-3.06) combined with G / A 0.053 IVS62(-55) C / T T / T C / T c/c 230 (30.7%) 364 (48.5%) 156 (20.8%) 232 (30.3%) 386 (50.4%) 148 (19.3%) 1.10 0.95 (0.87-1.39) (0.71-1.28) 0.441 0.738 211 (30.5%) 346 (50.1%) 134 (19.4%) 1.08 0.94 (0.85-1.37) (0.69-1.27) 0.553 0.678 21 (28.4%) 39 (52.7%) 14 (18.9%) 1.30 (0.74-2.30) 1.06 
(0.51-2.19) 0.364 0.879 T h e different subtypes of N H L t h a t we tested i n c l u d e : • Di f fuse large B ce l l l y m p h o m a ( D L B C L ) • F o l l i c u l a r s m a l l c leaved , f o l l i cu lar large ce l l a n d f o l l i cu lar m i x e d , g r o u p e d t o -gether ( F L ) • M a r g i n a l zone l y m p h o m a a n d l ow grade B ce l l of m u c o s a assoc iated l y m p h o i d t issue , g r o u p e d together ( M Z L / M A L T ) • M a n t l e c e l l l y m p h o m a ( M C L ) • S m a l l l y m p h o c y t i c l y m p h o m a ( S L L ) • L y m p h o p l a s m a c y t i c l y m p h o m a ( L P L ) • M i s c e l l a n e o u s B ce l l l y m p h o m a s ( M i s c B C L ) • M y c o s i s fungoides ( M F ) • P e r i p h e r a l T ce l l l y m p h o m a ( P T C L ) • M i s c e l l a n e o u s T ce l l l y m p h o m a s ( M i s c T C L ) N o t e : M i s c T C L a n d M i s c B C L are heterogeneous groups of subtypes t h a t i n c l u d e a n u m b e r of T - c e l l a n d B - c e l l subtypes t h a t we d i d not have sufficient n u m b e r s to ana lyze i n d i v i d u a l l y . T h e results for M i s c T C L are not r e p o r t e d since there were o n l y 9 cases i n th i s g roup . T h e most s igni f i cant finding was p=0 .007 , w i t h a n odds r a t i o of 0.31 for I V S 4 ( + 3 6 ) d e l ( A A ) i n p e r i p h e r a l T - c e l l l y m p h o m a s . I f we correct for the False d i scovery rate ( F D R ) [6], however , t h i s p -va lue does not r e m a i n s igni f i cant (see Table B.l). P e r i p h e r a l T - c e l l l y m p h o m a s m a k e u p a p p r o x i m a t e l y 1 5 % of l y m -p h o m a a n d consist of a w ide s p e c t r u m of disease, i n c l u d i n g a n a p l a s t i c large T -ce l l l y m p h o m a , I B L - l i k e T - c e l l l y m p h o m a , i n t e s t i n a l T - c e l l l y m p h o m a , a d u l t T - c e l l l e u k a e m i a / l y m p h o m a a n d P T C L unspec i f ied [2, 42]. 
These diseases display marked differences in biology and one or a subset of these may be associated with ATM. We do not, in this study, have the numbers to study the subtypes of PTCL individually. There is some previous evidence for the involvement of ATM in PTCL. Fang et al. [20] identified one unequivocally deleterious ATM mutation in 1 of 10 PTCL tumours that they examined by CGH. The origin of this mutation (germline or somatic) was not determined. Although the clinical or biological relevance of this finding cannot be assessed, specific subsets of PTCL may be enriched for ATM mutations.

Table 3.7: Odds ratios for common ATM variants for different subtypes of NHL. Odds ratios were calculated for each variant that was genotyped in the case control group. Six common variants were tested for association using multivariate logistic regression comparing all cases of a particular subtype of NHL to all controls.
DLBCL and FL

| Variant | Genotype | Controls n (%) | DLBCL cases n (%) | OR (95% CI) | p value | FL cases n (%) | OR (95% CI) | p value |
|---|---|---|---|---|---|---|---|---|
| -5144 A/T | T/T | 237 (31.4%) | 61 (32.6%) | | | 61 (28.5%) | | |
| | A/T | 365 (48.3%) | 93 (49.7%) | 1.01 (0.70-1.45) | 0.977 | 110 (51.4%) | 1.23 (0.86-1.76) | 0.264 |
| | A/A | 153 (20.3%) | 33 (17.6%) | 0.85 (0.53-1.38) | 0.517 | 43 (20.1%) | 1.13 (0.72-1.77) | 0.599 |
| -4519 G/A | A/A | 230 (30.7%) | 61 (32.6%) | | | 58 (27.8%) | | |
| | G/A | 359 (47.9%) | 92 (49.2%) | 0.98 (0.68-1.42) | 0.911 | 108 (51.7%) | 1.2 (0.88-1.83) | 0.200 |
| | G/G | 160 (21.4%) | 34 (18.2%) | 0.82 (0.51-1.31) | 0.404 | 43 (20.6%) | 1.12 (0.72-1.76) | 0.613 |
| IVS4(+36) del(AA) | wt/wt | 259 (34.4%) | 59 (32.1%) | | | 77 (36.0%) | | |
| | wt/del | 355 (47.1%) | 96 (52.2%) | 1.18 (0.82-1.70) | 0.380 | 97 (45.3%) | 0.93 (0.66-1.31) | 0.665 |
| | del/del | 139 (18.5%) | 29 (15.8%) | 0.90 (0.55-1.49) | 0.687 | 40 (18.7%) | 0.96 (0.62-1.50) | 0.860 |
| IVS24(-8) del(T) | wt/wt | 593 (78.1%) | 138 (73.4%) | | | 160 (76.9%) | | |
| | wt/del | 155 (20.4%) | 45 (23.9%) | 1.28 (0.86-1.89) | 0.220 | 48 (23.1%) | 0.99 (0.68-1.45) | 0.975 |
| | del/del | 11 (1.4%) | 5 (2.7%) | 1.79 (0.60-5.35) | 0.297 | combined with wt/del | | |
| X39(5557) G/A | G/G | 596 (78.4%) | 139 (73.9%) | | | 162 (77.1%) | | |
| | G/A, A/A | 164 (21.5%) | 49 (25.0%) | 1.31 (0.90-1.92) | 0.161 | 48 (22.9%) | 1.01 (0.69-1.47) | 0.966 |
| IVS62(-55) C/T | T/T | 230 (30.7%) | 61 (33.0%) | | | 61 (28.8%) | | |
| | C/T | 364 (48.5%) | 91 (49.2%) | 0.95 (0.66-1.38) | 0.790 | 109 (51.4%) | 1.18 (0.82-1.69) | 0.371 |
| | C/C | 156 (20.8%) | 33 (17.8%) | 0.80 (0.49-1.28) | 0.349 | 42 (19.8%) | 1.06 (0.67-1.66) | 0.817 |

MZL/MALT and MCL

| Variant | Genotype | Controls n (%) | MZL/MALT cases n (%) | OR (95% CI) | p value | MCL cases n (%) | OR (95% CI) | p value |
|---|---|---|---|---|---|---|---|---|
| -5144 A/T | T/T | 237 (31.4%) | 28 (32.2%) | | | 15 (34.1%) | | |
| | A/T | 365 (48.3%) | 43 (49.4%) | 1.08 (0.64-1.80) | 0.782 | 20 (45.5%) | 0.89 (0.44-1.79) | 0.740 |
| | A/A | 153 (20.3%) | 16 (18.4%) | 0.88 (0.45-1.70) | 0.697 | 9 (20.5%) | 0.87 (0.37-2.08) | 0.760 |
| -4519 G/A | A/A | 230 (30.7%) | 28 (32.2%) | | | 14 (30.4%) | | |
| | G/A | 359 (47.9%) | 42 (48.3%) | 1.07 (0.64-1.79) | 0.808 | 23 (50.0%) | 1.08 (0.54-2.17) | 0.836 |
| | G/G | 160 (21.4%) | 17 (19.5%) | 0.86 (0.45-1.64) | 0.642 | 9 (19.6%) | 0.88 (0.37-2.10) | 0.769 |
| IVS4(+36) del(AA) | wt/wt | 259 (34.4%) | 28 (31.8%) | | | 10 (23.8%) | | |
| | wt/del | 355 (47.1%) | 44 (50.0%) | 1.23 (0.74-2.05) | 0.432 | 24 (57.1%) | 1.82 (0.85-3.90) | 0.126 |
| | del/del | 139 (18.5%) | 16 (18.2%) | 1.11 (0.57-2.15) | 0.758 | 8 (19.0%) | 1.58 (0.60-4.17) | 0.356 |
| IVS24(-8) del(T) | wt/wt | 593 (78.1%) | 65 (70.7%) | | | 39 (83.0%) | | |
| | wt/del | 155 (20.4%) | 21 (22.8%) | 1.22 (0.71-2.10) | 0.472 | 8 (17.0%) | 0.58 (0.25-1.34) | 0.202 |
| | del/del | 11 (1.4%) | 6 (6.5%) | 0.01 (0.00-8.9×10^6) | 0.628 | combined with wt/del | | |
| X39(5557) G/A | G/G | 596 (78.4%) | 65 (76.5%) | | | 39 (84.8%) | | |
| | G/A, A/A | 164 (21.5%) | 20 (23.5%) | 1.08 (0.62-1.87) | 0.784 | 7 (15.2%) | 0.57 (0.25-1.32) | 0.191 |
| IVS62(-55) C/T | T/T | 230 (30.7%) | 28 (32.9%) | | | 17 (36.2%) | | |
| | C/T | 364 (48.5%) | 42 (49.4%) | 1.01 (0.60-1.70) | 0.971 | 21 (44.7%) | 0.81 (0.41-1.58) | 0.530 |
| | C/C | 156 (20.8%) | 15 (17.6%) | 0.78 (0.40-1.53) | 0.473 | 9 (19.1%) | 0.74 (0.32-1.71) | 0.476 |

SLL and LPL

| Variant | Genotype | Controls n (%) | SLL cases n (%) | OR (95% CI) | p value | LPL cases n (%) | OR (95% CI) | p value |
|---|---|---|---|---|---|---|---|---|
| -5144 A/T | T/T | 237 (31.4%) | 7 (18.9%) | | | 9 (22.5%) | | |
| | A/T | 365 (48.3%) | 21 (56.8%) | 2.08 (0.86-5.06) | 0.105 | 23 (57.5%) | 1.69 (0.76-3.76) | 0.197 |
| | A/A | 153 (20.3%) | 9 (24.3%) | 1.90 (0.68-5.29) | 0.219 | 8 (20.0%) | 1.33 (0.50-3.57) | 0.570 |
| -4519 G/A | A/A | 230 (30.7%) | 7 (18.9%) | | | 9 (22.0%) | | |
| | G/A | 359 (47.9%) | 21 (56.8%) | 2.07 (0.85-5.02) | 0.109 | 23 (56.1%) | 1.69 (0.76-3.76) | 0.198 |
| | G/G | 160 (21.4%) | 9 (24.3%) | 1.83 (0.66-5.08) | 0.250 | 9 (22.0%) | 1.42 (0.55-3.71) | 0.471 |
| IVS4(+36) del(AA) | wt/wt | 259 (34.4%) | 12 (31.6%) | | | 14 (36.8%) | | |
| | wt/del | 355 (47.1%) | 26 (68.4%) | 1.13 (0.55-2.31) | 0.735 | 19 (50.0%) | 1.01 (0.49-2.08) | 0.971 |
| | del/del | 139 (18.5%) | combined with wt/del | | | 5 (13.2%) | 0.68 (0.24-1.94) | 0.467 |
| IVS24(-8) del(T) | wt/wt | 593 (78.1%) | 33 (86.8%) | | | 32 (76.2%) | | |
| | wt/del, del/del | 166 (21.5%) | 5 (13.2%) | 0.47 (0.18-1.23) | 0.122 | 10 (23.8%) | 0.73 (0.31-1.72) | 0.473 |
| X39(5557) G/A | G/G | 596 (78.4%) | 33 (86.8%) | | | 30 (81.1%) | | |
| | G/A, A/A | 164 (21.5%) | 5 (13.1%) | 0.47 (0.18-1.25) | 0.130 | 7 (18.9%) | 0.80 (0.34-1.90) | 0.617 |
| IVS62(-55) C/T | T/T | 230 (30.7%) | 7 (18.9%) | | | 9 (22.5%) | | |
| | C/T | 364 (48.5%) | 21 (56.8%) | 2.04 (0.84-4.96) | 0.115 | 23 (57.5%) | 1.65 (0.74-3.68) | 0.217 |
| | C/C | 156 (20.8%) | 9 (24.3%) | 1.83 (0.66-5.09) | 0.248 | 8 (20.0%) | 1.28 (0.48-3.42) | 0.626 |

Misc BCL and MF

| Variant | Genotype | Controls n (%) | Misc BCL cases n (%) | OR (95% CI) | p value | MF cases n (%) | OR (95% CI) | p value |
|---|---|---|---|---|---|---|---|---|
| -5144 A/T | T/T | 237 (31.4%) | 26 (30.6%) | | | 12 (30.0%) | | |
| | A/T | 365 (48.3%) | 42 (49.4%) | 1.12 (0.66-1.88) | 0.684 | 28 (70.0%) | 1.25 (0.61-2.54) | 0.546 |
| | A/A | 153 (20.3%) | 17 (20.0%) | 1.06 (0.55-2.03) | 0.872 | combined with A/T | | |
| -4519 G/A | A/A | 230 (30.7%) | 28 (32.2%) | | | 12 (30.0%) | | |
| | G/A | 359 (47.9%) | 42 (48.3%) | 1.02 (0.61-1.71) | 0.931 | 23 (57.5%) | 1.41 (0.68-2.95) | 0.357 |
| | G/G | 160 (21.4%) | 17 (19.5%) | 0.89 (0.47-1.70) | 0.726 | 5 (12.5%) | 0.68 (0.23-2.01) | 0.487 |
| IVS4(+36) del(AA) | wt/wt | 259 (34.4%) | 30 (34.5%) | | | 12 (30.0%) | | |
| | wt/del | 355 (47.1%) | 44 (50.6%) | 1.07 (0.65-1.75) | 0.799 | 19 (47.5%) | 1.11 (0.52-2.37) | 0.788 |
| | del/del | 139 (18.5%) | 13 (14.9%) | 0.80 (0.40-1.61) | 0.537 | 9 (22.5%) | 1.22 (0.49-3.06) | 0.668 |
| IVS24(-8) del(T) | wt/wt | 593 (78.1%) | 63 (75.0%) | | | 29 (72.5%) | | |
| | wt/del, del/del | 166 (21.5%) | 21 (25.0%) | 1.16 (0.68-1.99) | 0.582 | 11 (27.5%) | 1.34 (0.64-2.83) | 0.438 |
| X39(5557) G/A | G/G | 596 (78.4%) | 62 (73.8%) | | | 29 (72.5%) | | |
| | G/A, A/A | 164 (21.5%) | 22 (26.2%) | 1.28 (0.75-2.18) | 0.366 | 11 (27.5%) | 1.37 (0.65-2.89) | 0.413 |
| IVS62(-55) C/T | T/T | 230 (30.7%) | 28 (32.9%) | | | 12 (30.0%) | | |
| | C/T | 364 (48.5%) | 39 (45.9%) | 0.93 (0.55-1.56) | 0.772 | 28 (70.0%) | 1.21 (0.59-2.47) | 0.602 |
| | C/C | 156 (20.8%) | 18 (21.2%) | 0.97 (0.51-1.83) | 0.919 | combined with C/T | | |

PTCL

| Variant | Genotype | Controls n (%) | PTCL cases n (%) | OR (95% CI) | p value |
|---|---|---|---|---|---|
| -5144 A/T | T/T | 237 (31.4%) | 6 (23.1%) | | |
| | A/T | 365 (48.3%) | 13 (50%) | 1.46 (0.54-3.95) | 0.453 |
| | A/A | 153 (20.3%) | 7 (26.9%) | 1.72 (0.56-5.30) | 0.345 |
| -4519 G/A | A/A | 230 (30.7%) | 6 (23.1%) | | |
| | G/A | 359 (47.9%) | 12 (46.2%) | 1.37 (0.50-3.76) | 0.537 |
| | G/G | 160 (21.4%) | 8 (30.8%) | 1.79 (0.60-5.35) | 0.297 |
| IVS4(+36) del(AA) | wt/wt | 259 (34.4%) | 15 (62.5%) | | |
| | wt/del, del/del | 494 (65.6%) | 9 (37.5%) | 0.31 (0.13-0.73) | 0.007 |
| IVS24(-8) del(T) | wt/wt | 593 (78.1%) | 16 (66.7%) | | |
| | wt/del, del/del | 166 (21.5%) | 8 (33.3%) | 2.18 (0.86-5.57) | 0.102 |
| X39(5557) G/A | G/G | 596 (78.4%) | 17 (68.0%) | | |
| | G/A, A/A | 164 (21.5%) | 8 (32.0%) | 2.28 (0.89-5.87) | 0.086 |
| IVS62(-55) C/T | T/T | 230 (30.7%) | 5 (20.0%) | | |
| | C/T | 364 (48.5%) | 14 (56.0%) | 1.79 (0.63-5.11) | 0.277 |
| | C/C | 156 (20.8%) | 6 (24.0%) | 1.65 (0.49-5.58) | 0.422 |

3.5.2 Analysis of different ethnicities

Each individual in the case/control group was classified under one of four ethnicity based groups. These were Caucasian, Asian, South Asian and Mixed/Other/Unknown. Each of the genotyped variants was then analyzed separately for each of these groups. The Mixed/Other/Unknown group was excluded from the analysis, as any results obtained were unlikely to be meaningful.
Thus, the odds ratios were calculated using multivariate logistic regression comparing Caucasian NHL cases to Caucasian controls and so on for Asian and South Asian individuals. These comparisons were adjusted for age, gender and region of residence. The results are summarized in Table 3.8. None of the variants conferred a significant odds ratio for any of the individual ethnicities.

Table 3.8: Odds ratios for common ATM variants for different ethnicities. Odds ratios were calculated for each variant that was genotyped in the case/control group. Six common variants were tested for association using multivariate logistic regression comparing all cases of a particular ethnicity to controls of the same ethnicity. Three separate ethnicities were analysed.

| Analysis Group | Variant | Genotype | Controls n (%) | All NHL Cases n (%) | OR (95% CI) | p value |
|---|---|---|---|---|---|---|
| Caucasians | -5144 A/T | T/T | 180 (30.6%) | 176 (28.9%) | | |
| | | A/T | 282 (48.0%) | 314 (51.6%) | 1.18 (0.91-1.54) | 0.219 |
| | | A/A | 126 (21.4%) | 118 (19.4%) | 0.97 (0.70-1.35) | 0.850 |
| | -4519 G/A | A/A | 176 (30.2%) | 176 (29.0%) | | |
| | | G/A | 279 (47.9%) | 311 (51.3%) | 1.16 (0.89-1.52) | 0.267 |
| | | G/G | 128 (22.0%) | 119 (19.6%) | 0.94 (0.68-1.30) | 0.708 |
| | IVS4(+36) del(AA) | wt/wt | 214 (36.5%) | 207 (34.3%) | | |
| | | wt/del | 274 (46.7%) | 305 (50.5%) | 1.17 (0.91-1.51) | 0.233 |
| | | del/del | 99 (16.9%) | 92 (15.2%) | 0.97 (0.68-1.37) | 0.847 |
| | IVS24(-8) del(T) | wt/wt | 440 (74.8%) | 443 (73.8%) | | |
| | | wt/del | 140 (23.8%) | 150 (25.0%) | 1.02 (0.78-1.33) | 0.892 |
| | | del/del | 8 (1.4%) | 7 (1.2%) | 0.91 (0.33-2.54) | 0.854 |
| | X39(5557) G/A | G/G | 443 (74.8%) | 444 (74.0%) | | |
| | | G/A | 140 (23.6%) | 149 (24.8%) | 1.02 (0.78-1.33) | 0.884 |
| | | A/A | 9 (1.5%) | 7 (1.2%) | 0.77 (0.29-2.12) | 0.621 |
| | IVS62(-55) C/T | T/T | 176 (30.0%) | 178 (29.4%) | | |
| | | C/T | 283 (48.2%) | 310 (51.2%) | 1.13 (0.86-1.47) | 0.388 |
| | | C/C | 128 (21.8%) | 117 (19.3%) | 0.92 (0.66-1.27) | 0.598 |
| Asian | -5144 A/T | T/T | 23 (27.1%) | 16 (21.3%) | | |
| | | A/T | 42 (49.4%) | 36 (48.0%) | 1.29 (0.59-2.85) | 0.524 |
| | | A/A | 20 (23.5%) | 23 (30.7%) | 1.62 (0.67-3.94) | 0.288 |
| | -4519 G/A | A/A | 23 (27.4%) | 15 (19.7%) | | |
| | | G/A | 38 (45.2%) | 37 (48.7%) | 1.56 (0.70-3.47) | 0.280 |
| | | G/G | 23 (27.4%) | 24 (31.6%) | 1.55 (0.64-3.73) | 0.333 |
| | IVS4(+36) del(AA) | wt/wt | 28 (32.9%) | 28 (38.4%) | | |
| | | wt/del | 38 (44.7%) | 33 (45.2%) | 0.95 (0.46-1.96) | 0.899 |
| | | del/del | 19 (22.4%) | 12 (16.4%) | 0.63 (0.26-1.56) | 0.323 |
| | IVS62(-55) C/T | T/T | 21 (25.6%) | 17 (22.7%) | | |
| | | C/T | 40 (48.8%) | 35 (46.7%) | 1.14 (0.51-2.52) | 0.750 |
| | | C/C | 21 (25.6%) | 23 (30.7%) | 1.31 (0.54-3.15) | 0.554 |
| South Asian | -5144 A/T | T/T | 16 (50.0%) | 14 (51.9%) | | |
| | | A/T, A/A | 16 (50.0%) | 13 (48.1%) | 0.97 (0.32-2.99) | 0.958 |
| | -4519 G/A | A/A | 13 (41.9%) | 13 (48.1%) | | |
| | | G/A, G/G | 18 (58.0%) | 14 (51.9%) | 0.81 (0.25-2.56) | 0.713 |
| | IVS4(+36) del(AA) | wt/wt | 4 (12.9%) | 5 (18.5%) | | |
| | | wt/del | 17 (54.8%) | 12 (44.4%) | 0.35 (0.06-1.97) | 0.231 |
| | | del/del | 10 (32.3%) | 10 (37%) | 0.64 (0.10-4.16) | 0.635 |
| | IVS62(-55) C/T | T/T | 17 (53.1%) | 14 (53.8%) | | |
| | | C/T, C/C | 15 (46.9%) | 12 (46.2%) | 0.97 (0.31-3.05) | 0.954 |

3.5.3 Analysis of haplotypes

In addition to analysing each of the common variants individually, we also calculated odds ratios for the haplotypes formed by these variants. Three main haplotypes were observed. These were analysed singly, while less frequent haplotypes (seen in <5 individuals) were grouped together for analysis.
We compared haplotype frequencies in all NHL cases, B-cell NHL cases and T-cell NHL cases in addition to some of the major subtypes of NHL to the controls. Odds ratios were calculated using multivariate logistic regression adjusted for age, gender and region, for the Caucasian cases as compared to the Caucasian controls. The results are summarized in Table 3.9 and are not significant for any of the comparisons.

Thus, based on the analysis of these variants, common variants in ATM, or haplotypes thereof, do not appear to confer increased susceptibility to NHL (with the possible exception of PTCL). The power of this study to detect any such associations, should they exist, was relatively high (as shown in Table 3.3). If there were an effect, it would have to confer an odds ratio of less than 1.4 in order to be beyond the power of this study. Thus, common variants in ATM are unlikely to play a substantial role in NHL susceptibility.

Table 3.9: Odds ratios for ATM haplotypes. Odds ratios were calculated for each variant that was genotyped in the case/control group. Haplotypes were estimated using Haplo.stats [59] based on these genotypes. Here, the results for these haplotypes are presented. Several sub-groups were tested in addition to the entire case/control group. See Table 2.1 for a list of abbreviations.

| Analysis Group | Haplotype | Freq | OR (95% CI) | P-val |
|---|---|---|---|---|
| All NHL | AGwtwtGC | 0.44 | | |
| | TAwtdelAT | 0.12 | 1.03 (0.82-1.30) | 0.787 |
| | TAdelwtGT | 0.41 | 0.97 (0.83-1.14) | 0.729 |
| | Rare | 0.02 | 0.91 (0.50-1.63) | 0.740 |
| B-cell NHL | AGwtwtGC | 0.44 | | |
| | TAwtdelAT | 0.12 | 1.00 (0.78-1.27) | 0.983 |
| | TAdelwtGT | 0.42 | 0.99 (0.84-1.16) | 0.895 |
| | Rare | 0.02 | 0.86 (0.47-1.57) | 0.617 |
| T-cell NHL | AGwtwtGC | 0.44 | | |
| | TAwtdelAT | 0.12 | 1.52 (0.92-2.52) | 0.104 |
| | TAdelwtGT | 0.41 | 0.81 (0.56-1.18) | 0.278 |
| | Rare | 0.02 | 1.09 (0.39-3.10) | 0.867 |
| DLBCL | AGwtwtGC | 0.44 | | |
| | TAwtdelAT | 0.12 | 1.31 (0.92-1.86) | 0.133 |
| | TAdelwtGT | 0.42 | 1.03 (0.80-1.32) | 0.834 |
| | Rare | 0.02 | 1.25 (0.56-2.75) | 0.588 |
| FL | AGwtwtGC | 0.45 | | |
| | TAwtdelAT | 0.12 | 0.93 (0.65-1.33) | 0.704 |
| | TAdelwtGT | 0.42 | 0.95 (0.75-1.20) | 0.661 |
| | Rare | 0.02 | 0.45 (0.14-1.43) | 0.176 |
| Caucasians, All NHL | AGwtwtGC | 0.45 | | |
| | TAwtdelAT | 0.13 | 1.02 (0.79-1.31) | 0.909 |
| | TAdelwtGT | 0.40 | 1.01 (0.85-1.21) | 0.891 |
| | Rare | 0.01 | 0.77 (0.29-2.04) | 0.597 |
| Asians, All NHL | AGwtwtGC | 0.50 | | |
| | TAdelwtGT | 0.41 | 0.83 (0.53-1.31) | 0.432 |
| | Rare | 0.09 | 1.18 (0.54-2.62) | 0.676 |
| S Asian, All NHL | TAdelwtGT | 0.58 | | |
| | TAwtdelAT | 0.10 | 0.69 (0.18-2.62) | 0.593 |
| | AGwtwtGC | 0.27 | 1.62 (0.61-4.33) | 0.341 |
| | Rare | 0.05 | 0.78 (0.11-5.50) | 0.802 |

3.6 Association study with combined rare variants

The 6 rare variants genotyped in this study were selected based on their putative effect on protein function. All 6 were seen in at least one sample that we sequenced as part of the variant detection phase. Five of the 6 were predicted to be "Possibly" or "Probably damaging" to protein function by PolyPhen [54, 69, 70]. The sixth resulted in a premature stop codon and so is predicted to be in the same functional category as the others.
We did not have sufficient power, in this study, to analyze each of these rare variants individually. Instead, we analyzed them as a functionally deleterious group, hypothesizing that heterozygosity at any one of these loci confers increased susceptibility to NHL.

All associations were tested using multivariate logistic regression tests, adjusted for age, gender, place of residence and when appropriate, ethnicity. The frequency of the heterozygotes and rare homozygotes was compared to that of the common homozygotes in the association tests. We calculated the odds ratios for all the NHL cases, all B-cell NHL cases and T-cell NHL cases (Table 3.10(a)). We also compared each of the subtypes of NHL to all the controls. The results of these analyses are shown in Table 3.10(b). Lastly, we analyzed each major ethnic group separately as shown in Table 3.10(c).

The rare genotypes of the 6 combined variants confer an odds ratio of approximately 3 in two of the subtypes analyzed. These were mantle cell lymphoma (OR: 3.3 (1.16-9.33), p=0.026) and the MZL/MALT category (OR: 2.8 (1.22-6.36), p=0.015). These results were significant at our selected α of 0.05, which indicates that heterozygosity at any of the rare alleles assessed may confer an increased susceptibility to these diseases (although this does not remain significant after adjustment for the FDR, see Table B.1).
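The effect of an FDR adjustment on nominally significant results like these can be illustrated with a minimal Python sketch of the Benjamini-Hochberg procedure. This is an illustration under stated assumptions, not the calculation behind Table B.1: the function name `benjamini_hochberg` is hypothetical, and the family of tests is assumed here to be only the seven pooled rare-variant p-values reported in Table 3.10, whereas the thesis may have corrected over a different set of tests.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Assumption: the test family is the seven pooled analyses in Table 3.10.
pooled_p = {
    "All NHL": 0.077,
    "B-cell NHL": 0.090,
    "DLBCL": 0.306,
    "FL": 0.843,
    "MZL/MALT": 0.015,
    "MCL": 0.026,
    "Caucasian": 0.111,
}

labels = list(pooled_p)
adjusted = benjamini_hochberg([pooled_p[name] for name in labels])
significant = [name for name, q in zip(labels, adjusted) if q <= 0.05]

for name, q in zip(labels, adjusted):
    print(f"{name}: nominal p = {pooled_p[name]:.3f}, BH-adjusted p = {q:.3f}")
print("Significant at FDR 0.05:", significant)
```

Under this assumed test family, no adjusted p-value falls below 0.05, which agrees in direction with the statement that the MCL and MZL/MALT results do not survive FDR adjustment.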
Thus, we have some evidence pointing to the importance of these rare variants in rare subtypes of the disease; this may support a corollary of the common variant-common disease model: the multiple rare variants-rare disease model.

Table 3.10: Odds ratios for pooled analysis of 6 rare, putatively deleterious variants of ATM. Results for categories that had insufficient numbers (fewer than 5 individuals with the rare variants) are not shown. These include all T-cell NHL, SLL, LPL, Misc BCL, MF, PTCL, Misc TCL and analyses for the Asian and the South Asian ethnic groups. See Table 2.1 for a list of abbreviations. In all categories, the rare homozygotes (Z/Z) were combined with heterozygotes (Y/Z) for analysis and compared to the common homozygotes (Y/Y).

(a) Overall

| Analysis Group | Genotype | Controls n (%) | Cases n (%) | OR (95% CI) | p value |
|---|---|---|---|---|---|
| All NHL | Y/Y | 745 (96.1%) | 741 (94.2%) | | |
| | Y/Z, Z/Z | 30 (3.9%) | 46 (5.8%) | 1.54 (0.96-2.48) | 0.077 |
| B cell NHL | Y/Y | 745 (96.1%) | 669 (94.1%) | | |
| | Y/Z, Z/Z | 30 (3.9%) | 42 (5.9%) | 1.52 (0.94-2.48) | 0.090 |

(b) Subtypes

| Analysis Group | Genotype | Controls n (%) | Cases n (%) | OR (95% CI) | p value |
|---|---|---|---|---|---|
| DLBCL | Y/Y | 745 (96.1%) | 181 (94.3%) | | |
| | Y/Z, Z/Z | 30 (3.9%) | 11 (5.7%) | 1.46 (0.71-3.00) | 0.306 |
| FL | Y/Y | 745 (96.1%) | 204 (95.8%) | | |
| | Y/Z, Z/Z | 30 (3.9%) | 9 (4.2%) | 1.08 (0.50-2.34) | 0.843 |
| MZL/MALT | Y/Y | 745 (96.1%) | 81 (90%) | | |
| | Y/Z, Z/Z | 30 (3.9%) | 9 (10%) | 2.78 (1.22-6.36) | 0.015 |
| MCL | Y/Y | 745 (96.1%) | 42 (89.4%) | | |
| | Y/Z, Z/Z | 30 (3.9%) | 5 (10.6%) | 3.28 (1.16-9.33) | 0.026 |

(c) Ethnicities

| Analysis Group | Genotype | Controls n (%) | Cases n (%) | OR (95% CI) | p value |
|---|---|---|---|---|---|
| Caucasian | Y/Y | 577 (96%) | 584 (94.2%) | | |
| | Y/Z, Z/Z | 24 (4%) | 36 (5.8%) | 1.54 (0.91-2.63) | 0.111 |

3.6.1 Mantle cell lymphoma

Mantle cell lymphoma (MCL) accounts for 6% of all lymphomas. This is a neoplasm that is characterized by somatic alterations in genes that regulate cell cycle and DNA damage pathways. These genes typically include cyclin D1 (seen in almost all cases), the INK4a/ARF locus (20% of cases), ATM (40-75% of cases) and in some cases, CHK2 (rare) (reviewed by Fernandez et al. [21]). More than 40% of MCL tumours have been shown to have mutations in ATM [15].

Although MCL has been known to have circulating cells we do not believe that our results are due to somatic mutations. The proportion of circulating cells would not be expected to be high enough to affect the genotype call. For a sample to be called heterozygous by sequencing, it would have to have two peaks of approximately equal height at that variant site. A small fraction of circulating cells would not be sufficient to result in equal peak heights. Similarly, during genotyping with TaqMan, samples with fluorescence from both dyes, but in unequal amounts, would not fall within the heterozygous group, and would therefore be labelled "Undetermined". We can, therefore, with some confidence, rule out the possibility of circulating cell contamination causing the heterozygosity.

There is evidence for somatic mutations in ATM in MCL.
Greiner and colleagues [29] examined MCL tumours in which ATM is mutated and found that these tumours have an altered gene expression profile when compared to wild type ATM MCL. There have been observations of increased chromosomal imbalances in MCL with inactivated ATM [15]. Camacho and colleagues [15] looked at 20 MCL tumour specimens. They identified 8 mutations (40% of cases had mutations). They established the germline origin of one mutation, which was heterozygous, with a loss of the wild type copy in the hemizygous tumour. They found that ATM gene mutations seemed to be associated with MCL tumours that displayed a large number of chromosomal imbalances, suggesting that ATM inactivation may favor increasing chromosomal instability in these lymphomas. Thus, a number of studies [15, 58] found that ATM inactivation is a frequent occurrence in MCL and have concluded that ATM mutations are an early factor in the pathogenesis of MCL.
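As a rough check on the size of the pooled rare-variant effect reported for MCL and MZL/MALT, a crude (unadjusted) odds ratio with a Wald 95% confidence interval can be computed directly from the carrier counts in Table 3.10(b). The Python sketch below is only an approximation of the reported analysis: the thesis estimates come from multivariate logistic regression adjusted for age, gender and region, so the crude values are expected to be close to, but not identical with, the published odds ratios. The function name `crude_odds_ratio` is hypothetical.

```python
import math


def crude_odds_ratio(cases_carrier, cases_noncarrier,
                     controls_carrier, controls_noncarrier, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table."""
    odds_ratio = (cases_carrier * controls_noncarrier) / (
        cases_noncarrier * controls_carrier
    )
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / cases_carrier
        + 1 / cases_noncarrier
        + 1 / controls_carrier
        + 1 / controls_noncarrier
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Counts from Table 3.10(b): carriers are Y/Z or Z/Z, non-carriers are Y/Y.
# Controls: 30 carriers, 745 non-carriers.
mcl = crude_odds_ratio(5, 42, 30, 745)   # MCL: 5 carriers, 42 non-carriers
mzl = crude_odds_ratio(9, 81, 30, 745)   # MZL/MALT: 9 carriers, 81 non-carriers

for name, (or_, lo, hi) in (("MCL", mcl), ("MZL/MALT", mzl)):
    print(f"{name}: crude OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The crude estimates land near the adjusted values in Table 3.10(b), which suggests that the covariate adjustment changes the point estimates only modestly for these two subtypes.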
3.6.2 Marginal zone lymphoma

Marginal zone lymphomas (MZL) consist of several different entities that together comprise approximately 8% of lymphomas (reviewed by Bertoni and Zucca [7]):
• mucosa-associated lymphoid tissue (MALT)
• extranodal marginal-zone lymphoma (EMZL): These arise at several different sites, but the gastric location is the most common
• nodal marginal zone B-cell lymphoma (NMZL) and
• splenic marginal zone lymphoma (SMZL)

Our results indicate that rare, putatively deleterious variants in ATM may be associated with increased susceptibility to MZL. Although there is no previous evidence of this, these changes may be associated with just one of the subtypes of MZL, perhaps one of the less frequent ones.

Although these results do not remain significant after adjustment for the FDR, the correction, in this case, is likely to be overly stringent. The FDR calculation assumes that all variants/tests are independent, which in the case of a locus in high LD, such as ATM, is not the case. Although some efforts have been made to address this issue [22], there is no consensus on the suitability of the assumption of independence in the case of association studies such as this one.

Thus, the results of this study show that common variants in the ATM gene, and haplotypes thereof, do not appear to confer increased susceptibility to NHL. PTCL may be one subtype of disease that is an exception to this finding.
Rare, putatively deleterious variants may also play a part in lymphomagenesis of some rare NHL subtypes.

4 Conclusions

In these experiments, I sequenced the ATM gene in DNA from the blood of 86 non-Hodgkin lymphoma (NHL) patients. I identified 79 variants in this phase of the study. Using haplotypes estimated from sequence data, I identified 7 tag SNPs and 6 rare variants that were predicted to have a deleterious effect on protein function. Six of the tag SNPs and the 6 rare variants were then genotyped in 798 NHL cases and 793 controls.

Association studies were carried out and odds ratios calculated for the 6 common tag SNPs separately, as well as for the haplotypes formed by them. None of these were found to be significantly associated with the overall risk of NHL. We therefore conclude that common inherited variants of ATM do not contribute to the risk of NHL in the general population. One of the common variants was found to be significantly associated with peripheral T-cell lymphoma (PTCL). However, not only did this result not remain significant after correction for multiple testing, but also the number of PTCL cases in this study was too small (n=27) to be conclusive.

In addition, the 6 rare variants were analyzed as a single group of putatively deleterious variants. This showed a significantly increased risk of development of two subtypes of NHL, mantle cell lymphoma (MCL) and marginal zone lymphoma (MZL).
We conclude that the "rare variant-rare disease" model, whereby rare, deleterious germline variants contribute to an individual's risk of developing a rare subtype of disease, may be relevant for ATM in some NHL subtypes.

This experiment was novel in its use of the rare variants. Most association studies do not possess the power to assess the role of variants with a minor allele frequency of <5%. We implemented a method whereby these rare variants were grouped together based on a functional classification. Other groupings could be based on genomic position, for instance variants present in the same protein domain. This may allow for the elucidation of factors that have a subtle effect on disease development, which is likely to be the case in complex diseases such as cancer.

All work presented herein was performed by myself except the extraction of DNA from patient samples, the assembly of sequence chromatograms into Consed and most of the statistical analyses, as stated in the appropriate sections.

5 Future Work

An immediate off-shoot of this experiment would be to examine the tumours of the individuals with MCL or MZL who were heterozygous, in their germline, for one of the 6 rare, putatively deleterious variants. An analysis of this tissue would allow the determination of whether the tumours were hemizygous, i.e. had lost the normal copy of the ATM gene, or whether both copies of the gene were mutated in the tumours.
This may further validate the role of ATM in the lymphomagenesis of these tumours. Further, a larger study that looked at these subtypes and PTCL in greater detail could validate our results. This would also allow for a better understanding of the use of rare variants in association studies and perhaps lead to new approaches on how to usefully analyze data generated using such variants.

Family members of AT patients are thought to have an increased risk of cancer. If common enough in the general population, these and equivalent variants may contribute to the population burden of this disease. Confirmation of this hypothesis would require much larger scale studies of thousands of individuals, perhaps involving directed re-sequencing of the ATM locus. New sequencing technologies may make such experiments more feasible.

Like all association studies, this one would be strengthened by replication. Similar studies in different populations, or in different geographic areas, could confirm our results. Specifically, replication or further investigation of the results with PTCL with a larger number of cases would be valuable.

Bibliography

[1] K. Allen-Brady and N. J. Camp. Characterization of the linkage disequilibrium structure and identification of tagging-SNPs in five DNA repair genes. BMC Cancer, 5:99, 2005.
[2] W. Y. Au and R. Liang. Peripheral T-cell lymphoma. Curr Oncol Rep, 4(5):434-42, 2002.
[3] C.
Barlow, S. Hirotsune, R. Paylor, M. Liyanage, M. Eckhaus, F. Collins, Y. Shiloh, J. N. Crawley, T. Ried, D. Tagle, and A. Wynshaw-Boris. Atm-deficient mice: a paradigm of ataxia telangiectasia. Cell, 86(1):159-71, 1996.
[4] J. C. Barrett, B. Fry, J. Maller, and M. Daly. Haploview: analysis and visualization of LD and haplotype maps. http://www.broad.mit.edu/mpg/haploview/index.php, 2005.
[5] S. G. Becker-Catania and R. A. Gatti. Ataxia-telangiectasia. Adv Exp Med Biol, 495:191-8, 2001.
[6] Y. Benjamini and Y. Hochberg. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1):289-300, 1995.
[7] F. Bertoni and E. Zucca. State-of-the-art therapeutics: marginal-zone lymphoma. J Clin Oncol, 23(26):6415-20, 2005.
[8] P. E. Bonnen, M. D. Story, C. L. Ashorn, T. A. Buchholz, M. M. Weil, and D. L. Nelson. Haplotypes at ATM identify coding-sequence variation and indicate a region of extensive linkage disequilibrium. Am J Hum Genet, 67(6):1437-51, 2000.
[9] J. Boultwood. Ataxia telangiectasia gene mutations in leukaemia and lymphoma. J Clin Pathol, 54(7):512-516, 2001.
[10] C. C.
Bourguet, S. Grufferman, E. Delzell, E. R. DeLong, and H. J. Cohen. Multiple myeloma and family history of cancer, a case-control study. Cancer, 56(8):2133-9, 1985.
[11] A. Broeks, J. H. Urbanus, A. N. Floore, E. C. Dahler, J. G. Klijn, E. J. Rutgers, P. Devilee, N. S. Russell, F. E. van Leeuwen, and L. J. van 't Veer. ATM-heterozygous germline mutations contribute to breast cancer-susceptibility. Am J Hum Genet, 66(2):494-500, 2000.
[12] A. J. Brookes. Tag 'n' tell v2.0. http://snp.cgb.ki.se/tagntell/.
[13] UCSC Genome Browser. Human May 2004 build. http://www.genome.ucsc.edu/, 2004.
[14] P. J. Byrd, P. R. Cooper, T. Stankovic, H. S. Kullar, G. D. Watts, P. J. Robinson, and M. R. Taylor. A gene transcribed from the bidirectional ATM promoter coding for a serine rich protein: amino acid sequence, structure and expression studies. Hum. Mol. Genet., 5(11):1785-1791, 1996.
[15] E. Camacho, L. Hernandez, S. Hernandez, F. Tort, B. Bellosillo, S. Bea, F. Bosch, E. Montserrat, A. Cardesa, P. L. Fernandez, and E. Campo. ATM gene inactivation in mantle cell lymphoma mainly occurs by truncating mutations and missense mutations involving the phosphatidylinositol-3 kinase domain and is associated with increasing numbers of chromosomal imbalances. Blood, 99(1):238-44, 2002.
[16] L. R. Cardon and J. I. Bell. Association study designs for complex diseases. Nat Rev Genet, 2(2):91-9, 2001.
[17] B. Devlin, K. Roeder, and L. Wasserman. Genomic control, a new approach to genetic-based association studies. Theor Popul Biol, 60(3):155-66, 2001.
[18] B. Ewing and P. Green. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res, 8(3):186-94, 1998.
[19] B. Ewing, L. Hillier, M. C. Wendl, and P. Green. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res, 8(3):175-85, 1998.
[20] N. Y. Fang, T. C. Greiner, D. D. Weisenburger, W. C. Chan, J. M. Vose, L. M. Smith, J. O. Armitage, R. A. Mayer, B. L. Pike, F. S. Collins, and J. G. Hacia. Oligonucleotide microarrays demonstrate the highest frequency of ATM mutations in the mantle cell subtype of lymphoma. Proc Natl Acad Sci USA, 100(9):5372-7, 2003.
[21] V. Fernandez, E. Hartmann, G. Ott, E. Campo, and A. Rosenwald. Pathogenesis of mantle-cell lymphoma: all oncogenic roads lead to dysregulation of cell cycle and DNA damage response pathways. J Clin Oncol, 23(26):6364-9, 2005.
[22] M. A. Ferreira, L. O'Gorman, P. Le Souef, P. R. Burton, B. G. Toelle, C. F. Robertson, P. M. Visscher, N. G. Martin, and D. L. Duffy. Robust estimation of experimentwise P values applied to a genome scan of multiple asthma traits identifies a new region of significant linkage on chromosome 20q13. Am J Hum Genet, 77(6):1075-85, 2005.
[23] J. F. Fraumeni, W. Wertelecki, W. A. Blattner, R. D. Jensen, and B. G. Leventhal. Varied manifestations of a familial lymphoproliferative disorder. Am J Med, 59(1):145-51, 1975.
[24] R. A. Gatti, S. Becker-Catania, H. H. Chun, X. Sun, M. Mitui, C. H. Lai, N. Khanlou, M. Babaei, R. Cheng, C. Clark, Y. Huo, N. C. Udar, and R. K. Iyer. The pathogenesis of ataxia-telangiectasia, learning from a Rosetta stone. Clin Rev Allergy Immunol, 20(1):87-108, 2001.
[25] R. A. Gatti, I. Berkel, E. Boder, G. Braedt, P. Charmley, P. Concannon, F. Ersoy, T. Foroud, N. G. Jaspers, K. Lange, et al. Localization of an ataxia-telangiectasia gene to chromosome 11q22-23. Nature, 336(6199):577-80, 1988.
[26] R. A. Gatti, A. Tward, and P. Concannon.
Cancer risk in ATM heterozygotes: a model of phenotypic and mechanistic differences between missense and truncating mutations. Mol Genet Metab, 68(4):419-23, 1999.
[27] A. A. Goodarzi, J. C. Jonnalagadda, P. Douglas, D. Young, R. Ye, G. B. Moorhead, S. P. Lees-Miller, and K. K. Khanna. Autophosphorylation of ataxia-telangiectasia mutated is regulated by protein phosphatase 2A. Embo J, 23(22):4451-61, 2004.
[28] D. Gordon, C. Abajian, and P. Green. Consed: a graphical tool for sequence finishing. Genome Res, 8(3):195-202, 1998.
[29] T. C. Greiner, C. Dasgupta, V. V. Ho, D. D. Weisenburger, L. M. Smith, J. C. Lynch, J. M. Vose, K. Fu, J. O. Armitage, R. M. Braziel, E. Campo, J. Delabie, R. D. Gascoyne, E. S. Jaffe, H. K. Muller-Hermelink, G. Ott, A. Rosenwald, L. M. Staudt, M. Y. Im, M. W. Karaman, B. L. Pike, W. C. Chan, and J. G. Hacia. Mutation and genomic deletion status of ataxia telangiectasia mutated (ATM) and p53 confer specific gene expression profiles in mantle cell lymphoma. Proc Natl Acad Sci USA, 103(7):2352-7, 2006.
[30] F. Gumy-Pause, P. Wacker, and A. P. Sappino. ATM gene and lymphoid malignancies. Leukemia, 18(2):238-42, 2004.
[31] Nancy Lee Harris, Harald Stein, Sarah E. Coupland, Michael Hummel, Riccardo Dalla Favera, Laura Pasqualucci, and Wing C. Chan. New approaches to lymphoma diagnosis. Hematology, 2001(1):194-220, 2001.
[32] J. H. Hoeijmakers. Genome maintenance mechanisms for preventing cancer. Nature, 411(6835):366-74, 2001.
[33] R. D. Johnson and M. Jasin. Sister chromatid gene conversion is a prominent double-strand break repair pathway in mammalian cells. Embo J, 19(13):3398-407, 2000.
[34] X. Ke and L. R. Cardon. Efficient selective screening of haplotype tag SNPs. Bioinformatics, 19(2):287-8, 2003.
[35] M. F. Lavin, G. Birrell, P. Chen, S. Kozlov, S. Scott, and N. Gueven. ATM signaling and genomic stability in response to DNA damage. Mutat Res, 569(1-2):123-32, 2005.
[36] J. H. Lee and T. T. Paull. Direct activation of the ATM protein kinase by the Mre11/Rad50/Nbs1 complex. Science, 304(5667):93-6, 2004.
[37] N. C. Levitt and I. D. Hickson. Caretaker tumour suppressor genes that defend genome integrity. Trends Mol Med, 8(4):179-86, 2002.
[38] A. Li, Y. Huang, and M. Swift. Neutral sequence variants and haplotypes at the 150 kb ataxia-telangiectasia locus.
Am J Med Genet, 86(2):140-4, 1999.
[39] K. J. Livak. Allelic discrimination using fluorogenic probes and the 5' nuclease assay. Genet Anal, 14(5-6):143-9, 1999.
[40] M. Lobrich and P. A. Jeggo. Harmonising the response to DSBs: a new string in the ATM bow. DNA Repair (Amst), 4(7):749-59, 2005.
[41] H. T. Lynch, J. N. Marcus, D. D. Weisenburger, P. Watson, M. L. Fitzsimmons, H. Grierson, D. M. Smith, J. Lynch, and D. Purtilo. Genetic and immunopathological findings in a lymphoma family. Br J Cancer, 59(4):622-6, 1989.
[42] M. Matsuoka. PTCL: lessons from adult T-cell leukemia. Int J Hematol, 76 Suppl 2:116-7, 2002.
[43] M. J. McEachern, A. Krauskopf, and E. H. Blackburn. Telomeres and their control. Annu Rev Genet, 34:331-358, 2000.
[44] P. J. McKinnon. ATM and ataxia telangiectasia. EMBO Rep, 5(8):772-6, 2004.
[45] J. A. Metcalfe, J. Parkhill, L. Campbell, M. Stacey, P. Biggs, P. J. Byrd, and A. M. Taylor. Accelerated telomere shortening in ataxia telangiectasia. Nat Genet, 13(3):350-3, 1996.
[46] M. Nei and J. C. Miller.
A simple method for estimating average number of nucleotide substitutions within and between populations from restriction data. Genetics, 125(4):873-9, 1990.
[47] D. A. Nickerson. Displaying genotype data: Visual genotypes. http://pga.gs.Washington.edu/VG2.html, 2005.
[48] D. A. Nickerson, S. L. Taylor, K. M. Weiss, A. G. Clark, R. G. Hutchinson, J. Stengard, V. Salomaa, E. Vartiainen, E. Boerwinkle, and C. F. Sing. DNA sequence diversity in a 9.7-kb region of the human lipoprotein lipase gene. Nat Genet, 19(3):233-40, 1998.
[49] D. A. Nickerson, V. O. Tobe, and S. L. Taylor. PolyPhred: automating the detection and genotyping of single nucleotide substitutions using fluorescence-based resequencing. Nucleic Acids Res, 25(14):2745-51, 1997.
[50] National Cancer Institute of Canada. Canadian cancer statistics. http://www.ncic.cancer.ca/vgn/images/portal/cit_86751114/31/23/935505938cw_2006stats_en.pdf.pdf, 2006.
[51] T. K. Pandita. A multifaceted role for ATM in genome maintenance. Expert Rev Mol Med, 2003:1-21, 2003.
[52] D. M. Parkin, F. Bray, J. Ferlay, and P. Pisani. Global cancer statistics, 2002. CA Cancer J Clin, 55(2):74-108, 2005.
[53] P. D. Pharoah, A. M. Dunning, B. A. Ponder, and D. F. Easton.
Association studies for finding cancer-susceptibility genetic variants. Nat Rev Cancer, 4(11):850-60, 2004.
[54] PolyPhen: prediction of functional effect of human nsSNPs. http://tux.embl-heidelberg.de/raimensky/.
[55] S. Rozen and H. Skaletsky. Primer3. http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi, 2000.
[56] K. Savitsky, A. Bar-Shira, S. Gilad, G. Rotman, Y. Ziv, L. Vanagaite, D. A. Tagle, S. Smith, T. Uziel, S. Sfez, et al. A single ataxia telangiectasia gene with a product similar to PI-3 kinase. Science, 268(5218):1749-53, 1995.
[57] K. Savitsky, M. Platzer, T. Uziel, S. Gilad, A. Sartiel, A. Rosenthal, O. Elroy-Stein, Y. Shiloh, and G. Rotman. Ataxia-telangiectasia: structural diversity of untranslated sequences suggests complex post-transcriptional regulation of ATM gene expression. Nucl. Acids Res., 25(9):1678-1684, 1997.
[58] C. Schaffner, I. Idler, S. Stilgenbauer, H. Dohner, and P. Lichter. Mantle cell lymphoma is characterized by inactivation of the ATM gene. Proc Natl Acad Sci USA, 97(6):2773-8, 2000.
[59] D. J. Schaid. Haplo.stats. http://mayoresearch.mayo.edu/mayo/research/biostat/schaid.cfm.
[60] S. P. Scott, R. Bendix, P. Chen, R. Clark, T. Dork, and M. F. Lavin.
Missense mutations but not allelic variants alter the function of ATM by dominant interference in patients with breast cancer. Proc Natl Acad Sci USA, 99(2):925-30, 2002.
[61] P. Sebastiani, R. Lazarus, S. T. Weiss, L. M. Kunkel, I. S. Kohane, and M. F. Ramoni. Minimal haplotype tagging. Proc Natl Acad Sci USA, 100(17):9900-5, 2003.
[62] Y. Shiloh. ATM and related protein kinases: safeguarding genome integrity. Nat Rev Cancer, 3(3):155-68, 2003.
[63] O. Shpilberg, M. Modan, B. Modan, A. Chetrit, Z. Fuchs, and B. Ramot. Familial aggregation of haematological neoplasms: a controlled study. Br J Haematol, 87(1):75-80, 1994.
[64] American Cancer Society. Statistics for 2005. http://www.cancer.org, 2005.
[65] SPSS. http://www.spss.com.
[66] M. Stephens and P. Donnelly. A comparison of Bayesian methods for haplotype reconstruction from population genotype data. Am J Hum Genet, 73(5):1162-9, 2003.
[67] M. Stephens, N. J. Smith, and P. Donnelly. A new statistical method for haplotype reconstruction from population data. Am J Hum Genet, 68(4):978-89, 2001.
[68] D. O. Stram, C. A. Haiman, J. N. Hirschhorn, D. Altshuler, L. N. Kolonel, B. E. Henderson, and M. C. Pike.
Choosing haplotype-tagging SNPs based on unphased genotype data using a preliminary sample of unrelated subjects with an example from the Multiethnic Cohort Study. Hum Hered, 55(1):27-36, 2003.
[69] S. Sunyaev, V. Ramensky, and P. Bork. Towards a structural basis of human non-synonymous single nucleotide polymorphisms. Trends Genet, 16(5):198-200, 2000.
[70] S. Sunyaev, V. Ramensky, I. Koch, W. Lathe 3rd, A. S. Kondrashov, and P. Bork. Prediction of deleterious human alleles. Hum Mol Genet, 10(6):591-7, 2001.
[71] A. M. Taylor, J. A. Metcalfe, J. Thick, and Y. F. Mak. Leukemia and lymphoma in ataxia telangiectasia. Blood, 87(2):423-38, 1996.
[72] Y. R. Thorstenson, P. Shen, V. G. Tusher, T. L. Wayne, R. W. Davis, G. Chu, and P. J. Oefner. Global analysis of ATM polymorphism reveals significant functional constraint. Am J Hum Genet, 69(2):396-412, 2001.
[73] Tamar Uziel, Kinneret Savitsky, Matthias Platzer, Yael Ziv, Tal Helbitz, Michael Nehls, Thomas Boehm, Andre Rosenthal, Yosef Shiloh, and Galit Rotman. Genomic organization of the ATM gene. Genomics, 33(2):317-320, 1996.
[74] K. Valerie and L. F. Povirk.
Regulation and mechanisms of mammalian double-strand break repair. Oncogene, 22(37):5792-812, 2003.
[75] D. C. van Gent, J. H. Hoeijmakers, and R. Kanaar. Chromosomal stability and the DNA double-stranded break connection. Nat Rev Genet, 2(3):196-206, 2001.
[76] I. Vorechovsky, D. Rasio, L. Luo, C. Monaco, L. Hammarstrom, A. D. Webster, J. Zaloudik, G. Barbanti-Brodani, M. James, G. Russo, et al. The ATM gene and susceptibility to breast cancer: analysis of 38 breast tumors reveals no evidence for mutation. Cancer Res, 56(12):2726-32, 1996.
[77] S. Wacholder, S. Chanock, M. Garcia-Closas, L. El Ghormli, and N. Rothman. Assessing the probability that a positive report is false: an approach for molecular epidemiology studies. J Natl Cancer Inst, 96(6):434-42, 2004.
[78] E. Weterings and D. C. van Gent. The mechanism of non-homologous end-joining: a synopsis of synapsis. DNA Repair (Amst), 3(11):1425-35, 2004.
[79] P. H. Wiernik, S. Q. Wang, X. P. Hu, P. Marino, and E. Paietta. Age of onset evidence for anticipation in familial non-Hodgkin's lymphoma. Br J Haematol, 108(1):72-9, 2000.
[80] K. K. Wong, R. S. Maser, R. M.
Bachoo, J. Menon, D. R. Carrasco, Y. Gu, F. W. Alt, and R. A. DePinho. Telomere dysfunction and Atm deficiency compromises organ homeostasis and accelerates ageing. Nature, 421(6923):643-8, 2003.
[81] C. Wyman, D. Ristic, and R. Kanaar. Homologous recombination-mediated double-strand break repair. DNA Repair (Amst), 3(8-9):827-33, 2004.
[82] Y. Xu, T. Ashley, E. E. Brainerd, R. T. Bronson, M. S. Meyn, and D. Baltimore. Targeted disruption of ATM leads to growth retardation, chromosomal fragmentation during meiosis, immune defects, and thymic lymphoma. Genes Dev, 10(19):2411-22, 1996.

Appendix A Primers and probes

A.1 PCR primers

Listed here are the sequences and annealing temperatures of the primers used to PCR amplify the regions of interest of ATM, prior to sequencing.

A.2 TaqMan assays

This table lists the sequence of the primers and probes used to genotype the 12 selected variants of ATM. These primers and probes were designed and generated by Assays-by-Design(SM) (Applied Biosystems, CA, USA).

Table A.1: PCR primers and conditions

Amplicon        Annealing temperature (C)  Product size (bp)  Forward primer sequence  Reverse primer sequence
Promoter        60  530  TGTAAAACGACGGCCAGTtccttctgtccagcatagcc  CAGGAAACAGCTATGACaacactgccccaaaacattc
Leader exon 1a  60  535  TGTAAAACGACGGCCAGTttccgtcctcagacttggag  CAGGAAACAGCTATGACgacagactgggtcgcacac
Exon 3          61  531  TGTAAAACGACGGCCAGTtcagttccgccaacatactg  CAGGAAACAGCTATGACttttccccatggtgtgactt
Intron 3        61  576  TGTAAAACGACGGCCAGTctgctgcccagatatgacttc  CAGGAAACAGCTATGACcccatcccccaacactatta
Exon 4          60  318  TGTAAAACGACGGCCAGTtttttcacacctctttctctcta  CAGGAAACAGCTATGACcaggcgcttaaatttctcaa
Exon 5          60  408  TGTAAAACGACGGCCAGTtgctgccgtcaactagaaca  CAGGAAACAGCTATGACtgccaaattcatatgcaagg
Exon 6          58  375  TGTAAAACGACGGCCAGTgctctttgtgatggcatgaa  CAGGAAACAGCTATGACcaaacttatgcaacagttaagtcc
Exon 7          58  335  TGTAAAACGACGGCCAGTgccattccaagtgtcttattttt  CAGGAAACAGCTATGACcagagtgctttctttggtgaag
Exon 8          60  377  TGTAAAACGACGGCCAGTcctttttctgtatgggattatgg  CAGGAAACAGCTATGACctgagtctaaaacatggtcttgc
Exon 9          61  480  TGTAAAACGACGGCCAGTttttctttcagcataccacttca  CAGGAAACAGCTATGACtcaaccagagaaatccagagg
Exon 10         57  383  TGTAAAACGACGGCCAGTtaacgctgatgcagcttgac  CAGGAAACAGCTATGACcacaggttttaaaagcccaaa
Exon 11         57  417  TGTAAAACGACGGCCAGTgtgctgttccactccaacct  CAGGAAACAGCTATGACggtttgggggtagacaaatg
Exon 12         60  601  TGTAAAACGACGGCCAGTtggaaatgatggtgattctct  CAGGAAACAGCTATGACtgatcagggatatgtgagtgtg
Exon 13         60  388  TGTAAAACGACGGCCAGTgccaggcactgtcctgata  CAGGAAACAGCTATGACaaagccatctggcatcaaat
Exon 14         58  331  TGTAAAACGACGGCCAGTtctaggatccaaattttagaagtcaa  CAGGAAACAGCTATGACtgcagctactacccagctaaaa
Exon 15         62  513  TGTAAAACGACGGCCAGTagtctttgaatgatgtagatactagg  CAGGAAACAGCTATGACcctttactgccactttgctt
Exon 16         58  415  TGTAAAACGACGGCCAGTtccaggatatgccaccttta  CAGGAAACAGCTATGACaaagagaaagggttaacctgcat
Exon 17         58  321  TGTAAAACGACGGCCAGTtcaaagtacactgtaaaaagcaatact  CAGGAAACAGCTATGACttgtgacaatcccactgcac
Exon 18         61  335  TGTAAAACGACGGCCAGTagaaaacactgtctgccaagaa  CAGGAAACAGCTATGACtggccttaatttccacattt
Exon 19         56  361  TGTAAAACGACGGCCAGTgtgcccagcctgattaggta  CAGGAAACAGCTATGACgaggcctcttatactgccaaa
Exon 20         56  418  TGTAAAACGACGGCCAGTtgaagaggaggaaatttgagtt  CAGGAAACAGCTATGACtcttcaaagacaccatgtgattc
Exon 21         60  335  TGTAAAACGACGGCCAGTgcacccggcctatgttta  CAGGAAACAGCTATGACtcttggtcacgacgatacaa
Exon 22         60  396  TGTAAAACGACGGCCAGTgatgttcttgaacttctgaaacca  CAGGAAACAGCTATGACtggatacaaaacttgcattcg
Exon 23         62  247  TGTAAAACGACGGCCAGTttcagtgagttttctgagtgcttt  CAGGAAACAGCTATGACtcattaacaaacaaagactgcttta
Exon 24         62  317  TGTAAAACGACGGCCAGTgcagtctttgtttgttaatgagtaat  CAGGAAACAGCTATGACttgcaactgtgagctgttactatg
Exon 25         61  305  TGTAAAACGACGGCCAGTtgctttggaaagtagggtttg  CAGGAAACAGCTATGACaccaaacttggtgaagtaatttatg
Exon 26         61  380  TGTAAAACGACGGCCAGTtctggagttcagttgggatttta  CAGGAAACAGCTATGACagtgccactcagaaaatctagc
Exon 27         61  418  TGTAAAACGACGGCCAGTccatctcatagatgaggaaatcaa  CAGGAAACAGCTATGACctggtgaggggacttgctaa
Exon 28         58  507  TGTAAAACGACGGCCAGTctttaatgctgatggtattaaaacag  CAGGAAACAGCTATGACtgattaccacaagctaagtttcaa
Exon 29         61  296  TGTAAAACGACGGCCAGTtgagctgtcttgacgttcaca  CAGGAAACAGCTATGACagacattgaaggtgtcaacca
Exon 30         56  406  TGTAAAACGACGGCCAGTggtttttgaatttgggggtta  CAGGAAACAGCTATGACtgaatgttttcctttttaattatgag
Exon 31         58  407  TGTAAAACGACGGCCAGTaaagtgtatttattgtagccgagtat  CAGGAAACAGCTATGACgcggacagagtgagtctttg
Exon 32         62  395  TGTAAAACGACGGCCAGTggcatataagaattagagatgctgaac  CAGGAAACAGCTATGACcataaaacactcaaatccttctaaca
Exon 33         61  418  TGTAAAACGACGGCCA
G T a a g c t g g g t a t c t t a g a c g t a a C A G G A A A C A G C T A T G A C t g c t a g a g c a t t a c a g a t t t t t g a E x o n 34 58 420 T G T A A A A C G A C G G C C A G T a a c a t t g t a g g g t t t g c a g t C A G G A A A C A G C T A T G A C a a a c c a a g a g c a a g a c t t t g E x o n 35 56 374 T G T A A A A C G A C G G C C A G T t g a g c t a c t c a t g a c t t a a a a c c t C A G G A A A C A G C T A T G A C a t a c t a c a g g c a a c a g a a a a c a t a E x o n 36 62 453 T G T A A A A C G A C G G C C A G T g g t t a a t t c t t g a a g t a c a g a a a a a c a C A G G A A A C A G C T A T G A C a a g a a t t t t c a t a a a g a c a c t g a g a t t E x o n 37 62 315 T G T A A A A C G A C G G C C A G T c a g t g g a g g t t a a c a t t c a t c a a g C A G G A A A C A G C T A T G A C t g a c c c a c a g c a a a c a g a a E x o n 38 62 389 T G T A A A A C G A C G G C C A G T t t g t g t a g g a a a g g t a c a a t g a t t t c C A G G A A A C A G C T A T G A C a a c a g t t t g a g t g g g g g t g a E x o n 39 58 379 T G T A A A A C G A C G G C C A G T a t a t g t c a a c g g g g c a t g a a C A G G A A A C A G C T A T G A C g g g a t t c c a t c t t a a a t c c a t c E x o n 40 58 285 T G T A A A A C G A C G G C C A G T g a a g g a a g a a g g t g t g t a a g c a C A G G A A A C A G C T A T G A C t g c a a c a c c t t c a c c t a a a a t E x o n 41 58 415 T G T A A A A C G A C G G C C A G T g g a a a t g t g g t t t t t g g g a a t C A G G A A A C A G C T A T G A C g g g a a c a g g a g g c a a a a t a a E x o n 42 56 272 T G T A A A A C G A C G G C C A G T a g t a t a t g t a t t c a g g a g c t t c c a a C A G G A A A C A G C T A T G A C g g c a t c t g t a c a g t g t c t a t a a c a a a E x o n 43 58 298 T G T A A A A C G A C G G C C A G T t t g g g a g t t a c a t a t t g g t a a t g a t a c C A G G A A A C A G C T A T G A C g c t t t g g g t t t t a c a c a c a c a t E x o n 44 56 272 T G T A A A A C G A C G G C C A G T a a t t c t g t t t a 
t g a a g g a g t t a t g t g C A G G A A A C A G C T A T G A C c c a a c a t a c t g a a a t a a c c t c a g c E x o n 45 52 392 T G T A A A A C G A C G G C C A G T c c a g c t g a t a t t t t g g g a t t t t C A G G A A A C A G C T A T G A C g a g a a a a a c a g t t g t t g t t t a g a a t g a E x o n 46 56 284 T G T A A A A C G A C G G C C A G T t t g t c c t t t g g t g a a g c t a t t t C A G G A A A C A G C T A T G A C t t c a g a a a a g a a g c c a t g a c a E x o n 47 52 308 T G T A A A A C G A C G G C C A G T a a a c a t t t a t t t c c c t g a a a a c c C A G G A A A C A G C T A T G A C t g c c c g g c c t a t a g t t t t t a t E x o n 48 56 481 T G T A A A A C G A C G G C C A G T t t c t a g t c t t g t c a c t a c a a a a g t t c c C A G G A A A C A G C T A T G A C t t c c c t c a g g c t t t c t g t t t E x o n 49 54 409 T G T A A A A C G A C G G C C A G T c c t c a a t g a a t g g t a g t t g c C A G G A A A C A G C T A T G A C a c t a a t t t c a a g g c t c t a a t a a a a E x o n 50 60 284 T G T A A A A C G A C G G C C A G T a t g a a g g g c a g t t g g g t a c a C A G G A A A C A G C T A T G A C g a t c t t g a t g a a a a g a t g a a g c a t E x o n 51 56 410 T G T A A A A C G A C G G C C A G T a c c t t a a t t t g a g t g a t t c t t t a g C A G G A A A C A G C T A T G A C g a c c a a g t c a c t c t t t c t a t g c E x o n 52 60 425 T G T A A A A C G A C G G C C A G T c c c t g g g a t a a a a a c c c a a c C A G G A A A C A G C T A T G A C t c c t g a c a t c a a g g g g c t t a E x o n 53 56 350 T G T A A A A C G A C G G C C A G T t t g t g c t a a t a g a g g a g c a c t g t c C A G G A A A C A G C T A T G A C t c c a t t t c t t a g a g g g a a t g g t a t E x o n 54 61 426 T G T A A A A C G A C G G C C A G T c g c t c t a c c c a c t g c a g t a t c C A G G A A A C A G C T A T G A C c c a g c c t t g a a c c g a t t t t a E x o n 55 58 390 T G T A A A A C G A C G G C C A G T t t g t g c a t a a a t t c t g t t t t t c t c C A G G A A A 
C A G C T A T G A C g c t t t t g g a t t a c g t t t g t g a t t E x o n 56 60 285 T G T A A A A C G A C G G C C A G T t g c t t g a c c t t c a a t g c t g t C A G G A A A C A G C T A T G A C g c c a a t a t t t a a c c a a t t t t g a c c E x o n 57 60 385 T G T A A A A C G A C G G C C A G T t c a c a t c g t c a t t t g t t t c t c t g C A G G A A A C A G C T A T G A C a a g a c a a a a t c c c a a a t a a a g c a g E x o n 58 60 285 T G T A A A A C G A C G G C C A G T c t a t t c t c a g a t g a c t c t g t g t t t t t C A G G A A A C A G C T A T G A C c c a a c c a a a t g g c a t c t t t E x o n 59 62 341 T G T A A A A C G A C G G C C A G T a a a t g c t t t g c a c t g a c t c t g a C A G G A A A C A G C T A T G A C g c t g t c a g c t t t a a t a a g c c a t t E x o n 60 62 420 T G T A A A A C G A C G G C C A G T c t g t t c a t c t t t a t t g c c c c t a C A G G A A A C A G C T A T G A C c a c t a t c a t c c c c c t g c a a c E x o n 61 60 326 T G T A A A A C G A C G G C C A G T c a t c a t t t a a g t a g g c t a a a a a t c c t C A G G A A A C A G C T A T G A C a g g c a a a c a a c a t t c c a t g a E x o n 62 60 315 T G T A A A A C G A C G G C C A G T c g t a g g t a a c a t g t g g t t t c t t g C A G G A A A C A G C T A T G A C c c c a g c c c a t g t a a t t t t g a E x o n 63 60 270 T G T A A A A C G A C G G C C A G T a g c a t a g g c t c a g c a t a c t a c a c C A G G A A A C A G C T A T G A C g a c t t c c t g a t g a g a t a c a c a g t c t E x o n 64 58 326 T G T A A A A C G A C G G C C A G T t g g c t t a t t t g t a t g a t a c t g g t t c t C A G G A A A C A G C T A T G A C a a g g c c t t g g g a a t a a g a a a a E x o n 65 60 414 T G T A A A A C G A C G G C C A G T t g c a a a c g a a a t c t c a g g t g C A G G A A A C A G C T A T G A C t g g c a g g t t a a a a a t a a a g g c t a 3 ' U T R l 55 617 T G T A A A A C G A C G G C C A G T a t g g a a a g c t t g g g t g t g a t C A G G A A A C A G C T A T G A C a g g a a a a a t c c a 
a a t a a g t t t c t g 3 ' U T R 2 58 574 T G T A A A A C G A C G G C C A G T t g g t c t t a a g g a a c a t c t c t g c C A G G A A A C A G C T A T G A C t g g a c a g t a c a g a a g g g c t t a a a 3 ' U T R 3 60 508 T G T A A A A C G A C G G C C A G T t g g a t t t t t c c t a g t a a g a t c a c t c a C A G G A A A C A G C T A T G A C c a a a a c t g g a t g a a c a g c c t a t c 3 ' U T R 4 60 493 T G T A A A A C G A C G G C C A G T t t t c a g a t c t c t g t t t c t t g a t g t c C A G G A A A C A G C T A T G A C g c c a t a a a g g t g g g a c a c a t 3 ' U T R 5 60 735 T G T A A A A C G A C G G C C A G T c c a a g g c a a a c a c a c t t c c t C A G G A A A C A G C T A T G A C c t a a g c c c t t c c c t t c c a a c 3 ' U T R 6 58 517 T G T A A A A C G A C G G C C A G T c t g g t t t t t c a t t c c c c t c a C A G G A A A C A G C T A T G A C g g g g a c a g a g a a a t g t t c c a 3 ' U T R 7 58 527 T G T A A A A C G A C G G C C A G T c c c c t c a t t t t t g a c c g t a a C A G G A A A C A G C T A T G A C t c t c c a g a a g t c a a a c c a a g a a 3 ' U T R 8 61 533 T G T A A A A C G A C G G C C A G T c a a a t g g g t g a t t g a g c t t t c C A G G A A A C A G C T A T G A C g c a t a c c a t g c a a g g c t a a a g 3 ' U T R 9 58 481 T G T A A A A C G A C G G C C A G T t g t c t t t a a g a a a g c c c t g a a a C A G G A A A C A G C T A T G A C c c t c a t t t g t c c t t g g c a g t 3 ' U T R I O 61 520 T G T A A A A C G A C G G C C A G T c a g g g t t g c c a t t g t a U c c C A G G A A A C A G C T A T G A C t c t c c c t t a a t c t g g a c a c a a c 3 ' U T R l l 57 521 T G T A A A A C G A C G G C C A G T a a g t t g t c c a a g g c a a g a a g a C A G G A A A C A G C T A T G A C g a t a a t t t c a t t a a g g t g c a a t t a a a a 3 ' U T R 1 2 62 422 T G T A A A A C G A C G G C C A G T a g g t t c a c a a a c t c t t g g t c a C A G G A A A C A G C T A T G A C c c t t a g a a c g a g t c c c a t g c 00 T a b l e A . 
2 : TaqM&n p r i m e r s a n d probes . A s s a y s designee o n the oppos i te s t r a n d are m a r k e d w i t h a n * S N P V I C Probe 6 F A M Probe Forward Primer Reverse Primer C o m m o n / h t S N P s S N P 1 -5144 A / T C C C T C C A T C C C G C G A C C C T C C T T C C C G C G C A G C A T A G C C G G G T C C A A G C C C G G C T T G T A T T G G G T A A S N P 3 -4519 G / A * C T C C C G C G G C C A C C C T C C C G T G G C C A C C G T G G C T A A C G G A G A A A A G A A G G C A G A T C C C G A C T C C T C T C S N P 7 IVS4 (+36) d e l ( A A ) * T T A C T A A T C A C A C T T A T T T C A A T T A C T A A T C A C A C * * A T T T C A A T G A T A G A G C T A C A G A A C G A A A G G T A G T A A A G C G C T T A A A T T T C T C A A C T T C T T T C T G A A A A S N P 32 IVS24 (-8) del (T) T T G C T T G C T T G T T T T A A T T G C T T G C T * G T T T T A A G C T T T G G A A A G T A G G G T T T G A A A T T A G A A A A T T C C C T T C G T G T C C T G G A A C A A T C S N P 47 X39 (5557) G / A T T T T A C T C C A A G A T A C A A A T G T T T T A C T C C A A A A T A C A A A T G G T C A G A C T G T A C T T C C A T A C T T G A T T C A C C C T G A A C A T G T G T A G A A A G C A G A T S N P 58 IVS62 (-55) C / T * C A A T G T T G T C A A C G T A T C T A A T G T T G T C A A C A T A T C T C A T A G G C T C A G C A T A C T A C A C A T G A C T C A C A G C A T C T A G A G T C A A A C A C A T T A T A A A R a r e / P o t e n t i a l l y deleterious S N P s S N P 17 X12 (1541) G / A C C A T A A T T C A G G G T A G T T T C C A T A A T T C A G G A T A G T T T A G C T G A A A A C T T T G G C T T A C T T G G A G C T G A C C C A G T A A A T A A C T T C C A G A A S N P 31 X24 (3161) C / G * C C A T T T T G A A T A A G G A T C A G C C A T T T T G A A T A A C G A T C A G A C C A C A G T T C T T T T C C C G T A G G A C T T C A T T T A C A G G A A A G T C T T T T C C C A T T S N P 40 X31 (4424) A / G C T T T G A T T C A C T A T A T C A A C T T T G A T T C A C T G T A T C A A C G C C T T T G T T C T 
T C G A G A C G T T A T T T A C A G G A T A G A A A G A C T G C T T A T A T A T T G G T C T S N P 49 X40 (5697) C / A T T T T T C C G A T G C T G T T T G T T T C C G A T G A T G T T T G G C A A G A A T G C C T G G G A C T G A G T A G T C C A C A A C A G C A A G C A T T G S N P 50 X40 (5753) G / C T A C A T G A G A A G A C A A A A G C A T G A G A A C A C A A A A G T G T C A G A G T C A G A G C A C T T T T T C C A T C C T A A A C G T A A G A A G C A A C A C T C A S N P 56 X54 (7775) C / G * C A A G C T G A G A G C T T T C A A G C T G A C A G C T T T G G T A G C C A G A A G A A G C A G A A T A A C T T A A A A G G T A C G T A T G T T T A A T C C A A A T A C C T C A Appendix B Correction for multiple testing T h e results of the assoc ia t i on s t u d y were correc ted for the False D i s c o v e r y R a t e ( F D R ) [6]. Table B . l shows the results of t h a t correc t i on . T h i s tab l e shows the s igni f icance values for the odds ra t i os f r o m the assoc ia t i on tests . T h e s e values are c o m p a r e d to values correc ted for the False D i s c o v e r y R a t e ( F D R ) . T o i m p l e m e n t t h i s , for each test , the s igni f icance values are r a n k e d i n order of s igni f i cance . E a c h p -value is t h e n c o m p a r e d to the F D R value c o r r e s p o n d i n g to i t s r a n k . If the observed p -va lue is smal l e r t h a n the F D R va lue , the test is s a i d to r e m a i n s igni f i cant after c o r r e c t i o n for F D R . N o n e of the p-values i n th i s s t u d y r e m a i n e d s igni f i cant after c o r r e c t i o n for F D R , w h i c h m a y be over -conservat ive for a locus w i t h h i g h L D . Table B . i : Significance values for the odds ratios from the association tests corrected for multiple testing using the False Discovery Rate. R a n k F D R F o r m u l a F D R V a r i a n t O b s . p - v a l u e V a r i a n t O b s . p - v a l u e V a r i a n t O b s . 
p - v a l u e Correction for F D R in the overall N H L types A l l N H L B cell N H L T cell N H L 1 0.05 * (1/7) 0.007 rare6 0.077 rare6 0.090 rVS24(-8)del(T) 0.019 2 0.05 * (2/7) 0.014 X39(5557)G/A 0.651 - 5 1 4 4 A / T 0.746 X39(5557)G/A 0.083 3 0.05 * (3/7) 0.021 - 5 1 4 4 A / T 0.678 X39(5557)G/A 0.884 IVS4(+36)del(AA) 0.142 4 0.05 * (4/7) 0.029 IVS4(+36)del(AA) 0.693 IVS4(+36)del(AA) 0.915 rare6 0.348 5 0.05 * (5/7) 0.036 IVS24(-8)del(T) 0.716 IVS24(-8)del(T) 0.948 - 4 5 1 9 G / A 0.613 6 0.05 * (6/7) 0.043 -4519G/A 0.854 - 4 5 1 9 G / A 0.953 - 5 1 4 4 A / T 0.636 7 0.05 * (7/7) 0.050 IVS62( -55 )C /T 0.862 IVS62( -55 )C /T 0.979 I V S 6 2 ( - 5 5 ) C / T 0.768 Correct on for F D R in tests of the different N H L subtypes D L B C L F L L P L 1 0.05 * (1/7) 0.007 IVS24(-8)del(T) 0.125 - 5 1 4 4 A / T 0.511 rare6 0.113 2 0.05 (2/7) 0.014 X39(5557)G/A 0.163 - 4 5 1 9 G / A 0.519 IVS24(-8)del(T) 0.379 3 0.05 • (3/7) 0.021 rareG 0.306 IVS62( -55 )C /T 0.726 - 4 5 1 9 G / A 0.431 4 0.05 * (4/7) 0.029 IVS62( -55)C /T 0.373 IVS4(+36)del(AA) 0.797 X39(5557)G/A 0.500 5 0.05 * (5/7) 0.036 - 4 5 1 9 G / A 0.441 rare6 0.843 - 5 1 4 4 A / T 0.501 6 0.05 • (6/7) 0.043 - 5 1 4 4 A / T 0.569 IVS24(-8)del(T) 0.988 IVS4(+36)del(AA) 0.547 7 0.05 • (7/7) 0.050 IVS4(+36)del(AA) 0.905 X39(5557)G/A 0.993 I V S 6 2 ( - 5 5 ) C / T 0.559 M C L M F Misc B C L 1 0.05 * (1/7) 0.007 rare6 0.026 X39(5557)G/A 0.573 rare6 0.233 2 0.05 • (2/7) 0.014 IVS24(-8)del(T) 0.179 IVS24(-8)del(T) 0.595 X39(5557)G/A 0.340 3 0.05 * (3/7) 0.021 IVS4( + 36)del(AA) 0.265 IVS62( -55 )C /T 0.617 IVS24(-8)del(T) 0.519 4 0.05 * (4/7) 0.029 X39(5557)G/A 0.280 rare6 0.617 IVS4(+36)del(AA) 0.647 5 0.05 * (5/7) 0.036 IVS62( -55 )C /T 0.447 IVS4( + 36)del(AA) 0.665 - 4 5 1 9 G / A 0.762 6 0.05 • (6/7) 0.043 - 5 1 4 4 A / T 0.738 - 5 1 4 4 A / T 0.695 - 5 1 4 4 A / T 0.828 7 0.05 * (7/7) 0.050 - 4 5 1 9 G / A 0.811 - 4 5 1 9 G / A 0.713 I V S 6 2 ( - 5 5 ) C / T 0.889 Continued on next page Table B . 
l — Continued from previous page 00 R a n k F D R F o r m u l a F D R V a r i a n t O b s . p - v a l u e V a r i a n t O b >. p - v a l u e V a r i a n t O b s . p - v a l u e Misc T C L M Z L / M A L T P T C L 1 0.05 * (1/7) 0.007 IVS24(-8)del(T) 0.077 rare6 0.015 IVS4(+36)del(AA) 0.013 2 0.05 * (2/7) 0.014 X39(5557)G/A 0.077 I V S 6 2 ( - 5 5 ) C / T 0.524 X 3 9 ( 5 5 5 7 ) G / A 0.186 3 0.05 * (3/7) 0.021 IVS4(+36)del(AA) 0.293 IVS4(+36)del(AA) 0.651 IVS24(-8)del(T) 0.206 4 0.05 * (4/7) 0.029 - 5 1 4 4 A / T 0.576 - 4 5 1 9 G / A 0.701 rare6 0.237 5 0.05 * (5/7) 0.036 I V S 6 2 ( - 5 5 ) C / T 0.619 - 5 1 4 4 A / T 0.767 - 4 5 1 9 G / A 0.295 6 0.05 * (6/7) 0.043 - 4 5 1 9 G / A 0.644 X 3 9 ( 5 5 5 7 ) G / A 0.924 - 5 1 4 4 A / T 0.338 7 0.05 * (7/7) 0.050 rare6 0.923 IVS24(-8)del(T) 0.988 I V S 6 2 ( - 5 5 ) C / T 0.407 C L L 1 0.05 * (1/7) 0.007 IVS24(-8)del(T) 0.202 2 0.05 * (2/7) 0.014 - 5 1 4 4 A / T 0.208 3 0.05 * (3/7) 0.021 X39(5557)G/A 0.212 4 0.05 * (4/7) 0.029 I V S 6 2 ( - 5 5 ) C / T 0.237 5 6 0.05 * (5/7) 0.05 * (6/7) 0.036 0.043 - 4 5 1 9 G / A rare6 0.240 0.301 7 0.05 * (7/7) 0.050 IVS4(+36)deI(AA) 0.509 Correction for F D R in tests of the different ethnicities Caucasian A l l N H L A s i a n A l l N H L S Asian A l l N H L 1 0.05 * (1/7) 0.007 rare6 0.111 - 5 1 4 4 A / T 0.288 - 5 1 4 4 A / T 0.369 2 0.05 * (2/7) 0.014 I V S 6 2 ( - 5 5 ) C / T 0.724 - 4 5 1 9 G / A 0.355 IVS24(-8)del(T) 0.370 3 0.05 * (3/7) 0.021 IVS4(+36)del(AA) 0.845 IVS4(+36)del(AA) 0.365 X 3 9 ( 5 5 5 7 ) G / A 0.370 4 0.05 * (4/7) 0.029 - 4 5 1 9 G / A 0:854 IVS24(-8)del(T) 0.415 - 4 5 1 9 G / A 0.480 5 0.05 * (5/7) 0.036 X39(5557)G/A 0.932 X 3 9 ( 5 5 5 7 ) G / A 0.434 I V S 6 2 ( - 5 5 ) C / T 0.531 6 0.05 * (6/7) 0.043 IVS24(-8)del(T) 0.966 I V S 6 2 ( - 5 5 ) C / T 0.553 rare6 0.939 7 0.05 * (7/7) 0.050 - 5 1 4 4 A / T 0.987 rare6 0.678 IVS4(+36)del (AA) 0.943 Appendix C Ethics approval T h i s s t u d y was a p p r o v e d b y the j o i n t C l i n i c a l R e s e a 
r c h a n d E t h i c s B o a r d of the B r i t i s h C o l u m b i a C a n c e r A g e n c y a n d the U n i v e r s i t y of B r i t i s h C o l u m b i a . A l l s u b -jects gave w r i t t e n i n f o r m e d consent . A copy of the a p p r o v a l is a t t a c h e d . 
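
The ranking procedure described in Appendix B can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in the study. It assumes that the FDR method cited is the Benjamini-Hochberg step-up procedure at a level of 0.05 with 7 tests per group, which is what the FDR formula column of Table B.1 suggests. The function names fdr_thresholds and benjamini_hochberg are hypothetical, and the p-values are those listed in Table B.1 for All NHL and T cell NHL.

```python
def fdr_thresholds(num_tests, alpha=0.05):
    """Return the threshold alpha * (rank / num_tests) for ranks 1..num_tests."""
    return [alpha * rank / num_tests for rank in range(1, num_tests + 1)]


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per input p-value: True if it remains significant."""
    num_tests = len(p_values)
    thresholds = fdr_thresholds(num_tests, alpha)
    # Indices of the p-values, ordered from most to least significant.
    order = sorted(range(num_tests), key=lambda i: p_values[i])
    # Step-up rule (assumed): find the largest rank whose p-value is at or
    # below the threshold for that rank.
    largest_passing_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= thresholds[rank - 1]:
            largest_passing_rank = rank
    # Every test ranked at or above that position is declared significant.
    significant = [False] * num_tests
    for index in order[:largest_passing_rank]:
        significant[index] = True
    return significant


# Observed p-values copied from Table B.1 (illustration only).
all_nhl = {
    "rare6": 0.077, "X39(5557)G/A": 0.651, "-5144A/T": 0.678,
    "IVS4(+36)del(AA)": 0.693, "IVS24(-8)del(T)": 0.716,
    "-4519G/A": 0.854, "IVS62(-55)C/T": 0.862,
}
t_cell_nhl = {
    "IVS24(-8)del(T)": 0.019, "X39(5557)G/A": 0.083,
    "IVS4(+36)del(AA)": 0.142, "rare6": 0.348,
    "-4519G/A": 0.613, "-5144A/T": 0.636, "IVS62(-55)C/T": 0.768,
}

for label, group in (("All NHL", all_nhl), ("T cell NHL", t_cell_nhl)):
    flags = benjamini_hochberg(list(group.values()))
    survivors = [name for name, keep in zip(group, flags) if keep]
    print(label, "significant after FDR:", survivors)
```

For both groups the list of surviving variants is empty, in line with the statement in Appendix B that no p-value remained significant after correction. Under the step-up rule a test also counts as significant when a less significant test passes its own threshold, which is slightly more permissive than a strict rank-by-rank comparison; for the two groups used here both readings give the same answer.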
