Identification of novel genes affecting body wall muscle in Caenorhabditis elegans Warner, Adam Dennis 2007

IDENTIFICATION OF NOVEL GENES AFFECTING BODY WALL MUSCLE IN CAENORHABDITIS ELEGANS

by

Adam Dennis Warner

B.Sc., University of British Columbia, 2002

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (Genetics)

THE UNIVERSITY OF BRITISH COLUMBIA

March, 2007

© Adam Dennis Warner, 2007

Abstract

Muscular diseases affect many people worldwide. While we have learned much about the sarcomere, the basic building block of muscle cells, numerous questions remain to be answered. In fact, not all of the proteins involved in the formation and maintenance of the sarcomere have been catalogued and studied. We must learn more about proteins expressed in muscle and how they interact so that better treatments for myopathies can be produced. In this thesis, several novel sarcomeric proteins have been identified using Caenorhabditis elegans as a model organism. A list of genes expressed in muscle cells was compiled using available Serial Analysis of Gene Expression (SAGE) and microarray data. By eliminating or severely reducing the expression of each gene using RNA interference (RNAi), we were able to determine which genes were required for proper myofilament organization. Of 23 genes known to affect muscle, 16 were identified using this methodology. In total, 119 genes were found to be necessary for proper myofilament organization, 103 of which are genes without a previously characterized role in muscle. In addition, a bioinformatics-based screen that utilized tissue-specific SAGE data in C. elegans yielded a number of candidate muscle-affecting genes, and one, C28H8.6, was studied further.
C28H8.6 consists of both an 'a' and a 'b' isoform, one of which, C28H8.6a, has 4 LIM domains and bears striking sequence similarity to the focal adhesion protein paxillin. In animals homozygous for a mutation that knocks out both isoforms of C28H8.6, movement is uncoordinated and development is arrested at the first larval stage. We have demonstrated using isoform-specific RNAi that knocking down expression of C28H8.6b has no observable effect on development, while C28H8.6a is required for proper larval growth and movement. Further study of this isoform verified that C28H8.6a co-localizes to dense bodies with α-actinin and is required for proper organization of myofilaments within the sarcomere.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
CHAPTER ONE - INTRODUCTION
  1.1 Caenorhabditis elegans as a model system
  1.2 C. elegans muscle organization
  1.3 Attachment complexes in C. elegans body wall muscle
  1.4 LIM domain proteins in C. elegans muscle
  1.5 A two-pronged approach to identify muscle affecting genes
CHAPTER TWO - MATERIALS AND METHODS
  2.1 An RNAi screen for novel muscle affecting genes
    2.1.1 Identification of muscle expressed genes
    2.1.2 Genome-wide RNAi feeding library
    2.1.3 Experimental procedure
      2.1.3.1 Strains used
      2.1.3.2 Preparation of worms for screening
      2.1.3.3 RNAi feeding
      2.1.3.4 Screening for overt phenotypes
      2.1.3.5 Inspection of myofilament organization
      2.1.3.6 PCR confirmation of RNAi feeding clones
      2.1.3.7 Data storage
  2.2 Identification of the novel dense body protein C28H8.6
    2.2.1 Experimental procedure
      2.2.1.1 Strains used
      2.2.1.2 Construction of RNAi feeding clones
      2.2.1.3 Construction of a Gateway GFP translational fusion
      2.2.1.4 Production of an ok1483 rescuing GFP translational fusion
      2.2.1.5 Microinjection procedure
      2.2.1.6 Microscopic analysis
      2.2.1.7 Antibody staining
CHAPTER THREE - RESULTS
  3.1 RNAi screen for muscle affecting genes
    3.1.1 Age related muscle degeneration in RW1596
    3.1.2 RNAi screen of muscle expressed transcripts
    3.1.3 Rescreen of genes identified in our RNAi screen
    3.1.4 RNAi screen data storage
    3.1.5 PCR confirmation of RNAi feeding clones
  3.2 Identification of a novel LIM domain dense body protein
    3.2.1 SAGE expression of C28H8.6
    3.2.2 C28H8.6 gene model
    3.2.3 Production of strain DM7082 through mutant rescue
    3.2.4 Phenotype of C28H8.6 mutants
    3.2.5 C28H8.6 RNAi analysis
      3.2.5.1 RNAi affecting both splice variants of C28H8.6
      3.2.5.2 Isoform specific RNAi of C28H8.6
      3.2.5.3 Myofilament analysis in C28H8.6 depleted animals
    3.2.6 Subcellular localization of C28H8.6a::GFP
    3.2.7 Co-localization of C28H8.6a::GFP with α-actinin
CHAPTER FOUR - DISCUSSION
  4.1 RNAi screen for muscle affecting genes
    4.1.1 Genes identified as muscle affecting using a sensitive RNAi screen
    4.1.2 Genes not identified as muscle affecting
  4.2 Identification of C28H8.6 as a novel dense body protein
    4.2.1 LIM proteins in C. elegans
    4.2.2 Isoforms of C28H8.6
    4.2.3 C28H8.6a is a paxillin homolog
    4.2.4 Myofilament organization in C28H8.6 deficient animals
CHAPTER FIVE - CONCLUSION
References

List of Tables

Table 1. LIM domain proteins in C. elegans body wall muscle
Table 2. Known muscle genes screened using RNAi
Table 3. LIM domain proteins in C. elegans

List of Figures

Figure 1. Model of a C. elegans sarcomere
Figure 2. Muscle quadrant arrangement within the worm
Figure 3. Sarcomere orientation in C. elegans
Figure 4. Comparison of vertebrate focal adhesions and C. elegans dense bodies
Figure 5. Assembly pathways for dense bodies and M-lines
Figure 6. Myofilament organization in RNAi control animals
Figure 7. Age related decline of myofilament structure in RW1596 muscle
Figure 8. Myofilament defects observed in RNAi affected animals
Figure 9. Results from an RNAi screen targeting 3301 genes
Figure 10. RNAi muscle screen online database
Figure 11. SAGE expression of known muscle genes
Figure 12. Elevated muscle cell SAGE expression of C28H8.6
Figure 13. Models of the organization and structure of the C28H8.6 gene and protein isoforms
Figure 14. C28H8.6 mutant animals
Figure 15. Production of two isoform specific C28H8.6 RNAi constructs
Figure 16. Myofilament disorganization in C28H8.6 deficient animals
Figure 17. Localization pattern of C28H8.6a::GFP
Figure 18. Co-localization of α-actinin with C28H8.6a::GFP
Figure 19. Classes of muscle affecting genes identified using RNAi
Figure 20. Bioinformatics based screen for muscle affecting genes
Figure 21. Isoforms of C28H8.6
Figure 22. Protein alignment of C28H8.6a and UNC-95 with human paxillin

List of Abbreviations

°C        degrees Celsius
%         percent
µg        microgram
µL        microlitre
µM        micromolar
A-band    anisotropic band
ACTA      alpha-actin protein (Homo sapiens)
ATN       alpha-actinin protein (C. elegans)
BSA       bovine serum albumin
cDNA      complementary DNA
CGC       Caenorhabditis Genetics Center
DABCO     1,4-diazabicyclo[2.2.2]octane
DEB       dense body protein
dH2O      distilled water
Dim       disorganized muscle
DL        dorsal left
DMD       Duchenne Muscular Dystrophy
DNA       deoxyribonucleic acid
DR        dorsal right
Dpy       dumpy
dsRNA     double stranded ribonucleic acid
ECM       extracellular matrix
EDTA      ethylenediaminetetraacetic acid
Emb       embryonic lethal
FACS      fluorescence activated cell sorting
EST       expressed sequence tag
FAK       focal adhesion kinase
FO        fibrous organelle
GFP       green fluorescent protein
HC        high class
I-band    isotropic band
IC        intermediate class
IPTG      isopropyl-beta-D-thiogalactopyranoside
kb        kilobase pairs
KOH       potassium hydroxide
ILK       integrin-linked kinase
L1        larval stage 1
LIM       lin-11 isl-1 mec-3
LC        low class
Lva       larval arrest
Lvl       larval lethal
M9        minimal media salt solution 9
mg        milligram
mL        millilitre
mM        millimolar
M-lines   mittellinie
Mup       muscle positioning defective
N         normal
ng        nanogram
NGM       nematode growth medium
ORF       open reading frame
Pat       paralyzed arrested elongation at two fold
PCR       polymerase chain reaction
PHP       hypertext preprocessor
pmol      picomole
RNA       ribonucleic acid
RNAi      RNA interference
Rol       Roller
SAGE      serial analysis of gene expression
SQL       structured query language
Ste       sterile
TAE       tris-acetic acid-EDTA
TBS       tris-buffered saline
Unc       uncoordinated
UK        United Kingdom
USA       United States of America
UV        ultraviolet
VL        ventral left
VR        ventral right
WT        wild type
Z-disc    zwischenscheibe disc

Acknowledgements

Writing a thesis is not a modest undertaking, and without the help of others it would be nearly impossible. As such, I have many people to thank for their advice and support during the thesis writing process, and their assistance with the experiments that led to the data herein. I would like to express gratitude to my supervisor, Dr. Donald G.
Moerman for providing much more than the expertise, knowledge, and support that is expected of a supervisor. Don was always able to find time to chat when I really needed it, and has a passion for science that is infectious and made coming to work enjoyable and exciting. My committee members Dr. Linda Matsuuchi and Dr. Calvin Roskelley were extremely helpful in providing advice throughout my time as a Masters student and with this thesis directly. Dr. Hugh Brock has been an excellent teacher, and prepared me for what lies ahead. Lastly, I would like to thank Dr. Douglas Allan for taking the time to be a part of my examination committee.

Within the lab, Dr. Barbara Meissner has been a role model for me. While working with her on a good portion of the work in this thesis, I learned a myriad of skills and also saw the work ethic required to succeed. Dr. Teresa Rogalski has also been a great help with this thesis and was very generous and patient in taking the time to teach me lab techniques. Nicholas Dube was integral in many aspects of the work in this thesis and through our time working together has become a great friend. A productive lab is always a team effort, and as such every lab member deserves thanks for any part they have had in this thesis work.

If it were not for Dr. Joan Cossentine I may never have become a scientist. That first job is always tough to get, and I would like to express my deepest gratitude to Joan for giving me a chance. By the same token, if Dr. Jennifer Klenz hadn't given me an opportunity to be her work-study student I would not have been introduced to the worm! Jennifer has always been there for advice and I thank her for that.

My friends and family deserve thanks for their help and distractions during the thesis writing process. Thanks go out to Mike Hoy for hours of distraction while we recorded our latest Christmas album.
My brother Kenton was almost like an extra committee member with the amount of help he gave me looking over my thesis. My two other siblings Rebecca and Craig deserve thanks as well for their encouragement. I'd like to thank my grandparents Ken and Irene Wilson for their love and support, and my late grandfather Ford and his wife Louise for their love and kind words during difficult times. Last but definitely not least, I must thank my parents for everything they have done for me since I was young. They have provided encouragement and every facet of support and helped make me the person I am today. They are amazing parents, and for that I thank them.

Finally, I must say thank you to my partner Shannon. In this whole process, whenever I've been down or unmotivated or just plain grouchy she has been there to pick me up and put everything in perspective with her love and encouragement. Thank you.

CHAPTER ONE - INTRODUCTION

Despite years of intensive work, treatments to alleviate or eliminate debilitating muscular diseases have been difficult to develop. One reason for this lack of progress is that we do not yet have a complete understanding of how muscle is made and maintained. While many of the big pieces of the puzzle are known, we still do not know how many pieces there are or how they fit together to form and maintain a sarcomere, the structural repeat unit responsible for muscle contraction. It is of great importance to learn as much as possible about the proteins involved in sarcomere assembly and maintenance, and their interactions with each other. By carrying out fundamental research that builds a better understanding of subcellular muscle dynamics, we stand a better chance of providing adequate myopathy treatments.

Defects in components of the sarcomere are implicated in over 20 diseases (Nowak et al. 2005).
Mutations affecting the sarcomeric alpha-actin (ACTA1) gene in humans can cause Actin Myopathy due to an excess of actin filaments (Sparrow et al. 2003). Thick filament components are also subject to disease-causing mutations. Laing-type Distal Myopathy develops as a result of mutations affecting the tail portion of the motor protein myosin, and is characterized by weakness in the limbs and neck, followed by the progression of tremors later in life (Lamont et al. 2006; Meredith et al. 2004). Mutations resulting in amino acid substitutions in myotilin have recently been implicated in causing a form of Limb-girdle muscular dystrophy (Gontier et al. 2005). Myotilin is found in zwischenscheibe (Z-disc) complexes that function to anchor actin filaments to the cell membrane. Disease-causing mutations such as those in myotilin show that Z-disc complexes are also susceptible to disease.

Still, while some of the genetic elements responsible for causing myopathies have been pinpointed, we lack a detailed understanding of the underlying mechanisms involved in disease progression and knowledge of key interactions between proteins, many of which have yet to be discovered. To underline our lack of understanding, almost 20 years after the discovery of the actin-associated protein dystrophin (Hoffman et al. 1987), little more than palliative treatment is available for patients afflicted with its associated disease, Duchenne Muscular Dystrophy (DMD). While treatment of a disease such as DMD is driven substantially by drug development, the more we can learn about how a functional sarcomere is formed and the proteins required, the better chance we have of eventually providing adequate treatment to those in need.

1.1 Caenorhabditis elegans as a model system

In order to gain insight into subcellular muscle assembly, structure, maintenance and function, our lab has taken advantage of the beneficial traits of the model organism C. elegans for genetic study.
The worm has many attributes suited to muscle research, which will be outlined below, and these have made it possible to study muscle in ways that would be difficult to achieve in more complex organisms.

The nematode C. elegans is a common roundworm found in soil throughout most of the world (Brenner 1973). Nobel Laureate Sidney Brenner pioneered the use of this worm for the study of animal behaviour and development (Brenner 1973; Brenner 1974). Traits such as a short generation time of 3 days, ease of cultivation, and the ability of C. elegans to mate or self-fertilize have proven to be excellent attributes for use in genetic research. Experimental contributions such as a complete cell lineage (Sulston and Horvitz 1977; Sulston et al. 1983) and the production of thousands of mutant strains have added to the worm's attractiveness as a model organism. More recently, a fully sequenced genome (C. elegans Sequencing Consortium, 1998) and tissue and stage specific gene expression data (McKay et al. 2003; D.G. Moerman lab, unpublished) have become powerful new tools for genetic research.

Dr. Andrew Fire was recently awarded the Nobel Prize for his work in developing a powerful C. elegans technique, RNA interference (RNAi), that has since been used in other species. RNAi can reduce the coding transcript of a targeted gene to almost undetectable levels. When dsRNA is introduced into the worm by injection, by soaking, or by feeding animals bacteria that produce dsRNA, gene-specific transcript knockdown occurs. An RNAi feeding library that contains bacterial clones producing gene-specific dsRNA covers ~90% of the C. elegans genome (Timmons et al. 2001) and is an excellent tool for genetic research in the worm.

One of the most attractive features of the worm is its large number of genes with human homologs. Just over 40% of C. elegans genes have a human homolog (Ahringer 1997; Blaxter 1998).
The level of genetic similarity to humans, coupled with the aforementioned advantages, has led to the development of a community of devoted researchers using the worm in a variety of research areas. One such focus has been the study of muscle development and function. In fact, much of the early elucidation of mammalian muscle organization stemmed from analysis of mutations affecting muscle in C. elegans (Epstein et al. 1974; MacLeod et al. 1977a; Waterston et al. 1980; Zengel and Epstein 1980); some of the earliest work on myosin heavy chain proteins was carried out in C. elegans (Epstein et al. 1974; MacLeod et al. 1977b). Subsequent research then led to a basic model of the major structural components needed to form and maintain the sarcomere.

1.2 C. elegans muscle organization

The worm is a valuable model organism for the study of muscle because its muscle is similar to vertebrate muscle and its semi-transparent cuticle allows muscle structures to be visualized in situ. The basic components within a C. elegans sarcomere have vertebrate counterparts, with only small differences in protein composition and organization.

Contraction of a muscle cell is carried out by the sarcomere. The sarcomere consists of attachment complexes that anchor thin and thick filaments (Figure 1). These filaments are interdigitated, allowing the motor protein myosin that is associated with thick filaments to travel along thin filament actin chains. The mechanical force produced by the movement of myosin along actin filaments produces tension between the adjacent attachment complexes that anchor the filaments. This tension allows the muscle cell to contract. In vertebrates, the primary attachment structures are known as Z-discs and mittellinie (M-lines). In C. elegans these are known as dense bodies and M-lines respectively. The sarcomere is defined as the region between two adjacent dense bodies.
The area associated solely with actin filaments on either side of a dense body is known as an isotropic (I) band, while the area containing overlap between actin and myosin filaments is known as the anisotropic (A) band.

[Figure 1 image not reproduced. Labels: Dense Body, M-Line, Thick Filaments, Thin Filaments, Basal Lamina, Hypodermis, Intermediate Filaments, Cuticle, Half I-Band, H-Zone, A-Band.]

Figure 1. Model of a C. elegans sarcomere
Attachment structures known as dense bodies and M-lines anchor thin and thick filaments respectively. Thin filaments are represented here by dark blue lines, while the thick filaments are represented by light blue lines associated with spheres symbolizing myosin motor proteins. Dense body and M-line anchoring complexes transmit force to the cuticle through association with complexes in the hypodermis that are connected in turn to the cuticle by intermediate filaments.

Muscle cells in the worm can be one of two types: cells with a single sarcomere and cells with multiple sarcomeres. Single sarcomere muscle cells are specialized for smaller areas of contraction rather than global movement of the animal. Some muscle cells of the reproductive system, such as those responsible for contraction of the vulva, as well as the muscles controlling pharyngeal pumping and defecation, contain only a single sarcomere (Moerman and Fire 1997). Much of the work on sarcomere assembly is restricted to the muscle cells that contain multiple sarcomeres and make up body wall muscle in C. elegans. In total, body wall muscle comprises 95 cells, divided into the ventral left (VL), ventral right (VR), dorsal left (DL), and dorsal right (DR) quadrants. The cells are distributed equally among the VR, DL and DR quadrants (24 cells each), but the VL quadrant has only 23 cells (Sulston and Horvitz 1977).
Only 81 of these cells are present in a fully developed embryo, with the remaining 14 cells developing after hatching (Sulston et al. 1983). Post-embryonic growth of all muscle cells also occurs, extending the width from two A-bands to ten A-bands by the time a nematode reaches the adult stage (Mackenzie et al. 1978). Each of these 95 cells is oriented in a flattened formation within its quadrant (Figure 2), with the hypodermis and outer cuticle encompassing the muscle mass (Hresko et al. 1994). An ovoid, flat muscle cell shape is necessary to make optimal contact with hypodermal cells because, unlike vertebrate striated muscle in which cells can fuse to form multinucleated myotubes, C. elegans muscle cells do not fuse with each other. Rather, each cell is connected laterally to the hypodermal cells surrounding the muscle tissue by interactions between dense bodies and hypodermal fibrous organelles (FO), which are similar to hemidesmosomes in vertebrates (Francis and Waterston 1985). In each FO, one subcomplex is found at the inner surface of the cuticle which surrounds the hypodermis, and one subcomplex at the interface between hypodermal cells and the basal lamina of muscle cells (Ding et al. 2004). Intermediate filaments link the two, and this chain of attachments from muscle cell to cuticle allows the kinetic energy produced through contraction of the full complement of sarcomeres in a single cell to be transferred to the worm's outer surface (Bartnik et al. 1986).

[Figure 2 image not reproduced. Labels: Dorsal Nerve Cord, Ventral Nerve Cord.]

Figure 2. Muscle quadrant arrangement within the worm
Muscle cells are arranged into four quadrants that are flattened against hypodermal cells surrounding the muscle tissue (Hresko et al. 1994). An outer cuticle envelops all internal organs and tissues. As body wall muscle matures, a proliferation of muscle arms occurs. Muscle arms are processes that elongate and stretch from muscle cells to a proximal nerve cord in order to make synaptic connections (White et al. 1986). Figure modified with permission from Altun and Hall (2005).

While C. elegans and vertebrates share many similarities in muscle structure, there are some subtle differences. The absence of myotubes in the worm requires that the muscle cell's contractile force be transmitted laterally through dense bodies and M-lines (Francis and Waterston 1985). The myofilament lattice is oriented parallel to the direction of movement, but dense bodies and M-lines are offset by 5-7° (Figure 3) (Mackenzie and Epstein 1980). This offset arrangement is thought to accommodate the sinusoidal movement of the animal by distributing force evenly over the surface of the cuticle (Burr and Gans 1998). It is also possible that the position of dense bodies relies on the shape of muscle cells at the time of association with hypodermal contacts (Moerman and Williams 2006). This 5-7° offset, which gives the muscle its obliquely striated orientation, contrasts with the 90° offset observed in vertebrate cross-striated muscle.

Another difference is the presence of the protein UNC-15/paramyosin in C. elegans (Epstein et al. 1993). UNC-15 forms the inner core of myosin filaments and, because of this additional protein constituent, C. elegans thick filaments have a larger diameter (~14-34 nm) than vertebrate filaments (~14 nm) (Mackenzie and Epstein 1980). C. elegans thick and thin filaments are also ~6x longer than their vertebrate counterparts, with a greater proportion of thin filaments to thick filaments in the worm (12:1) than in vertebrates (6:1) (Moerman and Fire 1997; Waterston 1988). Interestingly, whereas neurons in vertebrates migrate in order to synapse with their target muscle cells, in the worm each body wall muscle cell extends projections to contact a proximal nerve cord (Figure 2) (Dixon and Roy 2005; White et al. 1986).
This unique extension of a muscle cell to synapse with, and receive input from, a motor neuron increases in prominence as the worm grows; the number of muscle arms per cell increases from ~1.7 in the embryo to ~4 in an adult worm (Dixon and Roy 2005).

While it is evident that there are differences such as these between C. elegans and vertebrate muscle, there are enough similarities between the two to allow for genetic studies in the worm. Most importantly, the attachment complexes themselves are very similar in their composition, which makes the C. elegans counterparts an excellent model for muscle studies. Furthermore, the proteins needed to form a functional sarcomere in the worm resemble the proteins required in vertebrate focal adhesions (Figure 4) (Moerman and Williams 2006). Focal adhesions are found in migrating cells involved in a number of processes including tissue repair, immune responses, and tumor formation (Ridley et al. 2003). Movement of such cells involves polymerization of actin filaments and subsequent attachment to the extracellular matrix (ECM) (Burridge et al. 1988; Ridley et al. 2003). The process of energy transmission mediated by focal adhesions is similar to that carried out by the anchorage of filaments to the sarcolemma via attachment complexes in C. elegans.

[Figure 3 image not reproduced. Labels: 3-D view of myofilament lattice and A and I bands; Sarcomere.]

Figure 3. Sarcomere orientation in C. elegans
Adjacent sarcomeres are offset from one another by 5-7° (Mackenzie and Epstein, 1980). The force transmitted by muscle contraction is spread evenly across the cuticle surface. Figure modified with permission from Altun and Hall (2005).

[Figure 4 image not reproduced. Panel titles: Focal Adhesion, Dense Body.]

Figure 4. Comparison of vertebrate focal adhesions and C. elegans dense bodies
Many of the principal components of focal adhesions and dense bodies are conserved. While several additional proteins such as paxillin and focal adhesion kinase (FAK) are required in focal adhesions, the structure and function of each of the represented proteins is very similar. Figure used with permission from Moerman and Williams (2006).

1.3 Attachment complexes in C. elegans body wall muscle

Building a functional sarcomere in C. elegans is a complex process that requires the assembly of two main attachment complexes, the M-line and the dense body. Both of these structures are anchored in the sarcolemma and project inwards to anchor actin or myosin filaments. This is necessary to transmit the force created by the contraction of myofibrils to the hypodermis. Fibrous organelles then complete transmission of force to the cuticle, allowing for movement of the animal (Bartnik et al. 1986; Francis and Waterston 1991). The majority of energy is transferred in this manner; however, complexes very similar in structure to dense bodies, known as attachment plaques, are found at sites of muscle cell-muscle cell adhesion and are able to create tension between adjacent muscle cells (Francis and Waterston 1985). Attachment plaques provide the only mechanism for lengthwise force transmission in the worm, as they link actin filaments in the terminal dense bodies to the muscle cell membrane. Still, dense bodies and M-lines are predominantly responsible for linking the myofibrillar components to the cuticle-associated hypodermal components, thus allowing for sinusoidal movement of the animal.

The initial step in the formation of a dense body or M-line is the deposition of the perlecan homolog UNC-52 in the basement membrane of muscle cells (Mullen et al. 1999; Rogalski et al. 1993).
UNC-52 aggregates into a pattern that corresponds with the future sites of both dense bodies and M-lines (Francis and Waterston 1985). Animals lacking functional UNC-52 display a paralyzed and arrested elongation at two-fold stage of embryogenesis (Pat) phenotype and do not develop into a larval stage worm (Williams and Waterston 1994). Furthermore, antibody staining for other key attachment complex proteins in unc-52 mutants has shown that proper localization of these other components requires deposition of UNC-52 in the basement membrane before sarcomere assembly can proceed (Rogalski et al. 1993). This method of observing localization of key attachment complex proteins in other muscle mutants has established an assembly dependence pathway for both M-lines and dense bodies (Figure 5).

The α and β integrin homologs PAT-2 and PAT-3 are next in the assembly pathway after UNC-52, and are localized in the muscle cell membrane in a pattern corresponding with that of UNC-52 in the basal lamina (Hresko et al. 1994; Rogalski et al. 1993; Williams and Waterston 1994). The aggregation of integrin provides a platform from which subsequent dense body and M-line assembly can proceed. As attachment complex precursors accumulate, the hypodermal protein LET-805/myotactin appears to play a role in patterning the underlying hypodermis with associated FO complexes (Hresko et al. 1999).

[Figure 5 image not reproduced. Panel A, DENSE BODY, labels: UNC-52/perlecan, PAT-3/integrin, DEB-1/vinculin, UNC-112, UNC-95, PAT-4/ILK, PAT-6/actopaxin, talin, UNC-97/PINCH, ACTIN FILAMENTS. Panel B, M-LINE, labels: UNC-52/perlecan, PAT-3/integrin, UNC-112, PAT-4/ILK, PAT-6/actopaxin, UNC-89, MYOSIN FILAMENTS, talin, UNC-97/PINCH.]

Figure 5. Assembly pathways for dense bodies and M-lines
Two parallel pathways are present for dense bodies that are co-dependent for final linkage of actin filaments to the structure (Moerman and Williams, 2006).
Proteins listed within each level of the pathway require the presence of each other for proper function or localization. A similar pathway exists for M-line formation, but with UNC-89 possibly carrying out the linkage of myosin to the M-line as opposed to DEB-1 linkage of actin filaments to the dense body (Benian et al. 1996; Barstead et al. 1991). Solid arrows indicate that assembly can progress after proper localization of previous components, while a dashed arrow indicates a feedback interaction where loss of a protein further down the pathway can have an adverse effect on an earlier component. Figure used with permission from Moerman and Williams (2006).

Additional proteins must associate with either the dense body or M-line before each can carry out its specific task of providing linkage to actin and myosin filaments respectively. Two parallel but dependent pathways are present that allow for the eventual linking of the vinculin homolog DEB-1 with actin filaments (Moerman and Williams 2006). One fork of this pathway involves two proteins, the Mig-2 homolog UNC-112 and the integrin-linked kinase (ILK) homolog PAT-4. These proteins are co-dependent on the presence of each other in order to properly associate with the nascent dense body structure (Mackinnon et al. 2002; Rogalski et al. 2000). Removal of either UNC-112 or PAT-4 blocks the recruitment of the other, and prevents subsequent recruitment of PAT-6/actopaxin (Lin et al. 2003), which forms a complex with UNC-112 and PAT-4. Before actin attachment can occur, however, an additional branch of the pathway must be completed. Recruitment of UNC-95 (Broday et al. 2004) and DEB-1 (Barstead and Waterston 1991) is needed in order to form an interaction with the PAT-2/PAT-3 integrin heterodimer. Both UNC-95 and DEB-1 require the presence of each other for proper assimilation into the maturing dense body structure.
Some properly localized DEB-1 is present in Unc-95 animals however, suggesting that DEB-1 is not completely dependent on UNC-95 for proper integration into the dense body (Broday et al. 2004). Still, these mutants are uncoordinated (Unc), as most DEB-1 remains in the cytosol, resulting in severely disrupted myofilament organization. When both the DEB-1/UNC-95 and UNC-112/PAT-4 pathways are complete, actin filaments can attach properly to the dense body. M-line assembly is similar to that of the dense body. It follows a comparable route to the UNC-112/PAT-4 branch of the pathway, but diverges after PAT-6 recruitment. Instead of DEB-1 or ATN-1 mediating actin attachment, the myosin linker UNC-89 is present in M-lines (Benian et al. 1996). This protein may allow for the final attachment of myosin filaments to the M-line rather than actin filaments. While many components are shared between these two major structures, DEB-1 and UNC-89 appear to be important in carrying out the specific functional role of each.

1.4 LIM domain proteins in C. elegans muscle

In addition to the key structural proteins that make up C. elegans attachment structures, other proteins are present that play a prominent, or even essential role in the formation of functional attachment structures. One emerging group that plays an important role comprises the lin-11 isl-1 mec-3 (LIM) domain containing proteins found in body wall muscle, named for the genes lin-11, isl-1 and mec-3 in which the domain was first detected. LIM domains are approximately 60 amino acids in size, and consist of the consensus histidine and cysteine rich sequence CX2CX16-23HX2CX2CX2CX16-21CX2(C/H/D) in which the positions of the cysteine and histidine residues are highly conserved and the amino acids depicted by an X are variable (Sadler et al. 1992).
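The consensus above can be read as a pattern over one-letter amino acid codes. The following is a minimal sketch, assuming that each X stands for any single residue and that the numeric ranges are inclusive; the function name and the synthetic test sequence are hypothetical and are not taken from any real protein.

```python
import re

# Consensus from the text: CX2CX16-23HX2CX2CX2CX16-21CX2(C/H/D).
# Assumption: X is any single residue; ranges are inclusive.
LIM_CONSENSUS = re.compile(
    r"C.{2}C.{16,23}H.{2}C.{2}C.{2}C.{16,21}C.{2}[CHD]"
)


def find_lim_domains(protein_seq):
    """Return (start, end) spans of non-overlapping consensus matches."""
    return [m.span() for m in LIM_CONSENSUS.finditer(protein_seq.upper())]


# Synthetic sequence built only to satisfy the consensus (not a real protein).
synthetic = (
    "MA"
    + "C" + "AA" + "C" + "G" * 17 + "H" + "AA" + "C"
    + "AA" + "C" + "AA" + "C" + "S" * 18 + "C" + "AA" + "D"
    + "KK"
)

if __name__ == "__main__":
    print(find_lim_domains(synthetic))
```

A pattern match of this kind only flags candidate regions; a real domain assignment would normally rely on a curated profile rather than a bare consensus string.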
Together, these amino acids form two tandem repeat zinc fingers that differ from those found in transcription factors in that they appear to mediate protein-protein interactions rather than directly regulating transcriptional activity (Kadrmas and Beckerle 2004). Proteins containing LIM domains are found in many organisms and have been implicated in a variety of roles depending on the composition of the LIM domains present, as well as any other domains present in the molecule (Kadrmas and Beckerle 2004).

Four LIM domain proteins have been previously described as playing a role in C. elegans body wall muscle (Table 1). Of primary importance is UNC-97, which is the only LIM domain protein required during embryogenesis. Null mutants of unc-97 are Pat, while those carrying the mutant allele su110 are viable but paralyzed as adults, indicating some partial UNC-97 protein function (Hobert et al. 1999; Ken Norman (University of Utah, USA), unpublished). UNC-97 appears to play an integral role in the UNC-112/PAT-6 branch of the dense body assembly pathway, as all three of these proteins have been shown to bind the carboxy region of PAT-4 (Ken Norman (University of Utah, USA), unpublished). This correlates with previous work by Tu et al. (1999) involving human homologs of these proteins in human tissue culture cells. An interaction also exists between UNC-97 and the tandem repeat zinc finger protein UNC-98 (Mercer et al. 2003). Animals lacking functional UNC-98 have disrupted sarcomere organization, and muscle cells contain needle-like aggregations of sarcomeric proteins including UNC-15 (Mercer et al. 2003). Another LIM protein necessary for proper organization of the myofilament lattice is UNC-95 (Broday et al. 2004). Proper recruitment of DEB-1 at normal levels is contingent on the presence of UNC-95.
Lastly, ALP-1 is a LIM protein that has been shown to localize to dense bodies and M-lines, as well as sites of adhesion between muscle cells (McKeown et al. 2006). Homozygous alp-1 mutants are superficially wild type in terms of muscle structure and function, but the localization of ALP-1 protein to specific muscle attachment sites indicates it may carry out a muscle specific function. An interesting similarity between all four of these LIM domain proteins is that, along with their attachment site localization patterns, each has been observed in the nucleus of muscle cells (Broday et al. 2004; Hobert et al. 1999; Ken Norman (University of Utah, USA), unpublished; McKeown et al. 2006; Mercer et al. 2003). To date, no concrete evidence has been established that implicates a specific role for any of these four LIM domain containing proteins in transcriptional regulation. These four proteins are all recognized as being muscle proteins, but they are not the only LIM domain proteins in the worm. In total, there are thirty-two such proteins in the worm, fifteen of which have little or no established data on their localization or function. This large number raises the possibility that one or more of these proteins may carry out a function in C. elegans body wall muscle.

Table 1. LIM domain proteins in C. elegans body wall muscle

                               UNC-95   UNC-97     UNC-98   ALP-1
# of LIM domains               1        5          1        1-4
Other domains                  PDZ (one protein; column not recoverable)
Nuclear localization           yes      yes        yes      yes
M-line localization            yes for three of the four proteins (columns not recoverable)
Dense body localization        yes      yes        yes      yes
Adhesion plaque localization   yes for two of the four proteins (columns not recoverable)
Mutant phenotype               Unc      Unc, Pat   Unc      WT

Table 1 data from: Broday et al. (2004); Hobert et al. (1999); Ken Norman (University of Utah, USA), unpublished; McKeown et al. (2006); Mercer et al. (2003)

1.5 A two-pronged approach to identify muscle affecting genes

Mutational screens have long been the gold standard for genetic analysis. In C.
elegans muscle, much of what we know about the protein composition of adhesion complexes was uncovered in such screens. In order for an embryo to proceed to the larval stage, it must have at least minimal function in its muscle cells to allow for movement and release from the chitin shell of the embryo. Thus genes encoding proteins required for the early formation of a functional sarcomere lead to the embryonic arrest Pat phenotype when mutated. A second group of mutants, the Unc class, exhibit hindered movement or paralysis. These genes are not necessary for the early formation of a striated myofilament array, but are responsible for a number of processes such as the recruitment of DEB-1 in the case of UNC-95 (Broday et al. 2004). Proper positioning of muscle cells can be disrupted when members of the muscle-positioning defective (mup) class of genes are mutated (Goh and Bogaert 1991; Hedgecock et al. 1987), and animals displaying mildly disorganized muscle without an obvious defect in global movement are classified as disorganized muscle (dim) mutants (Rogalski et al. 2003). Still, based on expression data it appears that a number of genes playing a part in the assembly and maintenance of the sarcomere have yet to be characterized. Tissue specific Serial Analysis of Gene Expression (SAGE) and microarray studies have identified over 5000 genes expressed in embryonic muscle cells (McKay et al. 2003; D.G. Moerman lab, unpublished; David Miller III (Vanderbilt University, USA), unpublished), the vast majority of which do not have a previously established muscle role. It is also possible that many muscle genes do not result in an overt phenotype when knocked down and have thus been missed in mutational screens focusing on movement phenotypes. Two known muscle genes, dim-1 and uig-1, are required for proper myofilament organization, but the movement of affected animals appears to be wild type.
Due to limitations inherent in mutational screens for genes affecting C. elegans body wall muscle, we set out to build a better screen that was more sensitive in picking up genes necessary for proper myofilament organization. Specifically, we carried out an RNAi screen to identify muscle affecting genes in C. elegans. We used a screening procedure that was designed to be more sensitive than previous genome wide screens (Kamath et al. 2003; Simmer et al. 2003) by implementing the use of a green fluorescent protein (GFP) tagged myosin strain (RW1596). Use of RW1596 allows for microscopic inspection of myofilament integrity in situ due to fluorescently labeled myosin (MYO-3::GFP) incorporated into thick filaments. Additionally, due to observed lower stability of MYO-3::GFP in living animals, RW1596 may provide a more sensitive background, allowing for identification of genes that may otherwise be missed due to genetic redundancy exhibited in the worm (Lehner et al. 2006b). Screening was carried out on a subset of genes identified as being expressed in body wall muscle (McKay et al. 2003; D.G. Moerman lab, unpublished; David Miller III (Vanderbilt University, USA), unpublished). Our RNAi screen yielded 119 genes required for proper myofilament organization, 103 of which had no previously characterized muscle affecting role in C. elegans.

Additionally, we have employed a bioinformatics based approach, taking into account tissue specific expression data and predicted protein structure to identify a novel component of C. elegans dense bodies. SAGE data from C. elegans embryonic tissue libraries was filtered to compile a list of genes with enriched expression in muscle tissue. These genes were further filtered to isolate genes encoding protein domains found in proteins known to affect muscle. Of primary interest was the gene C28H8.6 which encodes four LIM domains.
C28H8.6 bears a striking similarity to human paxillin, which is a primary component of vertebrate focal adhesions. Analysis of a homozygous mutant provided by the International C. elegans Knockout Consortium, Vancouver (Canada) helped demonstrate that this novel muscle component is necessary for coordinated movement and postembryonic development. Subsequent RNAi analysis established that C28H8.6 is necessary for proper organization of the myofilament lattice, and thus of prime importance to body wall muscle. By carrying out an RNAi screen to identify novel genes affecting body wall muscle in C. elegans, we have provided researchers with a list of 103 gene targets for further study. The work we have carried out to characterize C28H8.6 is an example of the type of further study required to properly characterize each of the 103 novel muscle affecting genes we identified in our RNAi screen.

CHAPTER TWO - MATERIALS AND METHODS

2.1 An RNAi screen for novel muscle affecting genes

2.1.1 Identification of muscle expressed genes

A list of genes with expression in embryonic muscle cells was generated by Kim Wong at the British Columbia Genome Sciences Centre, Vancouver (Canada) using tissue specific expression data. This expression data was produced by utilizing C. elegans embryos harbouring a muscle specific GFP marker. Embryos were disrupted into a suspension of single cells which ranged in age from the two cell embryonic stage to approximately comma stage embryos, and sorted to produce a population of cells of >90% GFP-marked muscle cells (McKay et al. 2003). Subsequent RNA extraction was followed by SAGE and microarray analysis. We considered a gene to be expressed in muscle if it was upregulated in sorted muscle cells compared with whole embryos in 2 of 3 microarray experiments, and was present in at least 1 of 2 SAGE libraries.
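The inclusion rule just stated can be sketched as a simple filter. This is a minimal illustration only: the function names, the record layout, and the gene labels are hypothetical, and "2 of 3" is assumed here to mean at least two of the three microarray experiments.

```python
def is_muscle_expressed(microarray_up, sage_present):
    """Apply the stated inclusion rule to one gene.

    microarray_up: three booleans, True where the gene was upregulated in
        sorted muscle cells relative to whole embryos.
    sage_present: two booleans, True where a tag for the gene was found
        in that SAGE library.
    Assumption: "2 of 3" is read as "at least two of three".
    """
    return sum(microarray_up) >= 2 and sum(sage_present) >= 1


def muscle_gene_list(records):
    """Return the sorted names of genes that pass the rule.

    records: dict mapping gene name -> (microarray_up, sage_present).
    """
    return sorted(
        gene
        for gene, (arrays, sage) in records.items()
        if is_muscle_expressed(arrays, sage)
    )


# Hypothetical example records (gene labels are placeholders).
example = {
    "geneA": ((True, True, False), (True, False)),    # passes both criteria
    "geneB": ((True, False, False), (True, True)),    # only 1 of 3 arrays
    "geneC": ((True, True, True), (False, False)),    # absent from SAGE
}

if __name__ == "__main__":
    print(muscle_gene_list(example))
```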
After compiling all such genes, confirmed ribosomal and mitochondrial associated transcripts were excluded to leave a final tally of 4042 genes.

2.1.2 Genome-wide RNAi feeding library

Gene knockdown was achieved through the use of an RNAi feeding library produced by Dr. Julie Ahringer's lab at the University of Cambridge, Cambridge (UK), publicly available from Geneservice (UK). In total, this library contains ~17,000 frozen bacterial glycerol stock clones in 384 well format, covering 87% of predicted C. elegans genes (Kamath et al. 2003). Each clone was produced by insertion of a corresponding gene fragment into a specialized feeding vector that produces double stranded RNA of the cloned fragment. This was carried out by polymerase chain reaction (PCR) amplification of a 1-2 kb genomic fragment using primers generated by Research Genetics Genepairs. Amplified products were then cloned into an EcoRV digested L4440 (pPD129.36) vector (Timmons and Fire 1998). This resulted in a series of L4440 plasmids with a genomic fragment specific for each gene, cloned in between two T7 promoters present on the vector. Lastly, each construct was purified and transformed into HT115(DE3) bacterial cells which lack dsRNA specific RNase III activity, before final glycerol stock preparation.

2.1.3 Experimental procedure

2.1.3.1 Strains used

RW1596 (myo-3(st386) V; stEx30[myo-3::GFP + rol-6(su1006)])

Observation of myofilament integrity required use of the C. elegans strain RW1596, provided by Pamela Hoppe, Western Michigan University (USA). Although these animals are homozygous for the lethal myo-3(st386) allele, they are viable due to the presence of a wild type copy of the myo-3 gene fused in frame with sequence coding for GFP on the stEx30 array. The GFP-tagged MYO-3 protein is functional and allows for the visualization of the muscle thick filaments using GFP fluorescence.
RW1596 animals are rollers (Rol) due to the presence of the dominant mutant allele rol-6(su1006) within the stEx30 array. Animals not carrying the array are Pat due to lack of functional MYO-3 protein.

2.1.3.2 Preparation of worms for screening

Large quantities of RW1596 animals were grown on two 15 cm nematode growth medium (NGM) petri dishes streaked with E. coli (OP50 strain). Plates were then washed with 10 mL of M9 buffer into a sterile 15 mL polypropylene tube. Treatment with 10 mL of hypochlorite solution (75% dH2O, 20% sodium hypochlorite, 5% 10 N KOH) was then carried out with gentle shaking. After the appearance of a yellow colour in the solution and subsequent loss of visible animal carcasses, the resulting embryos were rinsed 3 times in a minimal salts (M9) buffer to eliminate any remaining hypochlorite solution. Embryos were resuspended in 10 mL of M9 buffer and transferred to a 15 mL polypropylene tube where they were incubated at room temperature overnight with gentle shaking. This allowed for hatching of embryos, and synchronization of worms at the first larval (L1) stage.

2.1.3.3 RNAi feeding

Individual clones from the frozen Geneservice feeding library were picked and grown overnight in Lennox (L) broth containing 50 µg/mL ampicillin. Specialized NGM plates were prepared containing 1 mM isopropyl-beta-D-thiogalactopyranoside (IPTG) to induce production of dsRNA in bacteria, and 50 µg/mL carbenicillin to select for bacteria containing an RNAi construct. These plates were then streaked with 50 µl of overnight culture and incubated at room temperature overnight to allow for dsRNA production. Approximately 20 L1 worms were spotted onto each RNAi plate by transferring a small aliquot of the previously prepared M9 solution containing the hatched animals. These plates were then incubated at 20°C until the worms reached the young adult stage (~60-68 hours).
Four worms from each plate were then transferred to fresh RNAi plates corresponding to the same feeding construct (2 plates, each with 2 worms), and left for ~18 hours to lay embryos. These animals were then removed, and the newly laid embryos were incubated at 20°C until they reached the fourth larval stage (L4)/young adult stage (~36 hours). A small subset of feeding clones had such a strong RNAi effect that animals did not proceed to the L4/young adult stage within this time frame.

2.1.3.4 Screening for overt phenotypes

After maturation to the L4/young adult stage, worms were screened for phenotypes visible under a dissecting microscope (Wild Heerbrugg model). These included embryonic lethality (including Pat), uncoordinated movement (Unc), paralysis (Prz), larval arrest (Lva), larval lethality (Lvl), slow growth (Slo), sterility (Ste), body morphology defects (Bmd), and hyperactive behaviour (Hya).

2.1.3.5 Inspection of myofilament organization

Worms fed bacteria producing dsRNA for each gene tested were prepared for microscopic analysis by transferring a minimum of 20 animals into a 15 µl drop of M9 containing 10% sodium azide on a glass microscope slide. A 24 mm x 24 mm coverslip was pinched between the thumb and forefinger so that it stuck slightly to the side of one finger as it was laid gently onto the slide surface, thus minimizing pressure on the specimens. Slides were then mounted on a compound fluorescent microscope (Zeiss Axiophot D-7082 Oberkochen). Myofilaments were inspected at 200x magnification, and images were taken at 400x magnification using a Qimaging QICAM digital camera running Qcapture version 1.68.4. Integrity and organization of the myofilament lattice were noted, including large aggregations of protein product within or beside the filaments, reduced protein levels (fluorescence), general disorganization of the structure, and gaps in the normal expression pattern (see Figure 8, Chapter III: 3.1.2).
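The per-slide scoring used in this section can be sketched as a small tally. This is an illustration under stated assumptions: the defect labels, function names, and example counts are hypothetical, while the minimum of 20 animals per slide and the >50% cutoff for calling a gene 'muscle affecting' are the values given in this section.

```python
# Hypothetical short labels for the four defect categories noted per animal.
DEFECTS = {"aggregation", "reduced_fluorescence", "disorganization", "gaps"}


def fraction_affected(animals):
    """animals: list of sets of defect labels, one set per animal scored."""
    affected = sum(1 for defects in animals if defects & DEFECTS)
    return affected / len(animals)


def is_muscle_affecting(animals, cutoff=0.5, min_animals=20):
    """True when more than `cutoff` of the animals show at least one defect."""
    if len(animals) < min_animals:
        raise ValueError("too few animals scored on this slide")
    return fraction_affected(animals) > cutoff


# Hypothetical slide: 11 of 20 animals show at least one defect.
slide = [{"gaps"}] * 6 + [{"aggregation", "disorganization"}] * 5 + [set()] * 9

if __name__ == "__main__":
    print(fraction_affected(slide), is_muscle_affecting(slide))
```

Using a strict inequality means a slide with exactly half of its animals affected is not called, which matches the ">50%" wording.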
Slides that had >50% of animals displaying one or more of these defects had their corresponding gene termed 'muscle affecting.' A similar rescreening procedure was then carried out for these genes, with at least 30 animals screened and a more intensive study of myofilament integrity.

2.1.3.6 PCR confirmation of RNAi feeding clones

We confirmed the presence of correct DNA fragments in the RNAi feeding clones using polymerase chain reaction (PCR). Sets of primers specific to the genes we identified as affecting muscle were used in 25 µl PCR reactions containing 10 pmol of both forward and reverse primers, 1x PCR buffer, 480 µM deoxynucleotide triphosphates (dNTPs), 1 unit of Taq polymerase, and DNA from the clone being tested. Products were then separated by electrophoresis in a 1% Tris-Acetic acid-EDTA (TAE) agarose gel. The PCR fragments were observed by staining with ethidium bromide and visualizing with an ultraviolet (UV) transilluminator. DNA sequencing was used to confirm positive PCR results.

2.1.3.7 Data storage

Images and observations for each gene screened were archived in an online database. This was created using MySQL 3.23.49, with a web interface written in PHP 4.3.4. Many searchable options are integrated into the database including observed overt phenotypes, penetrance of RNAi for each gene and control constructs, and images of animals fed dsRNA for each muscle affecting gene (see Figure 10, Chapter III: 3.1.4). Hosting was set up on a server maintained by the Department of Zoology, University of British Columbia (Canada). This database was developed in collaboration with Adam Lorch, a research technician in the laboratory of Dr. Donald G. Moerman, University of British Columbia (Canada).

2.2 Identification of the novel dense body protein C28H8.6

2.2.1 Experimental procedure

2.2.1.1 Strains used

N2 (Bristol) is the principal wild-type strain used in C.
elegans research (Brenner, 1974). The VC1012 (tag-327(ok1483)/mT1[dpy-10(e128)] III) strain was provided by the International C. elegans Gene Knockout Consortium, Vancouver (Canada). The ok1483 mutation is a 943 bp deletion within the C28H8.6/tag-327 gene and animals homozygous for ok1483 arrest development as L1 larvae. Balancing of the homozygous lethal genotype was accomplished by crossing in the chromosome III balancer mT1, a dpy-10 marked translocation. The MT2495 [lin-15(n744) X] strain was provided by the Caenorhabditis Genetics Center (CGC), University of Minnesota (USA). This strain is hypersensitive to RNAi while appearing superficially wild-type under normal growth conditions (Lehner et al. 2006a). Strain GE24 [pha-1(e2123ts) III] was used for microinjections because of its temperature sensitive phenotype (Granato et al. 1994). Animals homozygous for pha-1(e2123ts) arrest development during embryogenesis at 25°C, but are viable at 15°C. Adult animals grown at 15°C can be injected with DNA encoding a wild type pha-1 gene and then transferred to 25°C. Only progeny harbouring the injected DNA as an extrachromosomal array survive.

2.2.1.2 Construction of RNAi feeding clones

Within the RNAi feeding library, a clone was available that affected the entire coding region of both isoforms of C28H8.6. In order to target each isoform exclusively however, additional RNAi feeding clones were constructed. Using N2 genomic DNA as a template and Platinum Taq DNA Polymerase Hi-Fidelity (Invitrogen) as a polymerase, PCR was used to amplify ~700 bp gene fragments corresponding to each isoform.
The forward primer (5' CCCGATATCGTTTCGGTTTTTCGGTTTATTG 3') and reverse primer (5' GACGATATCGTATGGCATTAGAGAACTAG 3') amplified a region encompassing the 5th exon of C28H8.6a, while the forward primer (5' CCAGATATCGTCTCTCCTGTCTTTTTGTGG 3') and the reverse primer (5' GTATTAGCAGGTAGTTAGATATCAAAACAA 3') amplified a region covering the 6th exon of C28H8.6b. Each fragment was then subcloned into the pCR-Blunt II-TOPO vector (Invitrogen) using the Zero Blunt TOPO PCR Cloning Kit (Invitrogen). In parallel, the vector L4440 (provided by Andrew Fire, Stanford University (USA)) was digested with EcoRV and treated with Calf Intestinal Alkaline Phosphatase (CIP) before separation in a 1% TAE agarose gel. After digestion of the pCR-Blunt II-TOPO vector to release each isoform specific sequence, products were separated in a 1% TAE agarose gel. All visualization of nucleic acids was accomplished by utilizing SYBR Safe gel stain (Invitrogen) in conjunction with a Safe Imager blue light transilluminator (Invitrogen). Following gel purification of all products, both isoform-specific fragments were ligated into the L4440 vector overnight at 15°C using T4 ligase (Roche). Transformation into DH5α cells and subsequent DNA sequencing yielded successfully created constructs. A final transformation into HT115(DE3) cells was then carried out in preparation for feeding to C. elegans. The construct covering the 5th exon of C28H8.6a was named pDM#867 and the construct covering the 6th exon of C28H8.6b was named pDM#866.

2.2.1.3 Construction of a Gateway GFP translational fusion

A GFP translational fusion for C28H8.6a was produced by taking advantage of the ORFeome (Reboul et al. 2003), publicly available from Open Biosystems (USA). The C. elegans ORFeome consists of 12,625 open reading frames (ORFs) cloned into a Gateway (Walhout et al.
2000) donor vector. In collaboration with Barbara Meissner, University of British Columbia (Canada), a specialized destination vector for ORFeome-derived constructs was created to restrict expression to body wall muscle cells. This vector, named pDM#834, contains ~2 kb of the promoter for the known body wall muscle expressed gene T05G5.1. Downstream from the promoter are two ATT recombination sites (attR) that flank sequence coding for the ccdB gene, lethal to most E. coli cells. Using Gateway LR Clonase (Invitrogen), the donor ORF clone (ORF11108-H02) containing C28H8.6a was transferred into the pDM#834 vector between its attR sites, thus removing the ccdB gene as part of the reaction. The completed construct was named pDM#838, transformed into DH5α cells, and sequenced to confirm the correct reading frame of the ORF.

2.2.1.4 Production of an ok1483 rescuing GFP translational fusion

Construction of a GFP translational fusion able to rescue animals homozygous for the ok1483 allele was accomplished by inserting the full sequence of C28H8.6b along with ~2.4 kb of endogenous promoter region into the pPD95.75 vector provided by Andrew Fire, Stanford University (USA). A forward primer consisting of 5' CATATCGATCCCCCGAAACTTGACTAAGCCGTC 3' and the reverse primer 5' CCATGGCCATCAATAAGCATTTTCCTCTTC 3' were used in a PCR reaction with Platinum Taq DNA Polymerase Hi-Fidelity (Invitrogen) and N2 genomic DNA to amplify the promoter and coding regions of C28H8.6b. The forward primer contained a BamHI site at its 5' end while the reverse primer had a 5' MscI site incorporated. Vector pPD95.75 contains a BamHI cut site upstream of an MscI sequence in its multiple cloning site. Primers were designed to eliminate the stop codon in C28H8.6b while also allowing for it to be inserted into pPD95.75 such that it would be in frame with GFP sequence downstream of the multiple cloning site.
Both the C28H8.6b amplicon and pPD95.75 were digested using BamHI and MscI, and the digested vector was subsequently treated with CIP before all products were loaded and separated by electrophoresis in a 1% TAE agarose gel. All visualization of nucleic acids was accomplished by utilizing SYBR Safe gel stain (Invitrogen) in conjunction with a Safe Imager blue light transilluminator (Invitrogen). After gel extraction, the digested C28H8.6b amplicon and pPD95.75 vector were ligated together using T4 ligase (Roche) at 15°C overnight. Ligation products were then transformed into DH5α cells and sequenced to confirm the presence of the C28H8.6b insert in pPD95.75, as well as incorporation within the proper reading frame. The resulting construct was named pDM#864.

2.2.1.5 Microinjection procedure

Transformation of worms with GFP translational fusion constructs was carried out using microinjection. Two co-injection markers were used in this process: pRF4 [rol-6(su1006dm)], which contains a copy of rol-6 and results in a Rol phenotype in successful transformants, and pBx [pha-1::pha-1(+)], which is able to rescue the embryonic lethal pha-1 phenotype of the strain GE24. One injection mix was prepared containing 45 ng/µl pRF4, 45 ng/µl pBx and 10 ng/µl pDM#838 (resulting in extrachromosomal array raEx55). Another was prepared containing 90 ng/µl pRF4 [rol-6(su1006dm)] and 10 ng/µl pDM#864 (resulting in extrachromosomal array raEx82). The injection mix containing pDM#838 was injected into the gonad of GE24 worms, while the mix containing pDM#864 was injected into VC1012 animals. All injections were carried out using a microinjection setup featuring a Zeiss inverted compound microscope (IM35) by conventional methods (Mello et al. 1991). The two transgenic strains obtained are DM7055 (pha-1(e2123) III; raEx55[pha-1(+); rol-6(su1006); pT05G5.1::C28H8.6a::GFP]) and DM7082 (tag-327(ok1483) III; raEx82[rol-6(su1006); C28H8.6b::GFP]).
The raEx82 extrachromosomal array fully rescues the ok1483 lethal phenotype.

2.2.1.6 Microscopic analysis

All phenotypic analysis of worms homozygous for the mutant allele ok1483, or affected by RNAi targeting C28H8.6, was carried out using a dissecting microscope (Wild Heerbrugg model). Images were taken using a compound fluorescent microscope (Zeiss Axiophot D-7082 Oberkochen) at 400x magnification and a Qimaging QICAM digital camera running Qcapture version 1.68.4. Myofilament organization was also observed in C28H8.6 deficient animals. Due to the arrest of C28H8.6 (ok1483) mutants at the first larval stage, worms could not be screened at the L4/young adult stage as we did for our RNAi screen. The same set of procedures was carried out for myofilament screening of C28H8.6 as that of the previously described myofilament screen (Chapter II: 2.1.3.5) except that L1 worms were analyzed. This required 400x magnification on a compound fluorescent microscope (Zeiss Axiophot D-7082 Oberkochen) for screening and images, which were captured using a Qimaging QICAM digital camera running Qcapture version 1.68.4. To properly assess the localization of C28H8.6::GFP protein produced after successful uptake of microinjected constructs, ~50 worms were transferred with a thin platinum wire into a 15 µl drop of M9 containing 10% sodium azide on a glass microscope slide. Animals were then covered with a 24 mm x 24 mm coverslip and mounted on a compound fluorescent microscope (Zeiss Axiophot D-7082 Oberkochen). Images were taken using a Qimaging QICAM digital camera running Qcapture version 1.68.4 at 400x magnification, and at 1000x using oil immersion.

2.2.1.7 Antibody staining

Slides were coated with 0.1 mg/ml poly-L-lysine (Sigma) and incubated at 37°C for 30 minutes to dry. A mixed population of worms was rinsed from 6 cm NGM plates using M9 buffer into a 1.5 mL microcentrifuge tube, and spun at 1500 rpm for 1 minute.
After aspiration of liquid and an additional rinse with M9 buffer, all remaining liquid was removed and animals were resuspended in 50 µl of a solution containing 4% sucrose and 1 mM ethylenediaminetetraacetic acid (EDTA) pH 7.4. This suspension was then transferred onto poly-L-lysine coated slides in 15 µl volumes and covered with a 24 mm x 48 mm glass coverslip, allowing slight overhang of the coverslip. Worm covered slides were transferred to an aluminum sheet kept at -80°C and immediately placed in a -80°C freezer prior to microscopic analysis. Antibodies were diluted in a solution containing 150 mM NaCl, 50 mM Tris pH 7.8, 0.1% Tween-20, and 1% Bovine Serum Albumin (BSA). The primary antibody used was MH35, an anti-α-actinin mouse antibody (Francis and Waterston 1985). A goat anti-mouse secondary antibody conjugated with Alexa 568 was used to visualize MH35. Assessment of α-actinin co-localization with GFP fluorescence in DM7055 animals was carried out by staining worms with MH35 antibody. Prepared slides kept at -80°C were removed, and a razor blade was inserted under a coverslip overhang. A quick movement released the coverslip while also fracturing the frozen worms. Immediately, slides were placed in a Coplin jar containing -20°C 100% acetone for 4 minutes. Subsequent 1 minute intervals in 75% acetone, 50% acetone, and 25% acetone were followed with a 4 minute incubation in a solution containing 150 mM NaCl, 50 mM Tris pH 7.8 and 0.1% Tween-20 (wash solution). Each glass slide was then wiped dry with filter paper except for the area covered by bound worms. After 50 µl of primary antibody diluted in solution was added, slides were kept in a humidified container for 3 hours. Slides were then placed in a Coplin jar containing wash solution for 4 minutes before being wiped with filter paper, again taking care to not disturb the bound animals.
A 50 µl volume of secondary antibody diluted in solution was added, and slides were kept in a dark humidified chamber for 90 minutes. A further 60 minutes in a Coplin jar containing wash solution was then carried out while ensuring slides were in total darkness. Lastly, slides were removed and 15 µl of anti-bleaching solution (90% glycerol, 2% 1,4-diazabicyclo[2.2.2]octane (DABCO) in Tris-buffered saline (TBS)) was added before slides were covered with a glass coverslip and sealed with nail polish. Analysis of localization patterns for MH35 and C28H8.6a was carried out using a Zeiss Axiovert inverted compound microscope (200M) with a Zeiss Pascal confocal setup (LSM5). Images were taken at 400x magnification using Pascal imaging software version 3.2sp2.

CHAPTER THREE - RESULTS

Prior work by the C. elegans muscle community has led to a basic understanding of how the worm sarcomere is built. However, we feel that additional proteins with important roles must be involved in building and maintaining the myofilament lattice. Animals with moderately disrupted muscle structure appear to move normally (Hikita et al. 2005; Rogalski et al. 2003) and as such, mutational screens focusing on movement phenotypes have missed genes with important but not essential roles. We set out to build upon the work done by others by pinpointing novel genes required for the proper functioning of the sarcomere. Two experimental approaches were carried out to accomplish this goal. First, a sensitive RNAi screen was used to identify genes that were required for proper myofilament organization. The worm strain we used added sensitivity to the screen by allowing us to observe a fluorescently tagged myosin heavy chain protein in live animals. Secondly, SAGE data was used to identify a small number of genes with enhanced muscle specific expression.
After further filtering based on protein structure, the LIM domain protein C28H8.6 was identified as a potential muscle affecting gene. Both the RNAi and bioinformatics based screens were successful in accomplishing the goal of identifying genes that affect body wall muscle in C. elegans.

3.1 RNAi screen for muscle affecting genes

Much of what we know about C. elegans muscle has been derived from mutational screens. While these have been very fruitful, we chose to take a reverse genetics approach to identify additional genes necessary for proper myofilament organization. In collaboration with Barbara Meissner and Nicholas Dube, we focused our attention on a list of genes with demonstrated embryonic muscle cell expression in SAGE and microarray analysis (McKay et al. 2003; D.G. Moerman lab, unpublished; David Miller III (Vanderbilt University, USA), unpublished). After filtering out genes known to be specific to mitochondria and ribosomes, 4042 genes had active transcripts in C. elegans muscle, 3301 of which had available clones in the Ahringer RNAi feeding library.

Prior to screening, control experiments were carried out to assess the effectiveness of our screen and the ability to confidently discriminate between wild type and affected myofilament integrity. Two negative controls were used: worms fed HT115(DE3) cells without a feeding vector, and worms fed HT115(DE3) cells containing the L4440 vector without a genetic insert. Each of these controls was essentially the bacterial strain used in our RNAi experiments (HT115), but with constructs unable to produce dsRNA. Both negative controls provided almost identical results, so the negative control throughout our screen was restricted to HT115(DE3) with an empty L4440 vector. As positive controls, two constructs were employed: HT115(DE3) cells containing an L4440 vector with an insert consisting of GFP sequence, and HT115(DE3) cells harbouring an L4440 vector containing an unc-97 insert.
The GFP targeted construct was used because the transcript coding for MYO-3::GFP has GFP and myo-3 sequences fused together in our worm test strain RW1596. RNAi in C. elegans eliminates the entire mRNA molecule, so the construct targeting GFP destroys the entire myo-3::GFP mRNA molecule, thus depleting animals of necessary myosin heavy chain A. Over 79% of animals fed our GFP targeted feeding clone displayed significant defects in myosin localization (n=306). Similarly, 66% of worms (n=446) fed an unc-97 targeted RNAi clone had defects in the myofilament lattice, which was expected due to the integral role UNC-97 plays in sarcomere assembly. In comparison, the negative control we tested caused little effect. Only 25% of worms (n=641) had observed flaws in myosin localization when fed HT115(DE3) cells that do not produce dsRNA (Figure 6).

[Figure 6 graphic; scale bar 50 µm]

Figure 6. Myofilament organization in RNAi control animals. When RW1596 animals producing GFP tagged MYO-3 are fed E. coli that does not produce dsRNA, body wall muscle myofilaments are arranged in an organized parallel structure (A). When fed unc-97 dsRNA, MYO-3::GFP is observed in large deposits (B). An almost complete loss of MYO-3::GFP is observed in animals fed dsRNA containing GFP sequence (C). Images were taken using a Zeiss confocal microscope.

3.1.1 Age related muscle degeneration in RW1596

The GFP tagged MYO-3 fusion protein made by the strain RW1596 does not appear to be as stable as the native form of the myosin protein. As RW1596 worms age, a decrease in the organization of myofilaments within body wall muscle has been observed using a fluorescent compound microscope (Herndon et al. 2002). In control experiments, we also noticed a breakdown in RW1596 myofilament organization, primarily in older adult worms.
We assessed this progressive disorganization of myofilaments by scoring worms at the fourth larval stage (~2 days old) and in older animals. The percentage of animals with disrupted myofilament organization was ~23% at the L4/young adult stage, just slightly below the observed value for animals fed a negative control in the myofilament screen. At the 1 day adult stage (~3 days old total), this climbed to ~32%, and rose yet again to ~43% in 2 day adults. By the 3 day adult age, ~75% of RW1596 worms had defective myosin localization (Figure 7). It is also possible to observe myofilament integrity using polarized light microscopy. We used polarized light to assess the myofilament lattice in each animal to ensure that fluorescent artefacts were not skewing observations, and results between the two were congruent. In wild type (N2) animals, however, we observed normal myofilament organization in worms as old as 15 days into adulthood. Due to the age related myofilament breakdown in RW1596 worms, we restricted our scoring procedure to L4 animals.

Figure 7. Age related decline of myofilament structure in RW1596 muscle. As RW1596 animals age, the organization of myofilaments decreases. Panel A is representative of an L4 animal with normal structure of the myofilament lattice. In a one day old adult (B) the structure is still fairly stable, but breaks down noticeably in two day old (C) and three day old adults (D). Images were taken using fluorescent compound microscopy.

3.1.2 RNAi screen of muscle expressed transcripts

In a preliminary pass through all 3301 genes available in our RNAi library, genes were placed in one of four categories based on the penetrance observed in tested animals. More specifically, as each animal was observed, its myofilament organization was assessed and judged to be abnormal or wild type.
For each RNAi construct, the number of animals with either abnormal or wild type myofilament organization was tabulated and the percentage of worms in each category calculated. Constructs that resulted in less than 25% of animals with a displayed myofilament defect were termed wild type class. Those that caused defects in 25% to 49.9% of worms were labeled as low class (LC), and dsRNA for genes causing myofilament disruption in 50% to 74.9% of animals were named intermediate class (IC). The remaining genes had dsRNA constructs causing myofilament disorganization in 75% of animals or greater and were termed high class (HC).

A variety of defects in the myofilament lattice organization were observed during the screening procedure. Large aggregations of protein product within or beside the filaments were common in affected animals, as well as general disorganization of the normal discrete alignment of filaments. An overall lower integration of fluorescent myosin into its normal localization was observed in some animals, as well as large gaps in the normal expression pattern (Figure 8). All of these myofilament irregularities were used to score animals as wild type or having abnormal myofilament organization, and thus to label each gene as affecting, or not affecting, body wall muscle in C. elegans.

[Figure 8 graphic; legible panel labels: C09D1.1, M04F3.4]

Figure 8. Myofilament defects observed in RNAi affected animals. When fed dsRNA, worms display a variety of abnormalities in the regular organization of myofilaments in body wall muscle. Using a fluorescent compound microscope, large deposits of MYO-3::GFP were observed in some animals fed dsRNA corresponding to genes in our screen (A). General disorganization (B) and gaps in the regular myofilament structure (C) were also present in some worms. Small aggregations within or close to myosin filaments were also recorded, organized in a pattern perpendicular to the orientation of the filaments themselves (D).
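The penetrance classes defined in section 3.1.2 can be written as a small function. This is only an illustrative sketch: the function name and the example animal counts are hypothetical, and only the class boundaries (25%, 50%, 75%) come from the text.

```python
def penetrance_class(n_abnormal: int, n_scored: int) -> str:
    """Assign one RNAi construct to a screen class from its animal counts.

    Boundaries follow section 3.1.2: <25% wild type, 25-49.9% low class (LC),
    50-74.9% intermediate class (IC), >=75% high class (HC).
    """
    if n_scored <= 0:
        raise ValueError("at least one animal must be scored")
    if not 0 <= n_abnormal <= n_scored:
        raise ValueError("abnormal count must lie between 0 and n_scored")

    penetrance = 100.0 * n_abnormal / n_scored  # percent of animals with defects
    if penetrance < 25.0:
        return "wild type"
    if penetrance < 50.0:
        return "LC"
    if penetrance < 75.0:
        return "IC"
    return "HC"


# Hypothetical counts, for illustration only.
for abnormal, scored in [(5, 30), (10, 30), (15, 30), (27, 30)]:
    print(abnormal, scored, penetrance_class(abnormal, scored))
```

Working from raw counts rather than a pre-rounded percentage keeps a construct that sits exactly on a boundary (for example 15 of 30 animals) in the higher class, which matches the ranges as stated.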
Of 3301 genes screened, 290 were scored as being IC or HC, representing 8.8% of tested constructs. A total of 42 genes fell into the high class while 248 were deemed intermediate class. These genes all produced defects in the organization or localization of MYO-3::GFP in over 50% of the animals, and thus were targeted for further analysis. Of those genes that were not deemed necessary to examine further, 1710 fell into the low class, while 1252 had little or no myofilament abnormalities. A remaining 49 tested genes displayed such a severe embryonic lethal (Emb), larval lethal (Lvl), sterile (Ste) or larval arrest (Lva) phenotype that not enough worms were available to screen. Of these 49 genes, 18 were not tested any further due to previously characterized roles in non-muscle essential functions. The other 31 genes were subjected to further screening along with genes exhibiting myofilament anomalies in 50% of animals or greater.

3.1.3 Rescreen of genes identified in our RNAi screen

Genes with a penetrance of ≥ 50% in our RNAi screen were subjected to an additional, more stringent screen to ensure reproducibility of our results. The scoring procedure included a larger number of animals (n ≥ 30), and required that animals had substantial myofilament defects throughout the animal rather than in a portion of the body wall muscles in each animal. An additional 31 RNAi constructs resulting in non-viable phenotypes in our preliminary screen were also rescreened, with the first generation of animals tested to avoid the subsequent lethal F1 population.

In total, 119 genes passed through both the preliminary screen and rescreen. Of these, 12 fell into the high class, and 107 fell into the intermediate class (Figure 9). A number of the positive preliminary screen genes dropped to a total of below 50% penetrance, but 51 of these genes still had a penetrance of ≥ 45% when animals were fed corresponding dsRNA, making them of interest as well.
Attesting to the efficiency of this screen, 16 of 23 known muscle affecting genes were confirmed. These include unc-15, unc-23, unc-45, unc-52, unc-89, unc-95, unc-97, unc-112, pat-4, pat-6, mup-4, myo-3, vab-10, tnt-3, uig-1, and dim-1. Interestingly, both uig-1 and dim-1 were previously known to not display an overt phenotype but still had disorganized myofilaments, making them more difficult to pick out with a mutational screen focusing on movement phenotypes. Genes such as dim-1 and uig-1 were a gene type we had hoped to pick out with this screening method. Of the 119 genes identified as muscle affecting after the preliminary screen and rescreen, 58 did not display an obvious movement, body morphology, or behavioural phenotype, but did have abnormal myofilament organization. Our ability to identify muscle affecting genes not necessary for proper movement, but necessary for normal myofilament organization, demonstrates the advantages of our RNAi screen when compared with a mutational screen for uncoordinated animals.

[Figure 9 graphic: two pie charts. Preliminary Screen (3301 genes screened), legend: Wild Type, Low Class, Intermediate Class, High Class, Insufficient data; printed slice labels 1252 (38.0%), 1710 (51.8%), 49 (1.4%). Rescreen, legend: High Class, Intermediate Class, Dropouts; printed slice labels 107, 216. RNAi penetrance key: Wild Type 0-25%, Low Class 25-50%, Intermediate Class 50-75%, High Class 75-100%.]

Figure 9. Results from an RNAi screen targeting 3301 genes. After screening of 3301 genes, 42 genes were present in the high class, and 248 genes in the intermediate class. An additional 49 genes were not initially screened due to non-viable phenotypes. These genes, highlighted pictorially by a shaded oval, were screened by observing myofilament organization in animals grown for only one generation on RNAi bacteria, and genes in the high class and intermediate class were rescreened with higher scoring stringency. This resulted in 119 genes deemed to be muscle affecting, represented by the blue and red pie pieces in the rescreen chart.
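The counts reported for the preliminary screen and rescreen can be checked against each other with a few lines of arithmetic. The sketch below uses only numbers stated in sections 3.1.2 and 3.1.3; the variable names are hypothetical, and the rescreen input of 321 constructs is derived here (290 positives plus 31 non-viable constructs) rather than quoted from the text.

```python
# Preliminary screen class counts, as stated in section 3.1.2.
preliminary = {
    "high class": 42,
    "intermediate class": 248,
    "low class": 1710,
    "wild type": 1252,
    "insufficient data": 49,
}

total_screened = sum(preliminary.values())
positives = preliminary["high class"] + preliminary["intermediate class"]
percent_positive = 100.0 * positives / total_screened

# Derived value: positives plus the 31 non-viable constructs that were rescreened.
rescreen_input = positives + 31

# Rescreen outcome, as stated in section 3.1.3.
confirmed = 12 + 107  # high class + intermediate class

# Recovery of previously known muscle affecting genes.
known_recovered_percent = 100.0 * 16 / 23

print(f"total screened: {total_screened}")
print(f"IC + HC: {positives} ({percent_positive:.1f}% of constructs)")
print(f"rescreened (derived): {rescreen_input}")
print(f"confirmed muscle affecting: {confirmed}")
print(f"known genes recovered: {known_recovered_percent:.1f}%")
```

The class counts sum to the 3301 constructs screened, and the combined IC and HC count reproduces the 8.8% figure given in the text.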
3.1.4 RNAi screen data storage

Storage of our experimental results was accomplished by setting up an online database in collaboration with Adam Lorch, a research technician in the laboratory of Dr. Donald G. Moerman, University of British Columbia (Canada). All information was stored in such a way that it can be retrieved using a number of search tools outlined below. Due to the large number of genes that fell slightly below a penetrance level of 50%, a user may set their own cutoff at any desired percentage and search for these genes. In this way, varying levels of stringency can be placed on the data by each individual user. The efficiency of RNAi was found to be slightly variable between each day of screening, so we have provided control data that allows database users to correlate data for each screened gene with corresponding control data. Images are also available within the database for each gene at the time of the preliminary screen and rescreen. These images provide an accurate view of the types of defects produced by each RNAi construct (Figure 10).

Figure 10. RNAi muscle screen online database. Pictured is a screenshot of a web-based database to be published online and made public when completed. This database contains all pertinent results from our RNAi screen for muscle affecting genes. A number of search tools are available in order to provide each user with the data they require in an easy to navigate format. This database was produced by Adam Lorch, laboratory of Dr. Donald G. Moerman, University of British Columbia (Canada).

3.1.5 PCR confirmation of RNAi feeding clones

Confirmation of all clones corresponding to genes identified as muscle affecting was necessary to ensure that each contained the appropriate construct. PCR was used to amplify the genomic insert within each clone, using gene specific primers.
By using gene specific primers and purified RNAi construct DNA, we were able to conclude that an RNAi clone contained the proper genomic insert if amplification occurred. Feeding constructs for which a suitably sized PCR amplicon was not produced were sequenced by taking advantage of T7 promoters flanking the multiple cloning site of the L4440 vector. In total, 17 constructs did not contain the appropriate genomic insert. A portion of these were due to incorrect transfer of clones from the original 384 well format of our RNAi library to a new set of 96 well plates, and some were due to incorrect confirmation at the time of RNAi library construction. One of the RNAi feeding clones contained an insert that is not listed as being present in our RNAi library. Geneservice (UK) has released data indicating which wells contained incorrect clones and where the correct clone was located. All genes listed in the latest version of our online database as being muscle affecting have been updated to reflect the proper genomic sequence within each clone that produced a muscle affecting phenotype.

3.2 Identification of a novel LIM domain dense body protein

Along with our RNAi screen for muscle affecting genes, a bioinformatics based screen was carried out to identify candidate muscle affecting genes to study further. Using a wealth of tissue specific expression data described below, a number of genes with enhanced expression in muscle were identified. Genes with an enriched expression pattern in muscle were then ranked based on a number of criteria including predicted protein structure and homology with human proteins. The top candidate gene C28H8.6 was subjected to further study, and found to be present in body wall muscle dense bodies and required for proper myofilament organization.

3.2.1 SAGE expression of C28H8.6

The gene C28H8.6 was pinpointed as a possible body wall muscle component by using gene expression data as a primary criterion. C.
elegans SAGE expression data was produced through a process of sorting embryonic cells derived from worm strains producing fluorescence in specific tissue types (McKay et al. 2003; D.G. Moerman lab, unpublished). Subsequent RNA extraction and sequencing of SAGE tags produced twelve tissue specific libraries detailing levels of gene expression within each cell type. Careful observation of these data sets (publicly available at http://elegans.bcgsc.ca/home/sage.html) revealed some notable trends, specifically with previously characterized body wall muscle genes unc-95 and unc-97 (Figure 11). Overall expression levels within the two muscle specific SAGE libraries were markedly increased when compared to non-muscle genes, and expression was also higher in the muscle data sets when compared to all other tissue specific libraries. A similar pattern was also seen for C28H8.6. Within 12 embryonic tissue specific SAGE libraries, the sum of all SAGE tags corresponding to C28H8.6 was 55. Of these 55 tags, 29 were within two muscle cell libraries while the remaining 26 were spread throughout the others fairly evenly (Figure 12). Of note, all libraries are normalized to 100,000 tags because some libraries have more total SAGE tags than others; normalization ensures that all libraries are comparable. Some elevated C28H8.6 expression was also seen within the punc-4::GFP neuronal library, but numbers tend to be higher in libraries containing only a few cells within a tissue type due to the overall lower number of genes expressed. Overall, the highly muscle enriched arrangement of tags was intriguing because known body wall muscle genes appear to show muscle enrichment as well.

Figure 11. SAGE expression of known muscle genes. A number of known muscle genes including unc-95 and unc-97 have a muscle specific SAGE enrichment.
The X-axis reflects the tissue types in each SAGE library, while the Y-axis reflects the number of SAGE tags in each library. Red bars represent muscle libraries while dark blue bars represent full embryonic tissues. Light blue bars represent libraries made using cell types comprising only a subset of cells within a tissue.

Figure 12. Elevated muscle cell SAGE expression of C28H8.6. Within embryonic tissue SAGE libraries constructed for C. elegans, C28H8.6 has a majority of SAGE tags within the two muscle specific libraries. The X-axis reflects the tissue types in each SAGE library, while the Y-axis reflects the number of SAGE tags in each library. Red bars represent muscle libraries while dark blue bars represent full embryonic tissues. Light blue bars represent libraries made using cell types comprising only a subset of cells within a tissue.

3.2.2 C28H8.6 gene model

Two gene isoforms are predicted for C28H8.6. Both isoforms code for a 256 amino acid protein and share 4 exons of coding sequence (Figure 13). C28H8.6b contains 6 exons and encompasses the entire coding sequence of C28H8.6a because of the spacing of its 2 exclusive exons. C28H8.6b has been partially confirmed according to cDNA sequencing data produced by Yuji Kohara, National Institute of Genetics, Mishima (Japan). Two LIM domains are encoded by the 4 exons of coding sequence shared between C28H8.6b and C28H8.6a. An additional 2 LIM domains are encoded by the 5th exon of C28H8.6a, giving it a total of 4 LIM domains. Sequencing data from the ORFeome project (Reboul et al. 2003) has confirmed C28H8.6a as an expressed isoform of the gene, even though this is not reflected on the Wormbase website.

[Figure 13 graphic. Panel A: Predicted model of C28H8.6 gene isoforms (C28H8.6b, 4472 bp, 771 bp coding; C28H8.6a, 3784 bp, 771 bp coding; ok1483 deletion marked). Panel B: Predicted model of C28H8.6 protein isoforms (C28H8.6b and C28H8.6a, LIM domains marked, each labeled 256 aa).]

Figure 13.
Models of the organization and structure of the C28H8.6 gene and protein isoforms. Two gene isoforms exist for C28H8.6, both sharing significant coding sequence (A). These isoforms code for two LIM domain containing proteins, for which two LIM domains are shared by both isoforms and two LIM domains are exclusive to the 'a' isoform. The 943bp deletion allele ok1483, represented by a red bar above, was isolated by the International C. elegans Gene Knockout Consortium, Vancouver (Canada) and significantly disrupts the predicted coding sequence of both isoforms.

3.2.3 Production of strain DM7082 through mutant rescue

The International C. elegans Gene Knockout Consortium, Vancouver (Canada) provided the strain VC1012, which carries a 943bp deletion within the coding sequence of C28H8.6. When homozygous for the mutant allele ok1483 on chromosome III, animals arrest at the L1 stage of development. This allele was balanced with the balancer chromosome mT1 in order to maintain the allele. This posed some problems for mutant analysis, however, because animals homozygous for mT1 are dumpy (Dpy), referring to their shorter and wider phenotype, and are also not viable. To alleviate the issues caused by the balancer allele, a construct for rescue was engineered that contained the entire coding sequence of C28H8.6b and its promoter region (~2.4kb) in frame with sequence coding for GFP at its 3' end (C28H8.6b::GFP). Since C28H8.6b encompasses C28H8.6a, production of both isoforms was possible within transformed animals, but only C28H8.6b is fused to GFP. C28H8.6b::GFP was combined in solution with the rol-6 co-injection marker pRF4, which upon proper transformation causes animals to roll upon themselves as they move. The C28H8.6b::GFP/pRF4 solution was injected into the gonad of adult VC1012 animals.
Concatamerization of C28H8.6b::GFP and pRF4 produced the extrachromosomal array raEx82, and progeny harbouring the array were fully rescued from the L1 arrest phenotype and were detected by the roller phenotype (Rol). The ability to create populations from an individual hermaphrodite worm homozygous for ok1483 confirmed complete rescue because if the array were not able to rescue, the animals would not lose the mT1 balancer and Dpy worms would be present in the population. The Dpy phenotype was not observed in DM7082 worms and all animals larger than L1 were rollers due to the presence of pRF4 in the raEx82 array. Because the extrachromosomal array was not passed on to all progeny, some DM7082 animals were homozygous for ok1483 but lacked the rescuing array raEx82. These worms were not rollers and still arrested at L1, providing an opportunity to better observe their mutant phenotype in the absence of any other genetic lesions.

3.2.4 Phenotype of C28H8.6 mutants

Most previously characterized C. elegans muscle mutants arrest at the embryonic stage or display uncoordinated movement. However, C28H8.6 mutants arrested at the first larval stage. Using the strain DM7082, the mutant phenotype for this gene was observed and studied. Animals homozygous for the mutant allele ok1483 arrested at the first larval stage of development and were less responsive to stimuli. N2 worms respond quickly to touch with a sterile platinum worm pick, but C28H8.6 mutants moved in an uncoordinated fashion with a delayed response. These mutants were primarily observed curled up upon themselves rather than moving in a smooth sinusoidal motion (Figure 14). After repeated prodding, animals moved away from the stimuli but proceeded to curl up upon themselves soon after. Interestingly, homozygous ok1483 animals appeared to continue to feed upon the E. coli bacteria grown on the NGM plates, but did not grow any larger.
These mutant worms continued to live for some time, surviving for days at 20°C. After a period of 3-4 days a small percentage of animals perished, but most survived for what appeared to be a wild type life expectancy.

[Figure 14 graphic; panel D x-axis: Time (hours)]

Figure 14. C28H8.6 mutant animals. Animals with deficient C28H8.6 protein product arrested at the L1 developmental stage, were uncoordinated, and curled up upon themselves. Panels (A), (B) and (C) are Nomarski images of animals after ~48 hours of growth. (A) and (B) show mutant animals while (C) shows a wild type N2 worm. Also pictured is a graph (D) depicting the growth curves of wild type (N2) worms (green line) and mutant C28H8.6 (ok1483) animals (red line).

3.2.5 C28H8.6 RNAi analysis

3.2.5.1 RNAi affecting both splice variants of C28H8.6

Mutagenesis procedures used to create mutations within a gene in C. elegans can also result in disruptions to additional loci because of random untargeted mutagenesis. Because of this, RNAi was used to knock down C28H8.6 transcript levels to determine whether the mutant phenotype was in fact due solely to the deletion in C28H8.6. The Geneservice RNAi library contains a feeding clone containing ~2kb of genomic DNA covering both isoforms of C28H8.6 and associated introns, and this was used to deplete C28H8.6a and C28H8.6b mRNA. To improve efficiency, the RNAi hypersensitive strain MT2495, which carries a mutation in the gene lin-15, was used. Mutations in lin-15 have been shown to maximize the effect of feeding dsRNA to worms; however, the mechanism by which this occurs is unclear, as lin-15 has been shown to be a regulator of the cell cycle (Lehner et al. 2006a). Using the RNAi hypersensitive strain MT2495, almost 100% penetrance of the mutant phenotype was achieved when animals were fed C28H8.6 dsRNA.
Virtually all worms arrested at the L1 stage with the same movement defects observed in homozygous mutant animals, confirming that the mutant phenotype was due solely to a mutation in C28H8.6.

3.2.5.2 Isoform specific RNAi of C28H8.6

Two isoforms are predicted for C28H8.6, so it was not immediately obvious which splice variant is necessary for the worm to proceed past the first larval stage. Each isoform has at least one exon not present in the other, so these were targeted for RNAi analysis. A ~700 bp genomic fragment containing each isoform specific exon was amplified and inserted into the RNAi feeding vector L4440 (Figure 15). After transformation into HT115(DE3) bacterial cells, each clone was fed to RNAi hypersensitive (MT2495) worms to observe the effect of each dsRNA. While the construct targeting C28H8.6b (pDM#866) did not have an observable effect, pDM#867, which encompasses the 5th exon of C28H8.6a, produced a very strong L1 arrest phenotype. The effect was just as strong as that seen with the clone from the Geneservice RNAi library that covers both isoforms. While C28H8.6b may still be an important isoform, C28H8.6a appears to be sufficient and necessary for progression past L1.

Figure 15. Production of two isoform specific C28H8.6 RNAi constructs. Two RNAi constructs were produced, each containing unique sequence present in each of the splice variants (see shaded ovals above). No overlap of sequence was present in the constructs. Each ~700 bp fragment was produced using PCR that targeted the exon and enough flanking region to produce the proper sized insert. Amplicons were ligated into the L4440 feeding vector and transformed into HT115(DE3) bacterial cells.

3.2.5.3 Myofilament analysis in C28H8.6 depleted animals

Organization of the myofilament lattice was monitored in C28H8.6 depleted worms to determine whether its loss would affect the structure or maintenance of the sarcomere.
While C28H8.6 was previously tested in our RNAi screen for muscle affecting genes, disorganization of the myofilaments was only observed in a minority of animals. An L1 arrest phenotype was recorded, but some animals progressed to the L4 stage, so we scored them and found less than 50% had disorganized myofilaments. One issue with this, however, is that worms progressing past L1 must not have been affected substantially by the RNAi treatment. To rectify the issue of scoring unaffected animals, the RNAi treatment was repeated using RW1596 animals, but animals were only scored if they stalled at the L1 stage after 2 days, ensuring that C28H8.6 mRNA had been sufficiently inhibited. Using these criteria, 66% of animals (n=44) displayed defects in myofilament organization along with deposits of mislocalized myosin (Figure 16). Adding significance is the observation that RW1596 animals fed bacteria unable to produce dsRNA displayed little or no disruption of myosin localization or organization, as opposed to the substantial loss of integrity seen in older animals.

Figure 16. Myofilament disorganization in C28H8.6 deficient animals. When animals are fed bacteria not producing dsRNA, worms have normal myofilament organization (A). When fed C28H8.6 dsRNA, RW1596 worms display disruption of the myofilament lattice (B)(C). Minor disorganization is observed as well as deposits of mislocalized MYO-3::GFP. Panels A', B', and C' are enlargements of panels A, B, and C.

3.2.6 Subcellular localization of C28H8.6a::GFP

Determination of the localization pattern of a protein is possible by tagging the protein with GFP and observing its subcellular location in vivo.
Gateway recombination technology was used to transfer the C28H8.6a ORF from its ORF clone, which served as the donor vector, into pDM#834, a construct containing GFP sequence in frame with Gateway recombination sites under the control of the promoter of the body wall muscle expressed gene T05G5.1 (see 2.2.1.3 for details). This provided a construct with the coding sequence for C28H8.6a fused in frame to sequence coding for GFP (C28H8.6a::GFP), all under the influence of the promoter from the gene T05G5.1, which is expressed solely in muscle (Barbara Meissner (University of British Columbia, Canada), unpublished data). Animals harbouring this construct as part of the extrachromosomal array raEx55 express GFP fluorescence in dense bodies (Figure 17). Fluorescence is not observed in muscle cell nuclei, M-lines, or attachment plaques.

Figure 17. Localization pattern of C28H8.6a::GFP. Punctate expression is observed in the body wall muscle of young adult DM7055 animals that harbour a pT05G5.1::C28H8.6a::GFP translational fusion. Shown are portions of five muscle cells. The GFP fusion protein appears to be localized to dense bodies. M-line localization would appear as a line in between each row of dense bodies, and attachment plaque localization would appear as a line between adjacent muscle cells, but neither of these expression patterns was observed. Nuclear expression, which would appear as a single large globular fluorescent signal in each muscle cell, was also not seen. A model of sarcomere orientation within a muscle cell can be seen in Figure 3 to help distinguish between the various structures. This image was taken using a compound microscope at 400x magnification.

3.2.7 Co-localization of C28H8.6a::GFP with α-actinin

Confirmation of C28H8.6a::GFP expression in dense bodies was carried out by immunofluorescence staining with an antibody (MH35) recognizing the actin linking dense body protein α-actinin.
A secondary antibody labeled with Alexa 568 was used to label anti-α-actinin in the red spectrum, and confocal microscopy was used to look for co-localization. Distinct overlap of expression patterns for pT05G5.1::C28H8.6a::GFP and α-actinin (red) can be seen in Figure 18. This co-localization is consistent with C28H8.6a::GFP being present in body wall muscle dense bodies.

Figure 18. Co-localization of α-actinin with C28H8.6a::GFP. pT05G5.1::C28H8.6a::GFP expression (A) was compared with antibody staining for α-actinin (B) tagged with Alexa 568 (red). Punctate fluorescence in both channels was found to be in the same location in the overlay (C), consistent with co-localization.

CHAPTER FOUR - DISCUSSION

A great deal of what we have learned about C. elegans muscle has hinged upon mutational studies focused on Unc and Pat phenotypes. These mutational studies appear to be reaching a saturation point, however, where the vast majority of new muscle mutants found will simply be additional mutations in known muscle genes. As well, not all genes with a functional role in muscle have a visible phenotype (Hikita et al. 2005; Rogalski et al. 2003), and some C. elegans genes may require an interacting gene to be knocked out to elicit a phenotype (Lehner et al. 2006b). In order to overcome these problems, we took advantage of SAGE and microarray expression data (McKay et al. 2003; D.G. Moerman lab, unpublished; David Miller III (Vanderbilt University, USA), unpublished) to identify genes with body wall muscle expression, and to screen each gene for its necessity in maintaining proper myofilament organization within C. elegans body wall muscle. By screening the subcellular muscle structure, we feel we have carried out a more sensitive screen than past mutational approaches.
As a first step towards identification of additional muscle components, we compiled a list of genes with muscle cell expression as determined by SAGE and microarray (in conjunction with Kim Wong at the British Columbia Genome Sciences Centre, Vancouver, Canada). Our list of genes was then screened using RNAi feeding clones to individually knock down 3301 transcripts. The resulting organization and integrity of the myofilament lattice within C. elegans body wall muscle cells was then directly examined microscopically, providing added sensitivity when compared to a standard muscle mutational screen that relies on scoring phenotypes showing disruption of worm movement. After screening 3301 genes using RNAi, we were able to identify 119 genes necessary for proper myofilament organization in C. elegans.

Our list of muscle expressed genes was also used in a bioinformatics based screen to identify muscle affecting genes. SAGE data (D.G. Moerman lab, unpublished) and protein structure predictions (http://wormbase.org) were used to compile a small number of genes with increased SAGE expression levels in muscle libraries that also contained predicted protein domains found in other known body wall muscle proteins. One such gene, C28H8.6, was examined further because of its high muscle specific SAGE expression and predicted LIM domains. We found that one of the protein products of C28H8.6 localized to body wall muscle dense bodies and that it was required for proper myofilament organization.

4.1 RNAi screen for muscle affecting genes

Using RNAi, 3301 C. elegans genes were knocked down and the myofilament organization in affected worms was observed. When compared with the tight register of myofilaments seen in wild type animals, there were a variety of differences in the organization and distribution of MYO-3::GFP protein in animals fed dsRNA for genes in our screen (see Figure 6, Chapter III: 3.1).
These disruptions in the myofilament lattice organization and structure were used to score animals as having abnormal or wild type body wall muscle.

A number of different types of muscle defects were observed in our screen. In particular, large aggregations of MYO-3::GFP within or beside the filaments were common in affected worms (see Figure 8, Chapter III: 3.1.2). Aggregations of MYO-3::GFP could be caused by the inability of MYO-3::GFP to localize properly and completely into discrete thick filaments anchored to dense bodies when a muscle affecting gene was knocked down. Alternatively, if a particular muscle affecting gene was important for maintenance of the sarcomere, MYO-3::GFP may be displaced or degraded if such a gene was knocked down using RNAi.

Additional abnormalities in myofilament structure were also observed in our RNAi screen for muscle affecting genes. We observed an overall reduction in MYO-3::GFP levels in animals fed dsRNA for a number of genes, including the positive control we used that targets GFP. Certainly, loss of fluorescence in animals fed dsRNA corresponding to sequence coding for GFP can be attributed to the subsequent reduction in myo-3::gfp transcript. However, for screened genes displaying reduced MYO-3::GFP there are other possibilities. Knockdown of transcription factors responsible for regulating levels of myo-3 transcript (and consequently MYO-3 protein), or of genes playing a role in folding MYO-3 or maintaining its stability, could conceivably result in reduced MYO-3::GFP protein levels. Some animals with disrupted muscle had large gaps in the myofilament lattice where MYO-3::GFP should have been present (see Figure 8, Chapter III: 3.1.2).
It is possible that loss of a protein involved in the stabilization and attachment of muscle cells to the underlying hypodermis could reduce the buffering of mechanical stress during muscle contraction. Lastly, a general disorganization of the regular alignment of thick filaments in worms was a prominent observation for many screened genes, indicating a possible role for these gene products in cross linking filaments, attaching filaments to dense bodies, or maintaining the structure of the dense bodies themselves.

One common observation with all of the genes we identified as muscle affecting was that animals fed dsRNA corresponding to these genes did not always display one type of muscle disruption. For example, while most animals fed unc-97 dsRNA had large deposits of MYO-3::GFP protein and gaps throughout the myofilament lattice, other animals had severe disorganization instead. It was therefore difficult to make a correlation between the types of myofilament disorganization observed and the role that each gene product has within body wall muscle. Since genes required for the initial formation of the sarcomere cause a Pat phenotype when knocked out, one would expect that such genes would not be picked up in our screen because affected animals would not progress past embryogenesis. Thus, it is likely that the genes we identified are involved in processes further along in the maturation of muscle. Maintenance of muscle structures and turnover of sarcomeric proteins are important processes that could represent the functions of many of these novel muscle affecting genes. Clouding the picture, however, is the role that RNAi penetrance plays. Because RNAi is not effective at completely eliminating transcript levels (Simmer et al. 2003), a small amount of protein can be made in RNAi affected animals.
A good example of this is C28H8.6a, which is essential for progression past the first larval stage of development, but when knocked down using RNAi can result in some L4 animals with both organized and disorganized myofilaments. Therefore, while our RNAi screen was effective at the basic level of identifying muscle affecting genes, ascertaining the specific function of each gene by looking at phenotypes or the types of myofilament disruption observed is not possible. Further work must be done for each of our muscle affecting genes to properly characterize the function they carry out in body wall muscle.

As stated previously, due to the incomplete penetrance exhibited when using RNAi (Simmer et al. 2003), we did not see an abnormal organization of the myofilament lattice in every single animal corresponding to each muscle affecting gene. Some worms had much stronger myofilament disorganization than others, and while most animals fed a particular dsRNA tended to primarily display one specific abnormality such as mislocalization of MYO-3::GFP, other defects were also possible within the same worm. The lack of complete RNAi penetrance and the variety of myofilament abnormalities observed in our screen required that we score each animal as either abnormal or wild type. The specific abnormality observed in each worm using compound fluorescence microscopy was also recorded and images of multiple animals were taken. In doing so, we were able to determine whether a majority of worms had a disruption of their myofilament lattice structure, while also keeping track of the specific defects in each animal. By screening each gene at least twice and requiring that a majority of animals had muscle defects in order for a gene to be termed muscle affecting, we feel that we have isolated a high confidence list of 119 genes.

One of the key components of our RNAi screen was the worm strain RW1596.
RW1596 is homozygous for a null allele of myo-3 but harbours an extrachromosomal array containing a wild type copy of myo-3 fused to GFP. When translated, myo-3::GFP produces a fusion protein capable of carrying out the normal function of MYO-3 protein. Animals that do not receive the extrachromosomal array arrest embryonically, and rescued animals are superficially wild type. It appears, however, that the addition of GFP to the C-terminal end of MYO-3 affects the stability of MYO-3::GFP within thick filaments. We found that while N2 worms display organized body wall muscle filaments as adults, RW1596 animals are prone to a gradual breakdown in myofilament structure as they age (see Figure 7, Chapter III: 3.1.1). Structural integrity of the myofilaments appears normal at the L4 stage, the developmental stage at which we carried out scoring in our screen. It is likely, though, that myosin is not able to pack as tightly when fused to GFP. Within thick filaments, MYO-3 and another heavy chain myosin, UNC-54, must pack tightly with the paramyosin (UNC-15) core (Epstein et al. 1993). One can imagine how a barrel shaped 238 amino acid GFP protein (Chalfie et al. 1994) fused to the C-terminal end of MYO-3 may be a hindrance to optimal packing of thick filaments. MYO-3::GFP is capable of carrying out the function of MYO-3, as evidenced by its ability to rescue a lethal myo-3 mutation, but the presence of GFP fused to MYO-3 appears to contribute to destabilization of the myofilament lattice as worms age (Herndon et al. 2002; D.G. Moerman lab, unpublished). It is therefore likely that RW1596 animals are more sensitive to RNAi targeting muscle affecting genes. In a recent genome wide RNAi screen, various worm strains harbouring a mutation in a single gene were subjected to RNAi towards thousands of genes in order to find new or enhanced phenotypes (Lehner et al. 2006b).
A number of interactions were successfully identified, demonstrating that a lesion in a single gene can provide a base for enhanced sensitivity towards additional genetic changes. In our screen, the strain RW1596 may have facilitated a similar effect. The presence of GFP attached to myosin may have acted in a similar fashion to a minor mutation, slightly reducing the stability of the myofilament lattice and allowing RNAi for genes affecting muscle to have a greater effect. Thus, the use of the strain RW1596 may have aided us by enabling the identification of muscle affecting genes that would otherwise not have caused enough of an effect to be identified.

4.1.1 Genes identified as muscle affecting using a sensitive RNAi screen

Knockdown of mRNA transcripts caused reproducible effects for 119 of the 3301 genes screened using RNAi. Each of the 119 genes was grouped according to similar predicted function based on homology with proteins in other species, and human homology was determined according to ENSEMBL annotations (http://ensembl.org). Some notable trends are apparent in Figure 19. A total of 42 genes have no known function, 16 of which have human homologs, making them excellent candidates for further study. Some expected classes of genes were well represented, such as those coding for LIM proteins, predicted cytoskeleton components, and adapter proteins. Cytoskeletal and adapter proteins, including the LIM class, could play an important role in building and maintaining structural complexes within the sarcomere such as dense bodies, M-lines, and attachment plaques. A number of transcription factors and proteins predicted to bind nucleic acids were also identified as being important for proper myofilament organization in our RNAi screen. This was expected due to their possible function in regulating the expression of body wall muscle genes.
Additionally, classes of genes with a less obvious role in body wall muscle were pulled out in our screen. For example, a total of 15 genes coding for predicted metabolic proteins were identified. While a role for metabolic proteins may not be as immediately apparent as for those with predicted cytoskeleton or adhesion roles, there are examples of known muscle affecting genes whose predicted functions would not suggest a role in the sarcomere, yet which are important sarcomere components. The immunoglobulin-like DIM-1 does not stand out as an excellent muscle candidate based on its domain structure, but is necessary for proper organization of myofilaments (Rogalski et al. 2003). As well, UIG-1 is predicted to be a nucleotide exchange factor, yet has also been found to associate with UNC-112 and to be necessary for proper myofilament organization (Hikita et al. 2005). Evidently, it is possible for a range of predicted gene classes to play an important role in body wall muscle.

Figure 19. Classes of muscle affecting genes identified using RNAi
A large number of genes with unknown function were identified as muscle affecting in our RNAi screen. In total, 42 unknown genes were implicated, 16 of which have clear human homologs according to the ENSEMBL database (http://ensembl.org). All human homologs are labeled as green above. Expected classes such as predicted cytoskeleton interactors and transcription factors were identified, as well as genes coding for LIM domain proteins and molecular adapters. Many classes of genes were represented by the 119 muscle affecting genes, leaving a wide variety of possibilities for future study.

One of the most important benefits of scoring RNAi affected animals by direct observation of the organization of myofilaments was the ability to identify genes that did not elicit a defect in muscle movement. Mutational screens for C.
elegans muscle mutants rely on movement and embryonic arrest phenotypes to identify potential muscle mutants. Mutational screens have identified a large number of currently known muscle affecting genes, but there are some notable muscle mutants that do not display an Unc or Pat phenotype. Two such muscle affecting genes are the previously mentioned dim-1 and uig-1. Mildly disorganized myofilaments are observed in dim-1 and uig-1 mutants (Rogalski et al. 2003; Hikita et al. 2005), but neither was identified in mutational screens directly searching for Unc or Pat animals. In the case of uig-1, interest in the gene was piqued by yeast two-hybrid data which indicated an interaction with the body wall muscle protein UNC-112 (Hikita et al. 2005). Similarly, dim-1 was identified in a screen set up to find UNC-112 interactors; an unc-112 suppressor screen identified dim-1 (Rogalski et al. 2003). After further study, both uig-1 and dim-1 were found to be required for proper myofilament organization using polarized light microscopy to observe body wall muscle structure in mutant animals (Hikita et al. 2005; Rogalski et al. 2003). In our RNAi screen for muscle affecting genes, both uig-1 and dim-1 were identified as muscle affecting genes. In total, 56 genes were identified that did not display a phenotype but had disrupted myofilament organization, representing 47% of the total. It is quite remarkable that animals are able to move normally when their subcellular muscle structure is irregular. We knew from the outset of screening that it was possible for this to occur based on the dim-1 and uig-1 examples. The fact that we picked out both uig-1 and dim-1 in our screen was very important and validated the experimental design.

4.1.2 Genes not identified as muscle affecting

One hundred nineteen genes were identified that were necessary for proper myofilament organization, but we did not identify all known genes affecting muscle in C. elegans.
In total, 16 of 23 genes known to affect body wall muscle were identified (Table 2). Our ability to identify known muscle affecting genes using RNAi validated the experimental design, but also exposed a bias towards false negatives. One of the shortcomings of RNAi is that penetrance is not always 100%, and as a result there are false negatives (Simmer et al. 2003). As well, RNAi feeding constructs are not always equally effective at knocking down gene expression. For example, unc-98 was not identified in our screen as muscle affecting, and previous genome wide screens have also failed to observe an Unc phenotype using the same RNAi feeding clone. Mutations in unc-98, however, result in hindered movement and accumulation of paramyosin complexes that disrupt sarcomere structure (Zengel and Epstein 1980; Mercer et al. 2003). The RNAi feeding construct for unc-98 is evidently not efficiently knocking down gene expression. When comparing RNAi results between genome wide screens in C. elegans (Kamath et al. 2003; Simmer et al. 2003; Sonnichsen et al. 2005), it is apparent that the same phenotypes are not always seen in each screen. Still, while the presence of RNAi false negatives is a disadvantage, it is balanced by the ability to identify novel muscle affecting genes in a manner that would not be possible when relying on mutational screens.

Table 2. Known muscle genes screened using RNAi
In total, 16 of 23 (70%) genes known to affect body wall muscle were identified in our RNAi screen (in bold lettering).
Known muscle gene   Description                                        Muscle affecting in RNAi screen
unc-15              Paramyosin, forms core of thick filaments          Yes
unc-23              Molecular chaperone                                Yes
unc-45              Myosin chaperone                                   Yes
unc-52              Perlecan homolog, muscle cell basement membrane    Yes
unc-89              Myosin linker to M-line                            Yes
unc-95              LIM protein, adapter                               Yes
unc-97              LIM protein, PINCH homolog                         Yes
unc-112             Mitogen inducible gene (MIG-2) ortholog            Yes
myo-3               Myosin heavy chain A, thick filament component     Yes
mup-4               Muscle cell attachment                             Yes
vab-10              Muscle cell attachment                             Yes
                    Troponin, thin filament component                  Yes
uig-1               UNC-112 interactor                                 Yes
dim-1               UNC-112 interactor                                 Yes
pat-4               Integrin-linked kinase, adapter protein
pat-6               α-parvin ortholog
unc-98              UNC-97 interactor                                  No
unc-105             Muscle degenerin, membrane ion channel
hlh-1               MyoD homolog, transcription factor
mlc-3               Myosin light chain                                 No
spc-1               α-spectrin, embryonic actin cable patterning       No
let-805             Myotactin homolog, hypodermal patterning           No
pat-3               β-integrin subunit                                 No

Additional genes affecting body wall muscle were likely missed for reasons other than the variable penetrance of RNAi. In order to identify genes most likely to be muscle affecting, we restricted our screen to genes with SAGE and microarray expression in embryonic muscle cells (McKay et al. 2003; D.G. Moerman lab, unpublished; David Miller III (Vanderbilt University, USA), unpublished). FACS sorting of embryonic cells is necessary to produce tissue specific preparations for some tissues. Manual dissection of tissue types such as the intestine and pharynx is possible in a larval stage animal, but one can imagine the impossibility of manually removing every ciliated neuron within a 1 mm long worm. High throughput production of tissue specific samples for SAGE and microarray analysis required that embryonic cells be used, but not all genes are expressed embryonically. Thus, use of embryonic cells led to a loss of genes with solely postembryonic expression.
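The proportions quoted so far for the RNAi screen follow from a handful of counts given in the text. The short Python sketch below simply recomputes them as a consistency check; the variable and function names are illustrative only and are not part of the original analysis.

```python
# Counts quoted in the text of this chapter.
GENES_SCREENED = 3301       # genes knocked down with RNAi feeding clones
MUSCLE_AFFECTING = 119      # genes scored as muscle affecting
NO_OTHER_PHENOTYPE = 56     # muscle affecting genes with no other phenotype
KNOWN_GENES_TESTED = 23     # previously known muscle genes in the screen
KNOWN_GENES_FOUND = 16      # known muscle genes recovered by the screen


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Fraction of screened genes that were scored as muscle affecting.
hit_rate = percent(MUSCLE_AFFECTING, GENES_SCREENED)
# Fraction of muscle affecting genes that showed no other phenotype.
no_phenotype_share = percent(NO_OTHER_PHENOTYPE, MUSCLE_AFFECTING)
# Fraction of known muscle genes recovered (the screen's apparent sensitivity).
recovery_rate = percent(KNOWN_GENES_FOUND, KNOWN_GENES_TESTED)
# Known muscle genes missed, i.e. observed false negatives.
missed_known = KNOWN_GENES_TESTED - KNOWN_GENES_FOUND

print(f"hit rate: {hit_rate:.1f}%")
print(f"no other phenotype: {no_phenotype_share:.1f}%")
print(f"known genes recovered: {recovery_rate:.1f}%")
print(f"known genes missed: {missed_known}")
```

Rounded to whole percentages, these ratios agree with the 47% and 70% figures reported in the text.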
Issues with both SAGE and microarray procedures also restrict the number of genes that can be identified using each technique. Microarray chips are produced by hybridizing oligonucleotides based on predicted gene coding sequences to the chip surface. This restricts expression readings to annotated genes, and does not allow for data on genes that may have been incorrectly predicted or not predicted at all. SAGE can identify expression tags for such genes because SAGE takes into account almost all mRNA transcripts within the experimental sample (Velculescu et al. 1995). Not every mRNA molecule is picked up using SAGE, however, because the procedure requires that an NlaIII restriction site be present in each transcript (Pleasance et al. 2003; Velculescu et al. 1995). Due to SAGE tags that cannot be mapped unambiguously and the requirement for an NlaIII site within each transcript sequence, up to 15% of all C. elegans genes cannot be detected using SAGE with a 14 bp tag (Pleasance et al. 2003). When a longer 21 bp tag is used, the issue is alleviated substantially, but genes are still missed. Stochastic sampling can cause genes with low expression to have only one or two SAGE tags, or to be missed entirely, even though they are expressed in a particular tissue. The high price of each SAGE library makes multiple replicates of each library prohibitive, so a loss of genes with low expression is unavoidable. Another factor contributing to a reduction in muscle expressed genes was our dual use of SAGE and microarray data to produce a high confidence subset of muscle genes.
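The two SAGE limitations described above, the need for an NlaIII site and tags that cannot be mapped unambiguously, can be illustrated with a small Python sketch. It assumes the usual SAGE convention that a tag begins at the 3'-most NlaIII recognition sequence (CATG) of a transcript and that the quoted tag length (14 or 21 bp) includes those four bases. As a further simplification, a site too close to the 3' end to give a full-length tag is treated as yielding no tag. The function names and toy sequences are hypothetical and are not drawn from the thesis data.

```python
from collections import defaultdict
from typing import Dict, List, Optional

NLAIII_SITE = "CATG"  # recognition sequence of the anchoring enzyme NlaIII


def sage_tag(transcript: str, tag_length: int = 14) -> Optional[str]:
    """Return the tag starting at the 3'-most CATG, or None if none can be made."""
    seq = transcript.upper()
    site = seq.rfind(NLAIII_SITE)  # position of the 3'-most NlaIII site
    if site == -1:
        return None  # no NlaIII site: the transcript is invisible to SAGE
    tag = seq[site:site + tag_length]
    if len(tag) < tag_length:
        return None  # simplification: too little sequence after the site
    return tag


def ambiguous_tags(transcripts: Dict[str, str],
                   tag_length: int = 14) -> Dict[str, List[str]]:
    """Map every tag shared by more than one gene to the sorted gene names."""
    genes_by_tag: Dict[str, List[str]] = defaultdict(list)
    for gene, seq in transcripts.items():
        tag = sage_tag(seq, tag_length)
        if tag is not None:
            genes_by_tag[tag].append(gene)
    return {tag: sorted(genes)
            for tag, genes in genes_by_tag.items() if len(genes) > 1}


# Toy transcripts (hypothetical sequences, for illustration only).
toy = {
    "toyA": "GGGCATGACGTACGTACGGTTAAAAA",
    "toyB": "CCCCATGACGTACGTACTTTTCCCCC",
    "toyC": "TTTTTTTTTTTTTTTTTTTT",  # lacks an NlaIII site
}
print(ambiguous_tags(toy, tag_length=14))  # toyA and toyB share a 14 bp tag
print(ambiguous_tags(toy, tag_length=21))  # the longer tag separates them
```

In this toy set the two transcripts that collide at 14 bp become distinguishable at 21 bp, while the transcript without a CATG site is missed at either tag length.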
When we compiled our list of muscle expressed genes, we required that each gene be upregulated in 2 of 3 microarray data sets specific to muscle cells (David Miller III (Vanderbilt University, USA), unpublished) when compared to baseline expression levels in total embryo preparations, as well as have a SAGE tag in at least one muscle specific SAGE library (McKay et al. 2003; D.G. Moerman lab, unpublished). Since both SAGE and microarray techniques miss a small percentage of genes, by requiring that both data sets be used concurrently we undoubtedly omitted a number of genes that were present in only one of the two data sets but may have been of interest to screen.

Lastly, of the 4042 genes compiled in our list of genes to screen, only 3301 had available RNAi feeding clones in the Ahringer RNAi library. If the 741 genes without feeding constructs were tested and had the same proportion of muscle affecting genes as those in our screen (119 of 3301, or about 3.6%), an additional 27 muscle affecting genes may have been identified.

Taking all caveats into account, we believe that we carried out a highly successful screen that took advantage of the technology currently available while minimizing the number of muscle affecting genes that were not screened and identified. Most importantly, we feel that we minimized the number of false positives by using a strict screening methodology and rescreening all muscle affecting genes, thus ensuring our results were reproducible.

4.2 Identification of C28H8.6 as a novel dense body protein

As the second component of a two-pronged approach to identify muscle affecting genes in C. elegans, an in silico screen of bioinformatic data was used. Using the compiled list of muscle expressed genes used in our RNAi screen, the number of SAGE tags in embryonic libraries was used to identify a list of genes with elevated expression in embryonic muscle cells.
Each highly expressed gene was then individually analyzed for its distribution of SAGE tags within each embryonic SAGE library. Genes with a highly elevated tag count in both muscle specific SAGE libraries (SWEM1 and SW031) but low tag counts in 10 other embryonic SAGE libraries were collated. Approximately 150 genes displayed a muscle specific enhancement of expression. An important trend emerged when these 150 genes were analysed: a number of known muscle affecting genes were among them (see Figure 11, Chapter III: 3.2.1). While expected, the presence of a number of genes known to be important in the development and functioning of a body wall muscle cell within the 150 muscle enriched genes was a key finding towards validating the idea that novel muscle affecting genes could be identified by filtering through SAGE library data.

Two additional criteria were used to compile a small number of genes for more intensive study: predicted protein domains and human homology. Predicted protein domains (http://wormbase.org) were analysed in order to identify genes with domains found in other body wall muscle genes or domains that could interact with known body wall muscle components. Examples of such genes included those coding for potential actin or myosin binding proteins, or coding for adapter proteins such as LIM domain proteins. Homology with human genes was used as a criterion to choose possible genes for further study. While studying any gene with a function in body wall muscle may be interesting, deciphering the role played by genes with a human homolog may be of more interest to researchers studying human myopathies.
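The filtering steps described above can be sketched in Python as a sequence of simple predicates. This is a minimal illustration under stated assumptions, not the procedure actually used: the 10 tag minimum follows Figure 20, but the text does not give numerical cut-offs for "highly elevated" or "low" tag counts, so the `min_fold` threshold below is hypothetical, as are the `Gene` record, the function names, the toy library names other than SWEM1 and SW031, and the toy data. Raw tag counts are compared directly, without normalizing for library size.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set

MUSCLE_LIBRARIES = ("SWEM1", "SW031")  # muscle specific libraries named in the text


@dataclass
class Gene:
    name: str
    tags: Dict[str, int]                      # SAGE tag count per library
    domains: Set[str] = field(default_factory=set)
    has_human_homolog: bool = False


def is_muscle_enriched(gene: Gene, min_muscle_tags: int = 10,
                       min_fold: float = 5.0) -> bool:
    """True if tags are high in both muscle libraries and low elsewhere.

    min_fold is a hypothetical cut-off; the thesis does not state one.
    """
    muscle = [gene.tags.get(lib, 0) for lib in MUSCLE_LIBRARIES]
    others = [n for lib, n in gene.tags.items() if lib not in MUSCLE_LIBRARIES]
    if max(muscle) < min_muscle_tags:
        return False  # fewer than 10 tags in every muscle library
    highest_other = max(others, default=0)
    # Both muscle libraries must exceed the highest other library by min_fold;
    # max(..., 1) avoids a trivially satisfied comparison against zero.
    return min(muscle) >= min_fold * max(highest_other, 1)


def shortlist(genes: Iterable[Gene], domains_of_interest: Set[str]) -> List[str]:
    """Names of muscle enriched genes with a relevant domain and a human homolog."""
    return [g.name for g in genes
            if is_muscle_enriched(g)
            and g.has_human_homolog
            and g.domains & domains_of_interest]


# Toy records (hypothetical counts, for illustration only).
toy_genes = [
    Gene("toyA", {"SWEM1": 40, "SW031": 30, "gut": 2, "neuron": 1}, {"LIM"}, True),
    Gene("toyB", {"SWEM1": 40, "SW031": 30, "gut": 25}, {"LIM"}, True),
    Gene("toyC", {"SWEM1": 8, "SW031": 6}, {"LIM"}, True),
    Gene("toyD", {"SWEM1": 50, "SW031": 45, "gut": 0}, {"kinase"}, True),
]
print(shortlist(toy_genes, {"LIM"}))
```

Keeping the enrichment test separate from the domain and homology tests mirrors the order in which the criteria were applied, and makes it easy to vary the assumed fold threshold without touching the other filters.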
After completing the process of sorting through all muscle expressed genes based on their SAGE tag distribution within embryonic libraries, protein domain structure, and whether a human homolog existed, C28H8.6 stood out as a primary candidate gene for further study (Figure 20).

1. Eliminate genes with fewer than 10 SAGE tags in a muscle library (~1000 genes)
2. Compile genes with low expression in other tissue specific libraries when compared to muscle libraries (muscle enriched) (150 genes)
3. Isolate genes based on protein domain architecture and human homology (20 genes)
4. Choose the most interesting gene based on all criteria (C28H8.6)

Figure 20. Bioinformatics based screen for muscle affecting genes

4.2.1 LIM proteins in C. elegans

Building a functional sarcomere in C. elegans requires a wide variety of proteins, including those that form the structure of attachment complexes, proteins that crosslink filaments to M-lines and dense bodies, and adapters such as LIM proteins. A number of LIM proteins have been found to be important for the proper functioning of C. elegans muscle. UNC-95, UNC-97, and UNC-98 are all LIM proteins expressed in body wall muscle (Broday et al. 2004; Hobert et al. 1999; Ken Norman (University of Utah, USA), unpublished; Mercer et al. 2003), and mutations in their coding sequence result in structural defects within the sarcomere. ALP-1 is another LIM protein found within dense bodies and attachment plaques, and mutations in alp-1 cause abnormal pharyngeal pumping (McKeown et al. 2006).
Additional LIM domain proteins have also been found to localize to dense bodies or M-lines, including F42H10.3 (Kristina Mercer (Emory University, USA), unpublished), F25H5.1 (Kristina Mercer (Emory University, USA), unpublished), MLP-1 (Mary Beckerle (University of Utah, USA), unpublished), and the zyxin homolog ZYX-1 (David Madsen (University of Utah, USA), unpublished), but little has been established regarding their specific function.

Including the LIM protein we have shown to be expressed in dense bodies (C28H8.6a), 32 LIM proteins are coded for in the C. elegans genome (Table 3). A correlation exists between cellular localization and the distribution of SAGE tags within the various tissue specific libraries. Of the 9 genes with the greatest percentage of SAGE tags located within muscle libraries, only 1 has not been confirmed to be expressed in body wall muscle using GFP promoter fusion or translational fusion analysis. Also interesting is the presence of unc-97, unc-95, unc-98, and alp-1 within the top 9 genes in terms of SAGE tag distribution in muscle libraries. Lastly, of all genes coding for LIM proteins in C. elegans, C28H8.6a has the fourth highest percentage of its SAGE tags within body wall muscle libraries. The combination of known LIM muscle genes displaying high SAGE levels in body wall muscle libraries, and the presence of additional LIM coding genes with similarly high levels, leads to two conclusions: additional LIM proteins may play an important role in body wall muscle, and SAGE data can be used as a powerful tool for identifying potential targets for future study.

Table 3. LIM domain proteins in C.
elegans

Gene        Locus     Expression pattern using GFP translational fusions   % of total SAGE tags distributed in BWM libraries
Y105E8A.6   unc-95    BWM (dense bodies, M-lines, and nucleus)
F20D12.5    exc-9     Excretory Canal, Seam Cells, Neurons                  60
F42H10.3              BWM, Intestine                                        59
C28H8.6a    tag-327   BWM-dense bodies                                      55
T11B7.4     alp-1     BWM (dense bodies, M-lines, and nucleus)              47
F08C6.7     unc-98    BWM (dense bodies, M-lines, and nucleus)              43
T04C9.4     mlp-1     BWM, Intestine, Pharynx, Neurons                      42
F14D12.2    unc-97    BWM (dense bodies, M-lines, and nucleus)              38
F25H5.1     tag-15    BWM                                                   23
F09B9.2     unc-115   Neurons                                               22
Y57G11A.1   tag-273   N/A                                                   20
ZK381.5               Intestine                                             20
F33D11.1              Neurons                                               20
F42G4.3     zyx-1     BWM, Intestine, Pharynx, Neurons                      18
B0496.8     tag-224   Neurons                                               10
K02C4.4     ltd-1     Seam Cells                                            10
F07C6.1     pin-2                                                           7
C40H5.5     ttx-3     Neurons                                               7
F28F5.3     tag-204   BWM, Pharynx                                          6
B0496.7               N/A                                                   5
ZC64.4      lim-4     Neurons                                               3
C04F1.3     lim-7     Neurons
C26C6.6               N/A                                                   0
C34B2.4               N/A                                                   0
F01D4.6     mec-3     Neurons
F46C8.5     ceh-14    Neurons
K03E6.1     lim-6     Neurons
Y1A5A.1               N/A
Y57G11A.3             N/A
Y65B4A.7              N/A                                                   0
ZC247.3     lin-11    Neurons, Vulva
ZK622.4               N/A

Using a GFP translational fusion, we have demonstrated that C28H8.6a is a component of dense bodies in C. elegans body wall muscle. Not all LIM proteins in muscle are restricted to dense bodies, however. UNC-95, UNC-97, UNC-98, and ALP-1 have all been observed in the nucleus early in development (Broday et al. 2004; Hobert et al.
1999; Ken Norman (University of Utah, USA), unpublished; McKeown et al. 2006; Mercer et al. 2003). Additionally, UNC-97, UNC-98, and UNC-95 have been found to localize to M-lines (Hobert et al. 1999; Ken Norman (University of Utah, USA), unpublished; Mercer et al. 2003; Broday et al. 2004). C28H8.6a::GFP is not observed in either the nucleus or M-lines, setting it apart from other body wall muscle LIM proteins. Interestingly, C28H8.6 has been shown to interact with ALP-1 in a large scale yeast two-hybrid screen (Li et al. 2004), and neither has been observed in M-lines. As indicated by its name, α-actinin associated LIM protein, ALP-1 associates with α-actinin (ATN-1), a dense body protein. C28H8.6a also co-localizes in vivo with ATN-1 (see Figure 18, Chapter III: 3.2.7). What makes this significant is that ATN-1 is not necessary for the initial attachment of actin filaments to the dense body, and atn-1 mutants are also viable past embryogenesis (Ono et al. 2006). C28H8.6a mutants are able to progress to the L1 stage but not further, indicating that it is not necessary for preliminary sarcomere assembly. Because ALP-1, C28H8.6a and ATN-1 are not needed for initial sarcomere assembly, and taking interaction data into account, it is possible that C28H8.6a forms a complex with ATN-1 and ALP-1 that is involved in a process not required during embryogenesis. One possible function of the proposed complex would be the promotion of further growth of dense bodies and subsequent additional actin filament attachment. The dynamics of how new dense bodies form to allow for elongation of a body wall muscle cell are not well understood, but perhaps C28H8.6a is responsible for mediating the process.
Whether C28H8.6a interacts with ATN-1 and/or ALP-1 is unclear without further study, but the L1 arrest phenotype displayed by C28H8.6a mutants demonstrates that part of its function is important for progression past the first larval stage of development. A great deal of further study is required to decipher the exact role C28H8.6a plays, and what proteins it interacts with in order to carry out its function within the worm.

4.2.2 Isoforms of C28H8.6

C28H8.6 has two predicted isoforms that contain LIM domains (http://wormbase.org). Two LIM domains are present in both the 'a' and 'b' isoforms and two are exclusive to C28H8.6a (Figure 21). By constructing isoform specific RNAi feeding clones, each isoform was knocked down independently and the phenotype of affected worms observed. A feeding clone targeting C28H8.6b did not have an observable effect, but when C28H8.6a was knocked down the worms arrested at the first larval stage of development. Since the isoform specific feeding construct targeting C28H8.6a gave such a strong phenotype and was almost 100% penetrant, the region of sequence within that feeding clone must be a part of the transcript necessary for progression past the first larval stage of development. Other than the obvious assumption that C28H8.6a is necessary for progression past L1, there are additional explanations. For instance, it could be argued that the gene prediction for C28H8.6 is incorrect. Wormbase (http://wormbase.org) uses expressed sequence tags (ESTs) from cDNA clones to confirm gene isoforms. Only portions of each C28H8.6 isoform are confirmed, which leaves the possibility that only one C28H8.6 isoform exists, in which exons of both predicted isoforms are incorporated. However, there are some data supporting the currently predicted C28H8.6a isoform as correct. Sequencing data compiled during ORFeome construction (Reboul et al.
2003) is not taken into account in WormBase gene predictions, but it supports the C28H8.6a gene model. When each ORF clone was constructed, primers were used to amplify predicted isoforms using cDNA as a template before insertion into the pDONR Gateway vector (Reboul et al. 2003). C28H8.6a amplified correctly during ORFeome construction, thus creating the ORF clone that was used in production of the C28H8.6a::GFP translational fusion used to demonstrate C28H8.6a localization to dense bodies. Proper amplification confirms that the primer sequences are a part of the coding sequence, but if regions upstream or downstream of the primers exist, they would not be amplified and thus not sequenced. For example, if the predicted 5' exon of C28H8.6b was actually a part of C28H8.6a, ORFeome sequencing primers would not identify it as being part of the isoform because it would lie outside the amplified sequence. While we are fairly confident that the prediction for C28H8.6a is correct because ORFeome sequencing data and isoform specific RNAi both support the model, RT-PCR and/or isolation and sequencing of cDNA is necessary for final confirmation.

[Figure 21. Panels: "C28H8.6 Gene Isoforms" (C28H8.6b, 4472 bp; C28H8.6a, 3784 bp; tracks for ESTs, ORFeome confirmation, SAGE tags, pDM866 RNAi, pDM867 RNAi, JA:C28H8.6 RNAi and the ok1483 deletion) and "C28H8.6 Protein Model" (LIM domain layout of C28H8.6b and C28H8.6a, 256 aa).]

Figure 21. Isoforms of C28H8.6. Two isoforms are predicted for C28H8.6, an 'a' and a 'b' isoform, each of which was targeted using the respective RNAi constructs pDM#867 and pDM#866. The RNAi clone JA:C28H8.6 from the Geneservice RNAi library and the deletion ok1483 each disrupt both predicted isoforms. Each splice isoform has been partially confirmed through combinations of EST and ORFeome sequence data, and SAGE tags with multiple hits show that at least one of the isoforms is substantially expressed.
Two LIM domain proteins are predicted to be encoded by C28H8.6, with 2 LIM domains in C28H8.6b and 4 LIM domains in C28H8.6a.

4.2.3 C28H8.6a is a paxillin homolog

Human paxillin is made up of 591 amino acids and contains both a paxillin domain and 4 LIM domains. Paxillin has been shown to be an integral part of focal adhesions in vertebrates (Turner et al. 1990), but while many focal adhesion proteins have a C. elegans homolog, a match for paxillin in the worm has not yet been clearly identified (see Figure 4, Chapter I: 1.2). UNC-95 has been cited as being the most probable paxillin homolog (Broday et al. 2004), and some similarity is seen when it is aligned with paxillin using ClustalW (http://.ebi.ac.uk/clustalw). However, C28H8.6a appears to be a more probable candidate as a paxillin homolog. C28H8.6a shares significant sequence identity with the C-terminal half of human paxillin, but it is only 256 amino acids in length and lacks the N-terminal half of paxillin. The N-terminal portion of paxillin has been shown to bind focal adhesion kinase (FAK) and vinculin, but it is the third LIM domain in paxillin that is necessary for localization to focal adhesions (Brown and Turner 1996). When aligned with human paxillin using ClustalW, the 4 LIM domains in C28H8.6a share either identical amino acids or conserved substitutions at almost every position within the 4 LIM domains in paxillin (Figure 22).

The high level of sequence identity, and the absence of a full length version of paxillin in the worm, can be explained in a number of ways. It is possible that in C. elegans only the 4 LIM domains are required for paxillin to carry out its function. Supporting that idea is sequence data showing very similar 4 LIM domain proteins in a number of other organisms including the nematode C. briggsae, the pufferfish Fugu rubripes, and the elephant Loxodonta africana (data from http://ensembl.org).
It is also possible that C28H8.6a complexes with another protein containing similar properties to the N-terminal portion of human paxillin and together they carry out a single function. In doing so, together they would effectively act as one paxillin protein. Such a partner protein is not immediately evident based on alignments between the N-terminal portion of paxillin and C. elegans proteins, but that does not rule out the possibility that a protein with divergent sequence but a similar function to the N-terminal half of paxillin exists.

Alternatively, perhaps a bona fide paxillin is not required in C. elegans at all. C28H8.6a may carry out a task within body wall muscle that is completely different from that of paxillin. In fact, two pieces of data provide support for a function of C28H8.6a different from that of paxillin. First, yeast two-hybrid data indicate that C28H8.6a interacts with ALP-1 (Li et al. 2004). As the name α-actinin associated LIM protein implies, ALP-1 is associated with ATN-1 at the membrane-distal portion of the dense body where ATN-1 is prevalent, rather than the more membrane-proximal orientation of paxillin in focal adhesions (Brown and Turner 1996; Brown and Turner 2004). If C28H8.6a does in fact bind to ALP-1, then it may not be clustered around the base of dense bodies where paxillin would be expected, but rather spread throughout the dense body. The second interesting piece of data was produced through mutational analysis of human paxillin. Brown and Turner (1996) demonstrated that the vinculin binding domain of paxillin is in the N-terminal half of the protein, while the focal adhesion targeting sequence is within the third LIM domain. Since C28H8.6a contains only the 4 LIM domains that align with those in paxillin, it contains the dense body targeting sequence, but not an identifiable vinculin binding domain.
Many possibilities exist for the exact function of C28H8.6a, and it could be that C28H8.6a is not a paxillin ortholog even if it is a homolog.

[Figure 22: ClustalW multiple sequence alignment of human paxillin, C28H8.6 and UNC-95.]

Figure 22. Protein alignment of C28H8.6a and UNC-95 with human paxillin. C28H8.6a and UNC-95 were aligned with human paxillin using ClustalW (http://ebi.ac.uk/clustalw). Amino acid matches are marked in red, while similar and conserved substitutions are coloured yellow.
A small degree of similarity is noticeable between UNC-95 and paxillin, but they do not align as well as C28H8.6a and paxillin. The 256 amino acids in C28H8.6a bear a striking similarity to the C-terminal amino acid sequence of human paxillin, with 209 of the 256 amino acids in C28H8.6a having an exact match or conserved substitution when aligned with paxillin.

4.2.4 Myofilament organization in C28H8.6 deficient animals

In our RNAi screen for muscle affecting genes, we did not identify C28H8.6 as being necessary for proper myofilament organization. In a preliminary pass through each of the 3301 genes tested, C28H8.6 was screened as muscle affecting, but when rescreened, the percentage of worms displaying defects in the myofilament lattice dropped to 44%. Upon closer examination of the factors involved, however, the reason for this result is very clear. RNAi does not always produce a completely penetrant effect, and in the case of C28H8.6, RNAi with full penetrance would result in every animal arresting at the first larval stage. Instead, in our study enough worms progressed to the L4 stage that we were able to score a total of 20 animals in the preliminary screen, and 32 in our rescreen. An L1 arrest phenotype was noted for both screens, but penetrance evidently was not 100%. Complicating matters was the need to screen animals at the L4 stage to be consistent and to avoid the age related breakdown of myofilament organization in RW1596 worms. Any animals that were fed C28H8.6 dsRNA and were still able to progress to the fourth larval stage could not have had C28H8.6 completely knocked down. Therefore, C28H8.6 was not screened effectively in our RNAi screen due to the experimental design employed. In order to determine whether C28H8.6 is necessary for proper myofilament organization, we used RNAi to knock down C28H8.6 gene expression and then observed worms at the first larval stage.
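The scoring arithmetic behind the rescreen result can be illustrated with a short Python sketch. It rests on two assumptions that are not stated in this passage: that a gene is called muscle affecting when at least 50% of scored animals show myofilament defects, and that 14 of the 32 rescreened animals were defective (a hypothetical count chosen only because it reproduces the reported 44%).

```python
from dataclasses import dataclass

# Assumed cutoff: fraction of scored animals with myofilament defects
# needed to call a gene "muscle affecting". Not taken from the text above.
PENETRANCE_CUTOFF = 0.50


@dataclass
class ScreenResult:
    gene: str
    scored: int      # animals that reached the scoring stage and were examined
    defective: int   # animals with abnormal myofilament organization

    @property
    def penetrance(self) -> float:
        """Fraction of scored animals showing the defect."""
        if self.scored == 0:
            raise ValueError("no animals were scored")
        return self.defective / self.scored

    def is_muscle_affecting(self, cutoff: float = PENETRANCE_CUTOFF) -> bool:
        """Apply the (assumed) penetrance cutoff."""
        return self.penetrance >= cutoff


# Hypothetical counts: 14 defective animals out of the 32 scored at L4.
rescreen = ScreenResult(gene="C28H8.6", scored=32, defective=14)
print(f"{rescreen.gene}: {rescreen.penetrance:.0%}, "
      f"muscle affecting: {rescreen.is_muscle_affecting()}")
```

Only animals that escaped the L1 arrest could be scored at L4, so the denominator in this calculation is a biased sample of incompletely knocked down animals, which is the limitation the text describes.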
At the L1 stage, body wall muscle cells are much smaller than those in an adult, making detailed subcellular observation difficult. It was still possible, however, to observe the difference between RW1596 animals fed C28H8.6 dsRNA and those fed OP50 E. coli. When C28H8.6 was knocked down, 66% (n=44) of RW1596 worms had abnormal myofilament organization. In contrast, virtually all RW1596 animals fed OP50 had normal myofilament organization. If L1 worms had been screened for C28H8.6 in our RNAi screen, C28H8.6 almost certainly would have been identified as muscle affecting. This also indicates that there are likely a number of muscle affecting genes with a penetrance of myofilament defects in our RNAi screen below 50% due to similar circumstances. In a large scale RNAi screen, consistent screening methods are imperative, but in this case the L1 arrest of C28H8.6 deficient animals and the lack of full RNAi penetrance resulted in a muscle affecting gene not being properly identified.

CHAPTER FIVE - CONCLUSION

By combining an RNAi screen with the use of bioinformatics data, this work has contributed to the identification of 103 novel muscle affecting genes, as well as the identification and further characterization of the LIM protein C28H8.6. The RNAi screen we carried out is a good example of how the power of RNAi can be exploited to sensitively identify genes in a particular process; in this case the proper functioning of a muscle cell. The use of SAGE data to identify C28H8.6a as a body wall muscle gene is an excellent example of how to take advantage of the multitude of bioinformatics data available. Lastly, the strategy employed to further characterize C28H8.6a is a model for the type of experimentation that must take place to learn more about the genes we identified as having an effect on body wall muscle in C.
elegans.

After screening 3301 genes using RNAi, and 4042 genes with bioinformatics data, further study must now be undertaken on some if not all of the genes found to be of interest. Using RNAi, 119 genes were found to be muscle affecting in C. elegans. Of these, 16 are known muscle genes, leaving 103 potential future study targets. Additionally, a bioinformatics based screen has identified interesting genes with very high muscle specific expression. Intensive characterization of every muscle affecting gene may be overly ambitious, but by combining data from both screens, a smaller number of high confidence genes can be compiled. There are several such genes which were identified in our RNAi screen, and also have muscle specific enrichment of expression, code for potential structural or adapter proteins, and have a human homolog. By focusing on a small number of genes, a more exhaustive analysis can be carried out.

Determining where a protein localizes is an appropriate first step towards characterization of a gene of interest. By taking advantage of the ORFeome, muscle specific GFP translational fusions can be produced in a high throughput fashion (Barbara Meissner (University of British Columbia, Canada), unpublished). In fact, GFP translational fusions have already been constructed for a number of genes that were identified as muscle affecting in our RNAi screen. Some protein products are localized to dense bodies, while others are found in muscle cell nuclei (Barbara Meissner (University of British Columbia, Canada), unpublished). This is very exciting, and further demonstrates that these genes have a role within body wall muscle. Further experimentation should help us understand what these genes are responsible for in body wall muscle, complementing our current knowledge of where they are localized.
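The prioritization strategy described earlier in this chapter (keeping genes that were hits in the RNAi screen and that also show muscle enriched expression, encode a potential structural or adapter protein, and have a human homolog) amounts to a simple filter over the combined screen data. The Python sketch below illustrates the idea only; the record fields, the enrichment cutoff and the example gene names are hypothetical and are not taken from the thesis data.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    gene: str                    # placeholder names, not real hits
    rnai_hit: bool               # scored as muscle affecting in the RNAi screen
    muscle_enrichment: float     # hypothetical fold enrichment from expression data
    structural_or_adapter: bool  # predicted structural or adapter protein
    human_homolog: bool          # has a human homolog


# Assumed cutoff for "muscle specific enrichment"; chosen for illustration.
MIN_ENRICHMENT = 2.0


def high_confidence(candidates: Iterable[Candidate],
                    min_enrichment: float = MIN_ENRICHMENT) -> List[str]:
    """Return genes that satisfy every criterion, in input order."""
    return [
        c.gene
        for c in candidates
        if c.rnai_hit
        and c.muscle_enrichment >= min_enrichment
        and c.structural_or_adapter
        and c.human_homolog
    ]


candidates = [
    Candidate("geneA", True, 5.1, True, True),    # passes all criteria
    Candidate("geneB", True, 1.2, True, True),    # RNAi hit, not enriched
    Candidate("geneC", False, 8.4, True, True),   # enriched, not an RNAi hit
    Candidate("geneD", True, 3.0, False, True),   # no structural/adapter role
]

shortlist = high_confidence(candidates)
print(shortlist)
```

Requiring all four criteria trades sensitivity for confidence: a gene such as C28H8.6, which was missed by the RNAi screen for technical reasons, would be excluded by this filter and would need to be recovered from the expression data alone.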
This study of C28H8.6a is an example of the type of further detailed analysis that is required to learn more about the interesting genes identified in our RNAi screen. The work on C28H8.6a is already underway, but more must be accomplished. This work has shown that C28H8.6a is localized to dense bodies, but this expression was under the control of a muscle specific promoter. It is possible that C28H8.6a is expressed in other tissues, so a full length GFP translational fusion that incorporates the endogenous promoter of C28H8.6 must be constructed. The usefulness of such a C28H8.6a translational fusion would extend beyond its ability to determine protein localization. Mutant rescue using a C28H8.6a::GFP construct would show that C28H8.6a is necessary in the worm while C28H8.6b is not. One step further would be attempting to rescue the mutant phenotype with the muscle promoter driven C28H8.6a construct that was used to demonstrate dense body localization of C28H8.6a. If this muscle specific C28H8.6a translational fusion is able to rescue the larval arrest phenotype of ok1483 animals, then two conclusions can be drawn: not only is C28H8.6a the only C28H8.6 isoform required for viability in the worm, but it is also required only in body wall muscle cells.

It will also be interesting to see what proteins C28H8.6 is able to bind to. Yeast two-hybrid is a straightforward and quick technique to look for protein interactions. Alternatively, antibodies raised against GFP can be used to pull down C28H8.6::GFP fusion protein from protein preparations in order to identify proteins bound to the complex. Identifying binding partners will be very important in order to determine how and where C28H8.6a fits into the structure of dense bodies. Lastly, antibody staining for known body wall muscle components in a C28H8.6 mutant background will be carried out.
Antibody staining for various muscle proteins in mutant backgrounds of other muscle proteins provided us with much of what we know about the assembly of the sarcomere. C28H8.6a is not required for the early development of the body wall muscle, but if any dense body structural proteins are mislocalized in a C28H8.6 mutant background it will provide insight into the function of C28H8.6a. Carrying out a similar level of detailed analysis for genes identified in our RNAi screen will provide insight into their specific functions, and help fill in pieces of the sarcomere assembly puzzle.

References

Ahringer, J. 1997. Turn to the worm! Curr Opin Genet Dev. 7:410-5.
Altun, Z.F., and D.H. Hall. 2005. Handbook of C. elegans Anatomy. Wormatlas. http://www.wormatlas.org.
Barstead, R.J., and R.H. Waterston. 1991. Vinculin is essential for muscle function in the nematode. The Journal of cell biology. 114:715-724.
Bartnik, E. et al. 1986. Intermediate filaments in muscle and epithelial cells of nematodes. J Cell Biol. 102:2033-41.
Benian, G.M. et al. 1996. The Caenorhabditis elegans gene unc-89, required for muscle M-line assembly, encodes a giant modular protein composed of Ig and signal transduction domains. The Journal of cell biology. 132:835-848.
Blaxter, M. 1998. Caenorhabditis elegans is a nematode. Science. 282:2041-2046.
Brenner, S. 1973. The genetics of behaviour. British medical bulletin. 29:269-271.
Brenner, S. 1974. The genetics of Caenorhabditis elegans. Genetics. 77:71-94.
Broday, L. et al. 2004. The LIM domain protein UNC-95 is required for the assembly of muscle attachment structures and is regulated by the RING finger protein RNF-5 in C. elegans. The Journal of cell biology. 165:857-867.
Brown, M.C., and C.E. Turner. 1996. Identification of LIM3 as the principal determinant of paxillin focal adhesion localization and characterization of a novel motif on paxillin directing vinculin and focal adhesion kinase binding.
The Journal of cell biology. 135:1109-1123.
Brown, M.C., and C.E. Turner. 2004. Paxillin: adapting to change. Physiol Rev. 84:1315-39.
Burr, A.H., and C. Gans. 1998. Mechanical significance of obliquely striated architecture in nematode muscle. Biol Bull. 194:1-6.
Burridge, K. et al. 1988. Focal adhesions: transmembrane junctions between the extracellular matrix and the cytoskeleton. Annual Review of Cell Biology. 4:487-525.
C. elegans Sequencing Consortium. 1998. Genome Sequence of the Nematode C. elegans: A Platform for Investigating Biology. Science. 282:2012-8.
Chalfie, M. et al. 1994. Green fluorescent protein as a marker for gene expression. Science. 263:802-805.
Ding, M. et al. 2004. The cytoskeleton and epidermal morphogenesis in C. elegans. Exp Cell Res. 301:84-90.
Dixon, S.J., and P.J. Roy. 2005. Muscle arm development in Caenorhabditis elegans. Development. 132:3079-3092.
Epstein, H.F. et al. 1993. Myosin and paramyosin of Caenorhabditis elegans embryos assemble into nascent structures distinct from thick filaments and multi-filament assemblages. The Journal of cell biology. 122:845-858.
Epstein, H.F. et al. 1974. A mutant affecting the heavy chain of myosin in Caenorhabditis elegans. Journal of Molecular Biology. 90:291-300.
Francis, G.R., and R.H. Waterston. 1985. Muscle organization in Caenorhabditis elegans: localization of proteins implicated in thin filament attachment and I-band organization. J Cell Biol. 101:1532-49.
Francis, R., and R.H. Waterston. 1991. Muscle cell attachment in Caenorhabditis elegans. The Journal of cell biology. 114:465-479.
Goh, P.Y., and T. Bogaert. 1991. Positioning and maintenance of embryonic body wall muscle attachments in C. elegans requires the mup-1 gene. Development. 111:667-81.
Gontier, Y. et al. 2005. The Z-disc proteins myotilin and FATZ-1 interact with each other and are connected to the sarcolemma via muscle-specific filamins. J Cell Sci. 118:3739-49.
Granato, M. et al. 1994.
pha-1, a selectable marker for gene transfer in C. elegans. Nucleic Acids Res. 22:1762-3.
Hedgecock, E.M. et al. 1987. Genetics of cell and axon migrations in Caenorhabditis elegans. Development. 100:365-82.
Herndon, L.A. et al. 2002. Stochastic and genetic factors influence tissue-specific decline in ageing C. elegans. Nature. 419:808-14.
Hikita, T. et al. 2005. Identification of a novel Cdc42 GEF that is localized to the PAT-3-mediated adhesive structure. Biochem Biophys Res Commun. 335:139-45.
Hobert, O. et al. 1999. A conserved LIM protein that affects muscular adherens junction integrity and mechanosensory function in Caenorhabditis elegans. The Journal of cell biology. 144:45-57.
Hoffman, E.P. et al. 1987. Dystrophin - the Protein Product of the Duchenne Muscular-Dystrophy Locus. Cell. 51:919-928.
Hresko, M.C. et al. 1999. Myotactin, a novel hypodermal protein involved in muscle-cell adhesion in Caenorhabditis elegans. The Journal of cell biology. 146:659-672.
Hresko, M.C. et al. 1994. Assembly of body wall muscle and muscle cell attachment structures in Caenorhabditis elegans. J Cell Biol. 124:491-506.
Kadrmas, J.L., and M.C. Beckerle. 2004. The LIM domain: from the cytoskeleton to the nucleus. Nat Rev Mol Cell Biol. 5:920-31.
Kamath, R.S. et al. 2003. Systematic functional analysis of the Caenorhabditis elegans genome using RNAi. Nature. 421:231-7.
Lamont, P.J. et al. 2006. Laing early onset distal myopathy: slow myosin defect with variable abnormalities on muscle biopsy. J Neurol Neurosurg Psychiatry. 77:208-15.
Lehner, B. et al. 2006a. Loss of LIN-35, the Caenorhabditis elegans ortholog of the tumor suppressor p105Rb, results in enhanced RNA interference. Genome Biol. 7:R4.
Lehner, B. et al. 2006b. Systematic mapping of genetic interactions in Caenorhabditis elegans identifies common modifiers of diverse signaling pathways. Nature genetics. 38:896-903.
Li, S. et al. 2004. A map of the interactome network of the metazoan C. elegans.
Science. 303:540-3.
Lin, X. et al. 2003. C. elegans PAT-6/actopaxin plays a critical role in the assembly of integrin adhesion complexes in vivo. Current biology : CB. 13:922-932.
Mackenzie, J.M., Jr., and H.F. Epstein. 1980. Paramyosin is necessary for determination of nematode thick filament length in vivo. Cell. 22:747-55.
Mackenzie, J.M., Jr. et al. 1978. Muscle development in Caenorhabditis elegans: mutants exhibiting retarded sarcomere construction. Cell. 15:751-62.
Mackinnon, A.C. et al. 2002. C. elegans PAT-4/ILK functions as an adaptor protein within integrin adhesion complexes. Current biology : CB. 12:787-797.
MacLeod, A.R. et al. 1977a. An internal deletion mutant of a myosin heavy chain in Caenorhabditis elegans. Proc Natl Acad Sci U S A. 74:5336-40.
MacLeod, A.R. et al. 1977b. Identification of the structural gene for a myosin heavy-chain in Caenorhabditis elegans. J Mol Biol. 114:133-40.
McKay, S.J. et al. 2003. Gene expression profiling of cells, tissues, and developmental stages of the nematode C. elegans. Cold Spring Harbor symposia on quantitative biology. 68:159-169.
McKeown, C.R. et al. 2006. Molecular characterization of the Caenorhabditis elegans ALP/Enigma gene alp-1. Dev Dyn. 235:530-8.
Mello, C.C. et al. 1991. Efficient gene transfer in C. elegans: extrachromosomal maintenance and integration of transforming sequences. Embo J. 10:3959-70.
Mercer, K.B. et al. 2003. Caenorhabditis elegans UNC-98, a C2H2 Zn finger protein, is a novel partner of UNC-97/PINCH in muscle adhesion complexes. Molecular biology of the cell. 14:2492-2507.
Meredith, C. et al. 2004. Mutations in the slow skeletal muscle fiber myosin heavy chain gene (MYH7) cause Laing early-onset distal myopathy (MPD1). Am J Hum Genet. 75:703-8.
Moerman, D.G., and A. Fire. 1997. Muscle: Structure, function and development. In C. elegans II. D.L. Riddle, T. Blumenthal, B.J. Meyer, and J.R. Priess, editors.
Cold Spring Harbor Press, New York. 417-470.
Moerman, D.G., and B.D. Williams. 2006. Sarcomere assembly in C. elegans muscle. In Wormbook. The C. elegans Research Community, editor.
Mullen, G.P. et al. 1999. Complex patterns of alternative splicing mediate the spatial and temporal distribution of perlecan/UNC-52 in Caenorhabditis elegans. Molecular biology of the cell. 10:3205-3221.
Nowak, K. et al. 2005. Muscular dystrophies related to the cytoskeleton/nuclear envelope. Novartis Foundation symposium. 264:98-111; discussion 112-7, 227-30.
Ono, K. et al. 2006. Caenorhabditis elegans kettin, a large immunoglobulin-like repeat protein, binds to filamentous actin and provides mechanical stability to the contractile apparatuses in body wall muscle. Mol Biol Cell. 17:2722-34.
Pleasance, E.D. et al. 2003. Assessment of SAGE in transcript identification. Genome Res. 13:1203-15.
Reboul, J. et al. 2003. C. elegans ORFeome version 1.1: experimental verification of the genome annotation and resource for proteome-scale protein expression. Nature genetics. 34:35-41.
Ridley, A.J. et al. 2003. Cell migration: integrating signals from front to back. Science. 302:1704-9.
Rogalski, T.M. et al. 2003. DIM-1, a novel immunoglobulin superfamily protein in Caenorhabditis elegans, is necessary for maintaining bodywall muscle integrity. Genetics. 163:905-15.
Rogalski, T.M. et al. 2000. The UNC-112 gene in Caenorhabditis elegans encodes a novel component of cell-matrix adhesion structures required for integrin localization in the muscle cell membrane. The Journal of cell biology. 150:253-264.
Rogalski, T.M. et al. 1993. Products of the unc-52 gene in Caenorhabditis elegans are homologous to the core protein of the mammalian basement membrane heparan sulfate proteoglycan. Genes & development. 7:1471-1484.
Sadler, I. et al. 1992. Zyxin and cCRP: two interactive LIM domain proteins associated with the cytoskeleton. J Cell Biol. 119:1573-87.
Simmer, F. et al. 2003. Genome-wide RNAi of C.
elegans using the hypersensitive rrf-3 strain reveals novel gene functions. PLoS Biol. 1:E12.
Sonnichsen, B. et al. 2005. Full-genome RNAi profiling of early embryogenesis in Caenorhabditis elegans. Nature. 434:462-9.
Sparrow, J.C. et al. 2003. Muscle disease caused by mutations in the skeletal muscle alpha-actin gene (ACTA1). Neuromuscul Disord. 13:519-31.
Sulston, J.E., and H.R. Horvitz. 1977. Post-embryonic cell lineages of the nematode, Caenorhabditis elegans. Dev Biol. 56:110-56.
Sulston, J.E. et al. 1983. The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev Biol. 100:64-119.
Timmons, L., and A. Fire. 1998. Specific interference by ingested dsRNA. Nature. 395:854.
Timmons, L. et al. 2001. Ingestion of bacterially expressed dsRNAs can produce specific and potent genetic interference in Caenorhabditis elegans. Gene. 263:103-12.
Tu, Y. et al. 1999. The LIM-only protein PINCH directly interacts with integrin-linked kinase and is recruited to integrin-rich sites in spreading cells. Mol Cell Biol. 19:2425-34.
Turner, C.E. et al. 1990. Paxillin: a new vinculin-binding protein present in focal adhesions. J Cell Biol. 111:1059-68.
Velculescu, V.E. et al. 1995. Serial analysis of gene expression. Science. 270:484-487.
Walhout, A.J. et al. 2000. GATEWAY recombinational cloning: application to the cloning of large numbers of open reading frames or ORFeomes. Methods in enzymology. 328:575-592.
Waterston, R.H. 1988. Muscle. In The nematode Caenorhabditis elegans. W.B. Wood, editor. Cold Spring Harbor Press, New York. 281-335.
Waterston, R.H. et al. 1980. Mutants with altered muscle structure of Caenorhabditis elegans. Developmental biology. 77:271-302.
White, J.G. et al. 1986. The Structure of the Nervous System of the Nematode Caenorhabditis elegans. Philosophical Transactions of the Royal Society of London. 314:1-340.
Williams, B.D., and R.H. Waterston. 1994.
Genes critical for muscle development and function in Caenorhabditis elegans identified through lethal mutations. The Journal of cell biology. 124:475-490.
Zengel, J.M., and H.F. Epstein. 1980. Identification of genetic elements associated with muscle structure in the nematode Caenorhabditis elegans. Cell motility. 1:73-97.
