CHARACTERIZATION OF THE CHEMOSENSORY PROTEIN GENE FAMILY FROM THE EASTERN SPRUCE BUDWORM, CHORISTONEURA FUMIFERANA

by Kevin W. Wanner
B.Sc. (Honours Biology), University of Victoria, 1990
Master of Pest Management, Simon Fraser University, 1994

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Doctor of Philosophy in THE FACULTY OF GRADUATE STUDIES (Plant Science)

THE UNIVERSITY OF BRITISH COLUMBIA
December 2004
© Kevin W. Wanner, 2004

Abstract

The peripheral sensory system of insects is the first to detect chemical stimuli; it is composed of specialized sensory neurons located within hollow, hair-like sensilla. Chemosensory proteins (CSPs) and odorant binding proteins (OBPs) are small, soluble proteins that transport hydrophobic stimuli across the hydrophilic lymph that separates the sensory receptors from the external environment. Incidental results from various studies indicate that most CSPs, and some OBPs, are expressed broadly in many different tissues, raising the question 'what is their non-sensory function?' In this thesis I explored the non-sensory function of CSPs using three different scopes of investigation: 1) an in silico analysis of all known CSP sequences, 2) a characterization of the expression pattern of four CSP genes from a representative lepidopteran species, and 3) a functional characterization of an individual CSP. I identified 15 new CSP sequences; four from cDNA clones described herein and 11 from sequence databases. Several protein similarity classes, representing CSPs from six insect orders, were identified, and each was characterized by highly conserved sequence motifs, including (A) N-terminal YTTKYDN(V/I)(WD)(L/V)DEIL, (B) central DGKELKXX(I/L)PDAL, and, (C) C-terminal KYDP. Three similarity classes were identified that diverged from these conserved motifs, presumably because they are under new functional and selective pressures.
A detailed analysis of the expression pattern of four CSP genes from the Eastern spruce budworm, Choristoneura fumiferana, revealed that one (characterized by the retention of the conserved motifs) was expressed in the adult stage, while two that diverged from the conserved motifs were expressed in the immature stages (larvae and pupae). Furthermore, two of the divergent CSP genes were up-regulated during a natural molt, or during an ecdysteroid agonist induced molt. The ligand binding specificity of CfumAY624538, a divergent CSP, was characterized using the fluorescent reporter 1-NPN. Some CSPs bind to medium chain-length fatty acids; this was not the case for CfumAY624538. Rather, a short chain-length alcohol was the only ligand tested that displaced 1-NPN in competition. Collectively, my results indicate that divergent CSPs from the Eastern spruce budworm function in development, including larval molting.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
CHAPTER ONE: Literature review
1.1 Introduction
1.2 Insect Sensilla
1.2.1 General overview
1.2.2 Structure
1.2.3 Function
1.3 Insect odorant binding proteins and chemosensory proteins
1.3.1 Overview
1.3.2 Structure
1.3.3 Phylogenetics
1.3.4 Expression
1.3.5 Ligand binding
1.4 Summary and objectives
1.5 References
CHAPTER TWO: Analysis of the insect chemosensory protein gene family
2.1 Introduction
2.2 Results
2.2.1 Cloning CSP cDNAs
2.2.2 Identification of CSPs from sequence databases
2.2.3 Genomic organization and intron structure
2.2.4 Protein similarity groups
2.3 Discussion
2.4 Materials and methods
2.4.1 Nomenclature
2.4.2 Molecular cloning of CSP cDNAs
2.4.3 Identification of chemosensory proteins from sequence databases
2.4.4 Gene predictions and conceptual translation
2.4.5 Protein similarity groups
2.5 References
CHAPTER THREE: Developmental expression patterns of four chemosensory protein genes from the Eastern spruce budworm, Choristoneura fumiferana
3.1 Introduction
3.2 Results
3.2.1 Both conserved and divergent members of the CSP gene family are found in the Eastern spruce budworm
3.2.2 CSP genes from the ESB display distinct developmental expression patterns
3.2.3 CSP transcripts can be positively or negatively regulated by the ecdysteroid cascade
3.3 Discussion
3.4 Materials and methods
3.4.1 Insect rearing, staging, and collections
3.4.2 Induction of a premature molt
3.4.3 Rapid amplification of cDNA ends (RACE)
3.4.4 Northern blot analysis
3.5 References
CHAPTER FOUR: Functional characterization of a divergent chemosensory protein from the Eastern spruce budworm, Choristoneura fumiferana
4.1 Introduction
4.2 Results
4.2.1 CfumAY426538 protein expression indicates a function in preparation for pupation
4.2.2 CfumAY624538 binds to the fluorescent reporter 1-NPN, but not 1-AMA
4.2.3 Ligand competition assays
4.2.4 RNAi is effective in situ, but does not reduce transcript levels in vivo
4.3 Discussion
4.4 Materials and Methods
4.4.1 CfumAY624538 expression and purification
4.4.2 Polyclonal antibody preparation
4.4.2 Western blot analysis
4.4.3 Fluorescence-based binding assays
4.4.4 RNA interference
4.5 References
CHAPTER FIVE: Comparative characterization of a non sensory odorant binding protein from the Eastern spruce budworm, Choristoneura fumiferana
5.1 Introduction
5.2 Results
5.2.1 An OBP from Eastern spruce budworm larvae with only four conserved Cys residues
5.2.2 CfumSericotropin-like transcripts are highest during metamorphosis
5.2.3 CfumSericotropin-like protein also is most abundant during metamorphosis
5.3 Discussion
5.4 Materials and methods
5.4.1 Insect rearing, sampling and staging
5.4.2 RNA isolation and Northern blotting
5.4.3 CfumSericotropin-like expression and purification
5.4.4 Polyclonal antibody preparation
5.4.5 Western blot analysis
5.5 References
CHAPTER SIX: Summary and future perspectives

List of Tables

Table 1.1 List of OBPs and CSPs whose structures have been solved by X-ray crystallography and/or NMR spectroscopy
Table 1.2 Summary of ligand binding assays using chemosensory proteins and fluorescent probes
Table 2.1 Summary list of all CSP sequences identified from the Insecta
Table 3.1 Developmental expression of four CSP genes from the Eastern spruce budworm

List of Figures

Figure 1.1 General representation of an insect sensillum
Figure 1.2 General schematic representation of the phylogenetic relationship between different classes of OBPs
Figure 2.1 Nucleic acid sequence and conceptual translation of cDNA isolated from a C. fumiferana larval cDNA library
Figure 2.2 Alignment of the CSP family
Figure 2.3 Genomic organization and intron structure of CSP genes
Figure 2.4 Neighbor joining distance phenogram of all known CSP protein sequences
Figure 2.5 Evolutionary tree of the Insecta
Figure 2.6 Ribbon drawings of the NMR and crystalline structures of MbraCSPA6, and of MsexSAP1 as determined by homology modeling
Figure 3.1 A) Nucleotide sequence of two cDNAs cloned from the Eastern spruce budworm, and the conceptually translated amino acid sequence
Figure 3.1 B) Amino acid alignment of select CSPs representing six different insect orders
Figure 3.2 Neighbor joining tree of all known CSP sequences from the Lepidoptera
Figure 3.3 Expression of four CSP genes, determined by Northern blot analysis, at different developmental stages of the Eastern spruce budworm
Figure 3.4 Expression of three CSP genes within the head, thorax and abdomen segments of 6th instar larvae (48-60 h old), and adults (24-72 h old), of the Eastern spruce budworm
Figure 3.5 CSP gene expression, as determined by Northern blotting, during the development of the Eastern spruce budworm from early 5th instar larvae through to pupation
Figure 3.6 Expression of four CSP genes at 8 h intervals during the development of the last larval instar of the Eastern spruce budworm, determined by Northern blotting
Figure 3.7 CSP gene expression during a premature molt
Figure 4.1 Western blot analysis of CfumAY426538 protein expression at eight hour time intervals during the development of the last larval instar of the ESB
Figure 4.2 Emission spectra of 1-NPN bound to CfumAY624538
Figure 4.3 Photobleaching of 1-NPN fluorescence after repeated measurements
Figure 4.4 Quenching of intrinsic tryptophan fluorescence by 1-NPN binding
Figure 4.5 Binding saturation of 1-NPN to CfumAY624538 and the corresponding Scatchard plot
Figure 4.6 Competitive binding of 11 different ligands with 1-NPN
Figure 4.7 Structure of chemicals used in binding assays
Figure 4.8 CAT activity in extracts of Sf-9 cells after transfection with a CAT reporter vector, either alone, or in combination with 15 µg of dsRNA co-transfected or amended into the media (soaking)
Figure 5.1 Nucleotide sequence and conceptually translated protein sequence of CfumSericotropin-like
Figure 5.2 A) Protein alignment of 4-Cys OBPs representing the orders Coleoptera, Diptera and Lepidoptera, and B) a subset of classical OBPs with the typical six conserved Cys residues
Figure 5.3 Neighbor joining distance phenogram of the 4-Cys OBPs from the orders Coleoptera, Diptera (suborder Brachycera) and Lepidoptera, and a subset of classical lepidopteran OBPs that retain all six conserved Cys residues
Figure 5.4 Expression of CfumSericotropin-like, determined by Northern blot analysis, at different developmental stages of the Eastern spruce budworm
Figure 5.5 CfumSericotropin-like gene expression, as determined by Northern blotting, during the development of the Eastern spruce budworm from early 5th instar larvae through to pupation
Figure 5.6 Expression of CfumSericotropin-like at 8 hour intervals during the development of the 6th larval instar of the Eastern spruce budworm, determined by Northern blotting
Figure 5.7 Expression of CfumSericotropin-like within the head, thorax and abdomen segments of 6th instar larvae (48-60 h old), and adults (24-72 h old), of the Eastern spruce budworm
Figure 5.8 Western blot analysis of CfumSericotropin-like protein expression during developmental stages of the Eastern spruce budworm
Figure 5.9 Western blot analysis of CfumSericotropin-like protein expression at eight hour intervals during the development of the 6th instar of the Eastern spruce budworm

List of Abbreviations

3-D three dimensional
3' three prime
4-Cys four cysteine
5' five prime
A260 absorbance at 260 nanometers
aa amino acid
ABPX(s) antennal binding protein X(s)
AMA 1-amino-anthracene
ANS 1,8-amino-naphthalene sulfonate
Arg arginine
ASA (+/-)-12-(9-anthroxyloxy) stearic acid
ASP1 antennal specific protein 1
bp base pair
BSA bovine serum albumin
°C degrees Celsius
C carbon in the context of fatty acid backbone chain length
CAT chloramphenicol acetyltransferase
CD circular dichroism
cDNA complementary DNA
CG conceptual gene
CNS central nervous system
cpm counts per minute
CSP chemosensory protein
C-terminal carboxy-terminal
Cys cysteine
d deoxynucleotide
DNA deoxyribonucleic acid
ds double stranded
DTT dithiothreitol
EDTA ethylenediaminetetraacetic acid
eIF-4E eukaryotic initiation factor 4E
ESB Eastern spruce budworm
EST(s) expressed sequence tag(s)
FABP fatty acid binding protein
g gram(s)
GFP green fluorescent protein
GOBP general odorant binding protein
GPCR(s) G-protein coupled receptor(s)
h hour
HCS head capsule slip
iLBP intracellular lipid binding protein
IPTG isopropyl-beta-D-thiogalactopyranoside
ir7 immune responsive gene 7
JH1 juvenile hormone 1
kb kilobase
kDa kilodalton
Kd dissociation constant
LB Luria-Bertani broth
Lys lysine
m milli
M molar in the context of concentration
MALDI-TOF matrix assisted laser desorption ionization - time of flight
MCS multiple cloning site
min minute(s)
ml millilitre
mM millimolar
MOPS 3-(N-morpholino) propane sulfonic acid
mRNA messenger RNA
ng nanogram
NMR nuclear magnetic resonance
NPN N-phenyl-1-naphthylamine
nt nucleotide
N-terminal amino-terminal
OBP(s) odorant binding protein(s)
OD595 optical density of absorption at 595 nanometers
oligo(s) oligonucleotide(s)
OR(s) odorant receptor(s)
ORF open reading frame
OS-D olfactory segment D
OS-E/F olfactory specific gene E/F
PAGE polyacrylamide gel electrophoresis
PBP(s) pheromone binding protein(s)
PBPRP pheromone binding protein related protein
PBS phosphate buffered saline
PCR polymerase chain reaction
PDB protein data bank
PebIII ejaculatory bulb protein III
Phk pherokine
ppm parts per million
PTGS post transcriptional gene silencing
RACE rapid amplification of cDNA ends
RNAi RNA interference
RNA(s) ribonucleic acid(s)
RNase ribonuclease
rpm revolutions per minute
RT-PCR reverse-transcriptase polymerase chain reaction
s seconds
SAP sensory appendage protein
SDS sodium dodecyl sulphate
Sf-9 Spodoptera frugiperda 9
ss single stranded
SSC sodium citrate solution
TAE Tris acetate EDTA
THP Tenebrio haemolymph protein
Tris Tris-hydroxymethyl amino methane
Trp tryptophan
Tyr tyrosine
µ micro
U units when referring to the amount of enzyme
V volts in the context of electrophoresis
WHC white head capsule

Acknowledgements

First and foremost I thank my family, Tish and Ryan, for supporting my efforts and enduring through the challenging periods. To my parents, Emma and Henry, who valued and encouraged education, and challenged me by the examples they set. I thank my two supervisors, Murray Isman and David Theilmann, for supporting this project and making it all possible.
I also thank my committee member Erika Plettner, and collaborator Qili Feng, for graciously inviting me into their labs during the completion of my thesis. During my visit to the Great Lakes Forestry Centre, Tim Ladd and Bill Tomkins kindly assisted my work, providing ideas, advice and technical supplies. Thanks to Ivy Ling and Nikki Honson at SFU, and Rob O'Brien and David Arkinstall at OUC, who all helped me along the way to completing my thesis. Bob McCron and associated staff at the GLFC insectary provided top quality insects, as did Nancy Brard at UBC. The "unrelenting" effort of Less Willis, who bore the brunt of my training, was greatly appreciated, along with the help that I received from all of my colleagues in the "DAT" lab. Many people have kindly advised me and provided ideas; these include Brian Ellis, Arthur Retnakaran, Dick Vogt and Hugh Robertson. I thankfully acknowledge the organizations that have supported my research, particularly the Killam Foundation for personal support, and Genome Canada for material support. Finally, a special thanks goes to Jim Thompson, Joyce Tom and Alina Yuhymets who have done a fantastic job of supporting students in the AgroEcology department; for this I am grateful.

CHAPTER ONE: Literature review

1.1 Introduction

Chemotaxis, the ability to sense and respond to a surrounding chemical environment, was a fundamental evolutionary development. Specialized neurons in multicellular eukaryotes detect chemical stimuli in the environment and convert them into electrical nerve impulses that travel to the central nervous system (CNS), a process termed signal transduction. Sensory neurons located in specialized organs, such as the chemosensilla of insect maxillae and antennae, and the taste buds and nasal epithelium of mammals, are responsible for the senses of taste (gustation) and smell (olfaction).
Sensory neurons are bipolar; the dendrites located in the sensory organs are exposed to the external environment, and are connected directly to the CNS by the axon, which transports the nerve impulse (Frazier, 1985). Nerve impulses are initiated when receptors in the sensory neuron dendrites bind specific chemical stimuli (Ronnett and Moon, 2002); identifying the molecular components of the signal transduction pathway in vertebrate and invertebrate organisms, specifically the neuronal receptors, has been a highly sought goal during the last decade. Olfactory receptors were first cloned and identified from rat sensory neurons by Buck and Axel (1991), who discovered a multigene subfamily of G-protein coupled receptors (GPCRs). This landmark breakthrough, along with the complete sequencing of the human genome and that of several model organisms, such as the fruit fly Drosophila melanogaster and the nematode Caenorhabditis elegans, led to the use of computational methods to identify several large gene families representing candidate gustatory and olfactory receptors that belong to the GPCR superfamily (for example, Troemel et al., 1995; Clyne et al., 1999; 2000). These studies revealed that odorant and gustatory receptor gene families were the largest known to exist in a given animal genome: up to 1000 different genes representing up to 1% of the genome (Mombaerts, 1999). Another striking feature was the low sequence similarity among olfactory and gustatory receptors both within an individual genome and between different species. The evolution of a large and diverse receptor gene family is believed to be an adaptation required to solve the challenging problem of recognizing and discriminating among thousands of chemically divergent stimuli (Buck and Axel, 1991; Mombaerts, 1999; Ronnett and Moon, 2002). Although structurally different, insect and vertebrate olfactory organs have similar features.
The sensory neurons of both are in direct contact with the external environment, separated only by a hydrophilic fluid: an aqueous mucosa that coats the vertebrate olfactory epithelium, and the sensillum lymph that fills the hollow, hair-like chemosensilla of insects. The existence of transport proteins was predicted based upon the fact that hydrophobic odorants must cross the hydrophilic mucosa/lymph of both vertebrates and insects before they become accessible to the GPCRs on the neuron membrane (Ronnett and Moon, 2002). Thus, odorant binding proteins (OBPs), thought to transport lipophilic ligands in both vertebrate and insect sensory organs, were discovered ten years prior to the neuron membrane bound receptors (Vogt and Riddiford, 1981; Pelosi et al., 1982; Pevsner et al., 1985; 1986). This thesis focuses on an insect specific family of odorant binding proteins, termed chemosensory proteins (CSPs). The following sections will review the current literature on the structure and function of insect sensilla and their associated odorant binding proteins. Examples of vertebrate olfactory mechanisms will be included for comparison, since they help to illustrate several common paradigms that have been conserved in the evolution of sensory organs.

1.2 Insect sensilla

1.2.1 General overview

Insect sensory neurons are housed within hollow hair-like structures termed sensilla, and anywhere from one to a few, or hundreds of thousands, can be located on an individual sensory organ (Frazier, 1985). The shape and design of sensilla are highly variable, and can only loosely be associated with function. In general, adult insects possess more sensilla than the larval stages; the majority function in olfaction and are located on the antennae.
Mixed olfactory and gustatory sensilla are also found on the adult proboscis and maxillary palps, and can also be found on various other organs such as the front tarsi, female ovipositor, and the wings (Bernays and Chapman, 1994).

1.2.2 Structure

While the structure of sensilla varies greatly, there are several common features (Figure 1.1). A bipolar sensory neuron (sometimes termed a sense cell) is housed within a hollow cuticular hair that may have one or many pores. The neuron dendrite, which contains the sensory receptor area, extends into the hollow sensillum, which is filled with an aqueous lymph, the only barrier between the odorant and gustatory receptors found on the neuron membrane and the external environment. This is often termed the peripheral olfactory system, since it is here that odorants first enter the sensory structure, and the primary events of signal transduction begin. The main body of the neuron lies immediately below the epidermis, where it is surrounded by two or more sheath cells; typically one inner sheath cell termed the trichogen, and an outer sheath cell termed the tormogen (Frazier, 1985). The sheath cells are believed to act as support cells, involved in developing and maintaining the sensillum structure and physiology, including components of the lymph.

Figure 1.1. General representation of an insect sensillum, adapted from Frazier (1985). The body of the neuron (N) is surrounded by inner (I) and outer (O) supporting sheath cells. The dendrite of the neuron is surrounded by the sensillum lymph (SL) and extends into the hollow, cuticular hair that is perforated by wall pores (WP).

Variations on this general sensillum structure are numerous, including variation in size and shape (for example, sunken in a pit, peg-like or hair-like) and the number of pores (uniporous vs multiporous). The cuticle may be single or double walled, and it may be thick or thin.
Sensilla may be attached to the epidermis either rigidly, or by flexible cuticle in a socket. A single sensillum may house one or several neurons, and the neuron dendrites may be singular or branched. These features have been used to classify sensilla into different categories (Frazier, 1985), which in some cases can be loosely associated with function. For example, many gustatory sensilla have a single pore at the tip of a peg-like sensillum, and are termed contact chemoreceptors, while many olfactory sensilla are long, multiporous, hair-like structures, designed to allow maximal diffusion of odorants into the sensillum lymph (Bernays and Chapman, 1994).

1.2.3 Function

Sensory organs such as insect antennae or the vertebrate nose contain thousands of sensory neurons that work collectively to detect chemical stimuli, and to discriminate both between different stimuli and between different levels of the same stimulus. For example, some insect sensilla are highly tuned and sensitive to one or a few chemical stimuli, such as sex pheromones, while other sensilla are less sensitive, and respond to a broad range of chemical stimuli, such as general plant odors (Bernays and Chapman, 1994). This information, sometimes referred to as the sensory code, must be relayed to the central nervous system. Recent evidence indicates that chemical stimuli may be coded in a two dimensional pattern in the brain. The axons of sensory neurons that express the same odorant receptor (OR) all terminate at the same discrete location within glomeruli that are found on the olfactory lobe, even though their dendrites (the receptor area) may have a scattered, indiscrete distribution in the sensory organ (Vosshall et al., 2000). In this way, a specific odor may be associated with a specific relay switch in the brain.
This literature review will focus on the peripheral sensory system, where the sensory neuron dendrites interface with the external environment to initiate the primary events of signal recognition. Stimulation of the sensory neuron results in a graded receptor potential; the strength of the stimulus (related to the quantity of the stimulant, among other factors) is correlated with the magnitude of the receptor potential. The receptor potential initiates an action potential that is not graded: the action potential in itself does not contain information about the quantity of a stimulus. It is the number of action potentials per unit time that is proportional to the magnitude of the stimulus (Bernays and Chapman, 1994). Even if the stimulus continues, sense cells become adapted and require a period without stimulation. Therefore, the physiological environment that surrounds the sensory receptors (the sensillum lymph and nasal mucosa, for example) is critical to the functioning of receptor potentials. Several events are required for signal transduction: a chemical signal must trigger a receptor potential; the signal must be accessible to the receptor; the chemical signal must be distinct from background noise; and the signal must be terminated. The detection of a chemical stimulus, and the subsequent initiation of a receptor potential, is primarily carried out by odorant and taste receptors, which belong to a large and diverse family of GPCRs, as many as 1000 in a single genome. GPCRs, with seven transmembrane domains, are located at the dendritic tips of the sensory neurons (Troemel et al., 1995; Clyne et al., 1999; 2000; Mombaerts, 1999). After binding to a specific ligand, GPCRs associate with, and activate, a G-protein that in turn activates a secondary messenger system, resulting in the formation of a membrane potential (Ronnett and Moon, 2002).
Some GPCRs are internalized by endocytosis after activation by a ligand; this may represent one mechanism of signal termination. Alternatively, biotransformation enzymes such as cytochrome P450 and glutathione-S-transferase have been found to be specifically associated with olfactory organs such as the antennae (Vogt and Riddiford, 1981; Rybczynski et al., 1989; Vogt et al., 1991; Rogers et al., 1999; Wang et al., 1999); these enzymes transform specific chemical signals (such as pheromones), making them more soluble for excretion and degradation. Biotransformation enzymes may also function in the degradation of deleterious chemicals that enter the sensory organ, and interfere with the detection of target stimuli (Rogers et al., 1999). Ensuring that sensory receptors on the neuron membrane have access to hydrophobic odorants is a selective pressure that both vertebrates and insects have had to adapt to, since the sensory neurons of both are surrounded, and separated from the external environment, by an aqueous, hydrophilic lymph/mucosa. This fact led to the theoretical prediction of lipophilic transport proteins that solubilize hydrophobic odorants in the lymph/mucosa, and the early discovery of such proteins in both insects and vertebrates (Vogt and Riddiford, 1981; Pelosi et al., 1982; Pevsner et al., 1985; 1986).

1.3 Insect odorant binding proteins and chemosensory proteins

1.3.1 Overview

Three different classes of lipophilic transport protein have been identified within sensory organs, two from insects and one from vertebrates.

Insect odorant binding proteins: Odorant binding proteins were first isolated from insect antennae by their association with hydrophobic pheromone ligands labeled with radioisotopes (Vogt and Riddiford, 1981).
Based upon their specific expression in the lymph of male olfactory sensilla that detect female produced pheromone (Vogt et al., 1991; Steinbrecht et al., 1992), and their ability to bind pheromone components in vitro (Vogt and Riddiford, 1981; Krieger et al., 1992; Du and Prestwich, 1995; Plettner et al., 2000; Campanacci et al., 2001), they were named pheromone binding proteins (PBPs) (Vogt et al., 2002). The hypothesis that PBPs transport hydrophobic pheromone odorants from the surface of the sensillum cuticle, across the hydrophilic sensillum lymph, to the sensory neuron receptors, has essentially remained unchanged since their discovery. Whether PBPs have an active role in olfactory coding by discriminating between different ligands, or simply play a passive role by making hydrophobic ligands soluble, remains a question. Following from this is the question of whether or not PBPs interact specifically with the neuron membrane receptors. Evidence both consistent with, and contradictory to, an active role for PBPs has been published during the last decade, and the specific role that PBPs play in olfaction remains unresolved. As discussed in the previous section, several important events occur in and around the sensory neuron dendrite, including signal termination. Alternative hypotheses (not necessarily exclusive of one another) include a role for PBPs in odorant clearance, degradation, or protection from degradation; these hypotheses also remain essentially unresolved. The size of this gene family within insects has been expanded greatly, both by the discovery of PBP-related sequences from diverse insect orders, and by the complete genome sequences of D. melanogaster and Anopheles gambiae. As new PBP-related sequences were discovered, they were given the general name of odorant binding protein in the absence of functional data (Vogt et al., 2002).
Along with the name, they were also assigned hypothetical functions analogous to that of the PBPs, namely the transport of hydrophobic ligands in sensillum lymph. Recent genetic studies have provided additional evidence for the significance of OBPs to olfactory function. The genome wide analysis of odorant receptor and OBP families in both insects and vertebrates provides a possible insight into the relationship between the two families. The genome of D. melanogaster encodes far fewer OR genes (~60) than that of the mouse (~1000), but many more OBP genes (51 as opposed to 4 in the mouse) (Hekmat-Scafe et al., 2002). This has led to speculation that in insects, OBPs may interact with ORs in a combinatorial fashion, to achieve a discriminatory power similar to that of 1000 vertebrate ORs (Hekmat-Scafe et al., 2002), although there is no experimental evidence as yet to support this. Further tantalizing, but indirect, evidence comes from the identification of a genetic locus (Gp-9) that is associated with the regulation of social structure in the fire ant, Solenopsis invicta. In response to chemical cues from the queen, worker ants regulate the number of queens per colony to one or three; the type of colony was correlated to a polymorphism in Gp-9, a locus that encodes an OBP (Krieger & Ross, 2002). The identification of the lush gene in D. melanogaster provides a more concrete example of the functional significance of an OBP. Lush was identified using a behavioural assay designed to detect fruit fly mutants that exhibited olfactory defects. The mutation of an OBP gene (termed lush) resulted in an inability to avoid high concentrations of ethanol, as compared to wild type flies (Kim et al., 1998; Kim and Smith, 2001). In both cases, however, contradictory results have also been published. For example, Gp-9 protein was isolated from the thorax of queen ants, rather than the antennae of worker ants (supplemental material, Krieger & Ross, 2002).
In the case of Lush, in vitro binding assays using purified protein could not confirm ethanol binding; rather, a chemical commonly found in plastics (and thus also in ethanol stored in plastic bottles) bound to the protein, and also had repellent effects in behavioural assays (Zhou et al., 2004).

Chemosensory proteins: More recently, a second group of small, soluble insect specific proteins has been isolated from sensory organs and termed OS-D-like, after the first member that was cloned from the olfactory segment of D. melanogaster antennae (McKenna et al., 1994). Additional members were subsequently isolated from the sensory organs (antennae, tarsi, palps) of Schistocerca gregaria, Periplaneta americana and Eurycantha calcarata, and termed chemosensory proteins since they were immunolocalized to the sensillar lymph of S. gregaria contact chemosensilla (Angeli et al., 1999; Picimbon & Leal, 1999; Marchese et al., 2000). At about the same time, five new CSPs were each reported from the cabbage moth, Mamestra brassicae (Nagnan-Le Meillour et al., 2000), and the sphinx moth, Manduca sexta (Robertson et al., 1999), isolated from proboscis and antennae, respectively. Based upon their association with sensory organs, a three dimensional (3-D) structure similar to that of OBPs, and the ability to bind typical odorant ligands in vitro, they have been hypothesized to have a function similar to that of OBPs in the transport of hydrophobic odorants (Bohbot et al., 1998; Angeli et al., 1999; Nagnan-Le Meillour et al., 2000; Jacquin-Joly et al., 2001; Briand et al., 2002; Lartigue et al., 2002; Monteforti et al., 2002; Campanacci et al., 2003). However, CSPs and OBPs do not share amino acid homology, and appear to originate from unrelated gene families. CSPs are less well characterized than the OBPs, due to their relatively recent discovery.

Vertebrate OBPs: Vertebrate OBPs will not be reviewed in detail here.
For the purpose of clarity, it should be noted that vertebrate OBPs are entirely unrelated to insect OBPs and CSPs. Vertebrate OBPs are members of the lipocalin superfamily, which has ancient evolutionary roots that originate in bacteria, and lineages that have continued within plants and most eukaryotes (Flower et al., 2000; Tegoni et al., 2000). Lipocalin OBPs will be discussed in the following sections only in sufficient detail to compare and contrast them with insect OBPs and CSPs.

1.3.2 Structure

Insect OBPs: The 3-D structures of six OBPs that represent five different insect orders have been solved by X-ray crystallography and/or nuclear magnetic resonance (NMR) spectroscopy (Table 1.1). The amino acid sequences of the six OBPs are highly divergent; some share only 11% identity. Remarkably, however, they all share a common protein fold (reviewed recently in Tegoni et al., 2004). In general, all are globular proteins typically composed of six amphipathic

hexadecenal & cineol | 3.2
2-bromo-naphthalene & benzyl-benzoate | NB
Polistes dominulus | CSP | 1-NPN | 2.2 | Fluorescent probe | Calvello et al., 2003
methyl laurate; tetradecanol; hexadecanol; palmitic; myristic; elaidic; stearic acid; oleic & dodecanol | 0.2, 0.3, 0.4, 0.4, 0.4, 0.5, 0.6, 0.7 & 0.9 | Competition with 1-NPN
lauric, octadecanol & pelargonic | 1.8, 3.2 & 7.1
Schistocerca gregaria | CSP-sg4 | 1-NPN | 4 | Fluorescent probe | Ban et al., 2002
1-AMA | NB
Phenylacetonitrile; anisole & guaiacol | NB | Competition with 1-NPN
cis-3-hexen-1-ol; 3-phenyl-1-propanol; 2-phenylethanol; menthol; decanal; 2,2-diquinoline; phenylacetonitrile; anisole; guaiacol; m-cresol; p-methylanisole; 2,6-dimethylaniline; 2-methylpyrazine; phenol THP & 1-hexanolTHP | NB
2-amylcinnamaldehyde | 9
2-methoxycinnamaldehyde; cinnamaldehyde ethylene; i-citronellal; 3,7-dimethyloctanol & 1-nonanol | 24, 33, 44, 44 & 58
γ-nonalactone; 1,2-dipyrazylethane; diphenylamine; 2,2'-dipyridyl; citralva & linalool | 98, 102, 124, 133, 133 & 133

Ligand binding assays indicate that vertebrate OBPs and insect CSPs generally bind hydrophobic chemicals in a relatively indiscriminate manner (Tegoni et al., 2000; Briand et al., 2002; Calvello et al., 2003; Campanacci et al., 2003; Table 1.2), while insect OBPs are more specific. PBPs are the best-characterized OBPs, and binding assays indicate that they can discriminate between similar chemicals, preferentially binding to pheromones and their derivatives (Du and Prestwich, 1995; Maibeche-Coisne et al., 1997; Campanacci et al., 2001). Some OBPs even appear to be able to discriminate between different isomers of the same chemical (Plettner et al., 2000; Calvello et al., 2003). However, most binding assays with PBPs have been conducted using radiolabeled ligands and a limited set of competing chemicals, and with the presupposition that the pheromone is the natural ligand. When AMA was used to characterize the binding specificity of M. brassicae PBP1 (MbraPBP1) and ApolPBP1, pheromones from several different species, as well as several fatty acids, were able to displace AMA (Campanacci et al., 2001). Thus, it seems clear that PBPs are more discriminatory, but not necessarily exclusively for the pheromone components. The biological significance of their discriminatory power (in relation to an active role in olfactory coding) remains an open question. While CSPs and vertebrate OBPs are relatively less specific, there is evidence that they can distinguish between different classes of chemicals, as was thoroughly demonstrated for three different rat OBPs (Lobel et al., 2002). CSPs from A. mellifera and M. brassicae appear to bind preferentially to medium chain-length fatty acid derivatives (Briand et al., 2002; Campanacci et al., 2003), while CSPs from S. gregaria and L. migratoria appear to bind larger, bulkier ligands (Ban et al., 2002; 2003).
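The competition assays discussed in this section measure how readily a test ligand displaces a fluorescent probe such as 1-NPN from the binding protein. The following is a minimal sketch of how a displacement measurement is commonly converted into an apparent dissociation constant. It assumes simple one-site competition; the function name, argument names and numbers are hypothetical and are not taken from any of the cited studies, which may have treated their data differently.

```python
def competitor_dissociation_constant(ic50_um, probe_conc_um, probe_kd_um):
    """Convert a competitor IC50 into an apparent dissociation constant.

    Assumption (not stated in the source text): one binding site, with the
    probe and the competitor binding in a mutually exclusive manner, so that
        K_i = IC50 / (1 + [probe] / K_probe)
    All concentrations are in micromolar.
    """
    if min(ic50_um, probe_conc_um, probe_kd_um) <= 0:
        raise ValueError("concentrations must be positive")
    return ic50_um / (1.0 + probe_conc_um / probe_kd_um)


# Hypothetical numbers, for illustration only (not data from any cited study).
ki = competitor_dissociation_constant(ic50_um=10.0, probe_conc_um=2.0, probe_kd_um=2.0)
print(f"apparent K_i = {ki:.1f} uM")
```

Under this assumption, a competitor that halves the probe signal at a low concentration has a small apparent dissociation constant, while a ligand that never displaces the probe would be recorded as non-binding.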
1.4 Summary and objectives

A large body of literature supports the sensory-specific function of the insect OBP family; much less evidence exists to support the sensory-specific function of the CSP family. The literature clearly indicates that CSP genes are broadly expressed in non-sensory tissues. Surprisingly, experimental results arising indirectly from various studies are beginning to indicate non-sensory expression of several OBPs. This raises an interesting question: what other physiological functions within insects require a large repertoire of binding proteins? However, experimental studies continue to focus on the sensory function of these proteins. The following thesis examines aspects of the non-sensory function of members of the chemosensory protein family. The Eastern spruce budworm (ESB), Choristoneura fumiferana (Clemens) (Lepidoptera: Tortricidae), was selected as a representative of the Lepidoptera since it is also the most economically significant pest of Canadian forests east of the Rocky Mountains. The subject is approached from three different levels of investigation. First, a bioinformatics approach is used to examine the entire CSP family as a whole within the class Insecta. New CSP gene sequences are identified from C. fumiferana and the ongoing A. mellifera and Drosophila pseudoobscura genome sequencing projects. Along with all other known CSP sequences entered on the NCBI (National Center for Biotechnology Information) database, they are examined for trends in amino acid homology, systematics and genomic organization. Next, the gene expression pattern of four new CSPs from C. fumiferana is characterized. Specifically, sex-, segment- and development-specific gene expression is analyzed. Finally, the function of one C. fumiferana CSP is examined in more detail by investigating protein expression patterns and characterizing binding preferences to various chemical ligands.
For comparison, a 4-Cys non-sensory OBP was also cloned from C. fumiferana, and its gene and protein expression pattern characterized. With this approach, several specific objectives are addressed, including:
1) Can a phylogenetic characterization of CSP sequences within the Insecta yield insights into their function, and provide a theoretical framework for further investigation?
2) What is the sex-, tissue- and development-specific expression pattern of CSP genes within the Eastern spruce budworm, and how might these patterns reflect on the non-sensory function of this gene family?
3) What is the protein expression pattern and ligand binding preference of a CSP from the Eastern spruce budworm, and what does it indicate about function in non-sensory tissue?
4) What is the gene and protein expression pattern of a 4-Cys non-sensory OBP, and how does it compare to that of the CSPs?

1.5 References

Angeli S., Ceron F., Scaloni A., Monti M., Monteforti G., Minnocci A., Petacchi R., and Pelosi P. (1999). Purification, structural characterization, cloning and immunocytochemical localization of chemoreception proteins from Schistocerca gregaria. Eur J Biochem 262: 745-54.
Arca B., Lombardo F., Lanfrancotti A., Spanos L., Veneri M., Louis C., and Coluzzi M. (2002). A cluster of four D7-related genes is expressed in the salivary glands of the African malaria vector Anopheles gambiae. Insect Mol Biol 11: 47-55.
Ban L., Scaloni A., Brandazza A., Angeli S., Zhang L., Yan Y., and Pelosi P. (2003). Chemosensory proteins of Locusta migratoria. Insect Mol Biol 12: 125-34.
Ban L., Zhang L., Yan Y., and Pelosi P. (2002). Binding properties of a locust's chemosensory protein. Biochem Biophys Res Commun 293: 50-4.
Bernays E. A., and Chapman R. F. (1994). Sensory systems. In Host-plant Selection by Phytophagous Insects, pp. 61-94. Chapman and Hall, New York.
Bohbot J., Sobrio F., Lucas P., and Nagnan-Le Meillour P. (1998).
Functional characterization of a new class of odorant-binding proteins in the moth Mamestra brassicae. Biochem Biophys Res Commun 253: 489-94.
Briand L., Swasdipan N., Nespoulous C., Bezirard V., Blon F., Huet J. C., Ebert P., and Pernollet J. C. (2002). Characterization of a chemosensory protein (ASP3c) from honeybee (Apis mellifera L.) as a brood pheromone carrier. Eur J Biochem 269: 4586-96.
Buck L., and Axel R. (1991). A novel multigene family may encode odorant receptors: a molecular basis for odor recognition. Cell 65: 175-87.
Butler M. J., Jacobsen T. L., Cain D. M., Jarman M. G., Hubank M., Whittle J. R., Phillips R., and Simcox A. (2003). Discovery of genes with highly restricted expression patterns in the Drosophila wing disc using DNA oligonucleotide microarrays. Development 130: 659-70.
Calvello M., Guerra N., Brandazza A., D'Ambrosio C., Scaloni A., Dani F. R., Turillazzi S., and Pelosi P. (2003). Soluble proteins of chemical communication in the social wasp Polistes dominulus. Cell Mol Life Sci 60: 1933-43.
Calvo E., deBianchi A. G., James A. A., and Marinotti O. (2002). The major acid soluble proteins of adult female Anopheles darlingi salivary glands include a member of the D7-related family of proteins. Insect Biochem Mol Biol 32: 1419-27.
Campanacci V., Krieger J., Bette S., Sturgis J. N., Lartigue A., Cambillau C., Breer H., and Tegoni M. (2001). Revisiting the specificity of Mamestra brassicae and Antheraea polyphemus pheromone-binding proteins with a fluorescence binding assay. J Biol Chem 276: 20078-84.
Campanacci V., Lartigue A., Hallberg B. M., Jones T. A., Giudici-Orticoni M. T., Tegoni M., and Cambillau C. (2003). Moth chemosensory protein exhibits drastic conformational changes and cooperativity on ligand binding. Proc Natl Acad Sci USA 100: 5069-74.
Christophides G. K., Mintzas A. C., and Komitopoulou K. (2000).
Organization, evolution and expression of a multigene family encoding putative members of the odorant binding protein family in the medfly Ceratitis capitata. Insect Mol Biol 9: 185-95.
Clyne P. J., Warr C. G., and Carlson J. R. (2000). Candidate taste receptors in Drosophila. Science 287: 1830-4.
Clyne P. J., Warr C. G., Freeman M. R., Lessing D., Kim J., and Carlson J. R. (1999). A novel family of divergent seven-transmembrane proteins: candidate odorant receptors in Drosophila. Neuron 22: 327-38.
Darwish Marie A., Veggerby C., Robertson D. H., Gaskell S. J., Hubbard S. J., Martinsen L., Hurst J. L., and Beynon R. J. (2001). Effect of polymorphisms on ligand binding by mouse major urinary proteins. Protein Sci 10: 411-7.
Du G., and Prestwich G. D. (1995). Protein structure encodes the ligand binding specificity in pheromone binding proteins. Biochemistry 34: 8726-32.
Flower D. R., North A. C., and Sansom C. E. (2000). The lipocalin protein family: structural and sequence overview. Biochim Biophys Acta 1482: 9-24.
Frazier J. L. (1985). Nervous system: sensory system. In Fundamentals of Insect Physiology (Blum M. S., ed.), pp. 287-356. John Wiley and Sons Inc., Athens, GA.
Fujii S., and Amrein H. (2002). Genes expressed in the Drosophila head reveal a role for fat cells in sex-specific physiology. EMBO J 21: 5353-63.
Galindo K., and Smith D. P. (2001). A large family of divergent Drosophila odorant-binding proteins expressed in gustatory and olfactory sensilla. Genetics 159: 1059-72.
Graham L. A., Brewer D., Lajoie G., and Davies P. L. (2003). Characterization of a Subfamily of Beetle Odorant-binding Proteins Found in Hemolymph. Mol Cell Proteomics 2: 541-9.
Graham L. A., and Davies P. L. (2002). The odorant-binding proteins of Drosophila melanogaster: annotation and characterization of a divergent gene family. Gene 292: 43-55.
Graham L. A., Tang W., Baust J. G., Liou Y. C., Reid T. S., and Davies P. L. (2001).
Characterization and cloning of a Tenebrio molitor hemolymph protein with sequence similarity to insect odorant-binding proteins. Insect Biochem Mol Biol 31: 691-702.
Gutierrez G., Ganfornina M. D., and Sanchez D. (2000). Evolution of the lipocalin family as inferred from a protein sequence phylogeny. Biochim Biophys Acta 1482: 35-45.
Hekmat-Scafe D. S., Scafe C. R., McKinney A. J., and Tanouye M. A. (2002). Genome-wide analysis of the odorant-binding protein gene family in Drosophila melanogaster. Genome Res 12: 1357-69.
Ishida Y., Chiang V. P., Haverty M. I., and Leal W. S. (2002). Odorant-binding proteins from a primitive termite. J Chem Ecol 28: 1887-93.
Jacquin-Joly E., Vogt R. G., Francois M. C., and Nagnan-Le Meillour P. (2001). Functional and expression pattern analysis of chemosensory proteins expressed in antennae and pheromonal gland of Mamestra brassicae. Chem Senses 26: 833-44.
Kane C. D., and Bernlohr D. A. (1996). A simple assay for intracellular lipid-binding proteins using displacement of 1-anilinonaphthalene 8-sulfonic acid. Anal Biochem 233: 197-204.
Kim M. S., Repp A., and Smith D. P. (1998). LUSH odorant-binding protein mediates chemosensory responses to alcohols in Drosophila melanogaster. Genetics 150: 711-21.
Kim M. S., and Smith D. P. (2001). The invertebrate odorant-binding protein LUSH is required for normal olfactory behavior in Drosophila. Chem Senses 26: 195-9.
Kitabayashi A. N., Arai T., Kubo T., and Natori S. (1998). Molecular cloning of cDNA for p10, a novel protein that increases in the regenerating legs of Periplaneta americana (American cockroach). Insect Biochem Mol Biol 28: 785-90.
Krieger J., Raming K., Prestwich G. D., Frith D., Stabel S., and Breer H. (1992). Expression of a pheromone-binding protein in insect cells using a baculovirus vector. Eur J Biochem 203: 161-6.
Krieger M. J., and Ross K. G. (2002). Identification of a major gene regulating complex social behavior. Science 295: 328-32.
Kruse S.
W., Zhao R., Smith D. P., and Jones D. N. (2003). Structure of a specific alcohol-binding site defined by the odorant binding protein LUSH from Drosophila melanogaster. Nat Struct Biol 10: 694-700.
Lartigue A., Campanacci V., Roussel A., Larsson A. M., Jones T. A., Tegoni M., and Cambillau C. (2002). X-ray structure and ligand binding study of a moth chemosensory protein. J Biol Chem 277: 32094-8.
Lartigue A., Gruez A., Briand L., Blon F., Bezirard V., Walsh M., Pernollet J. C., Tegoni M., and Cambillau C. (2004). Sulfur single-wavelength anomalous diffraction crystal structure of a pheromone-binding protein from the honeybee Apis mellifera L. J Biol Chem 279: 4459-64.
Lartigue A., Riviere S., Brossut R., Tegoni M., and Cambillau C. (2003). Crystallization and preliminary crystallographic study of a pheromone-binding protein from the cockroach Leucophaea maderae. Acta Crystallogr D Biol Crystallogr 59: 916-8.
Lee D., Damberger F. F., Peng G., Horst R., Guntert P., Nikonova L., Leal W. S., and Wuthrich K. (2002). NMR structure of the unliganded Bombyx mori pheromone-binding protein at physiological pH. FEBS Lett 531: 314-8.
Lobel D., Jacob M., Volkner M., and Breer H. (2002). Odorants of different chemical classes interact with distinct odorant binding protein subtypes. Chem Senses 27: 39-44.
Lobel D., Marchese S., Krieger J., Pelosi P., and Breer H. (1998). Subtypes of odorant-binding proteins; heterologous expression and ligand binding. Eur J Biochem 254: 318-24.
Maibeche-Coisne M., Sobrio F., Delaunay T., Lettere M., Dubroca J., Jacquin-Joly E., and Nagnan-Le Meillour P. (1997). Pheromone binding proteins of the moth Mamestra brassicae: specificity of ligand binding. Insect Biochem Mol Biol 27: 213-21.
Maida R., Krieger J., Gebauer T., Lange U., and Ziegelberger G. (2000). Three pheromone-binding proteins in olfactory sensilla of the two silkmoth species Antheraea polyphemus and Antheraea pernyi. Eur J Biochem 267: 2899-908.
Marchese S., Angeli S., Andolfo A., Scaloni A., Brandazza A., Mazza M., Picimbon J., Leal W. S., and Pelosi P. (2000). Soluble proteins from chemosensory organs of Eurycantha calcarata (Insects, Phasmatodea). Insect Biochem Mol Biol 30: 1091-8.
McKenna M. P., Hekmat-Scafe D. S., Gaines P., and Carlson J. R. (1994). Putative Drosophila pheromone-binding proteins expressed in a subregion of the olfactory system. J Biol Chem 269: 16340-7.
Mohanty S., Zubkov S., and Gronenborn A. M. (2004). The solution NMR structure of Antheraea polyphemus PBP provides new insight into pheromone recognition by pheromone-binding proteins. J Mol Biol 337: 443-51.
Mombaerts P. (1999). Seven-transmembrane proteins as odorant and chemosensory receptors. Science 286: 707-11.
Monteforti G., Angeli S., Petacchi R., and Minnocci A. (2002). Ultrastructural characterization of antennal sensilla and immunocytochemical localization of a chemosensory protein in Carausius morosus Brunner (Phasmida: Phasmatidae). Arthropod Structure & Development 30: 195-205.
Mosbah A., Campanacci V., Lartigue A., Tegoni M., Cambillau C., and Darbon H. (2003). Solution structure of a chemosensory protein from the moth Mamestra brassicae. Biochem J 369: 39-44.
Nagnan-Le Meillour P., Cain A. H., Jacquin-Joly E., Francois M. C., Ramachandran S., Maida R., and Steinbrecht R. A. (2000). Chemosensory proteins from the proboscis of Mamestra brassicae. Chem Senses 25: 541-53.
Oduol F., Xu J., Niare O., Natarajan R., and Vernick K. D. (2000). Genes identified by an expression screen of the vector mosquito Anopheles gambiae display differential molecular immune response to malaria parasites and bacteria. Proc Natl Acad Sci USA 97: 11397-402.
Paesen G. C., and Happ G. M. (1995). The B proteins secreted by the tubular accessory sex glands of the male mealworm beetle, Tenebrio molitor, have sequence similarity to moth pheromone-binding proteins. Insect Biochem Mol Biol 25: 401-8.
Paolini S., Scaloni A., Amoresano A., Marchese S., Napolitano E., and Pelosi P. (1998). Amino acid sequence, post-translational modifications, binding and labelling of porcine odorant-binding protein. Chem Senses 23: 689-98.
Park S. K., Shanbhag S. R., Wang Q., Hasan G., Steinbrecht R. A., and Pikielny C. W. (2000). Expression patterns of two putative odorant-binding proteins in the olfactory organs of Drosophila melanogaster have different implications for their functions. Cell Tissue Res 300: 181-92.
Pelosi P., Baldaccini N. E., and Pisanelli A. M. (1982). Identification of a specific olfactory receptor for 2-isobutyl-3-methoxypyrazine. Biochem J 201: 245-8.
Pelosi P., and Maida R. (1995). Odorant-binding proteins in insects. Comp Biochem Physiol B Biochem Mol Biol 111: 503-14.
Pevsner J., Sklar P. B., and Snyder S. H. (1986). Odorant-binding protein: localization to nasal glands and secretions. Proc Natl Acad Sci USA 83: 4942-6.
Pevsner J., Trifiletti R. R., Strittmatter S. M., and Snyder S. H. (1985). Isolation and characterization of an olfactory receptor protein for odorant pyrazines. Proc Natl Acad Sci USA 82: 3050-4.
Picimbon J. F., Dietrich K., Breer H., and Krieger J. (2000). Chemosensory proteins of Locusta migratoria (Orthoptera: Acrididae). Insect Biochem Mol Biol 30: 233-41.
Picimbon J. F., Dietrich K., Krieger J., and Breer H. (2001). Identity and expression pattern of chemosensory proteins in Heliothis virescens (Lepidoptera, Noctuidae). Insect Biochem Mol Biol 31: 1173-81.
Picimbon J., and Leal W. S. (1999). Olfactory soluble proteins of cockroaches. Insect Biochem Mol Biol 29: 973-8.
Plettner E. (2003). The peripheral pheromone olfactory system in insects: targets for species-selective insect control agents. In Insect Pheromone Biochemistry and Molecular Biology: The biosynthesis and detection of pheromones and plant volatiles (Blomquist G. J., and Vogt R. G., eds.), pp. 477-507. Elsevier, San Diego, CA.
Plettner E.
, Lazar J., Prestwich E. G., and Prestwich G. D. (2000). Discrimination of pheromone enantiomers by two pheromone binding proteins from the gypsy moth Lymantria dispar. Biochemistry 39: 8953-62.
Riviere S., Lartigue A., Quennedey B., Campanacci V., Farine J. P., Tegoni M., Cambillau C., and Brossut R. (2003). A pheromone-binding protein from the cockroach Leucophaea maderae: cloning, expression and pheromone binding. Biochem J 371: 573-9.
Robertson H. M., Martos R., Sears C. R., Todres E. Z., Walden K. K., and Nardi J. B. (1999). Diversity of odorant binding proteins revealed by an expressed sequence tag project on male Manduca sexta moth antennae. Insect Mol Biol 8: 501-18.
Rogers M. E., Jani M. K., and Vogt R. G. (1999). An olfactory-specific glutathione-S-transferase in the sphinx moth Manduca sexta. J Exp Biol 202 (Pt 12): 1625-37.
Ronnett G. V., and Moon C. (2002). G proteins and olfactory signal transduction. Annu Rev Physiol 64: 189-222.
Rothemund S., Liou Y. C., Davies P. L., Krause E., and Sonnichsen F. D. (1999). A new class of hexahelical insect proteins revealed as putative carriers of small hydrophobic ligands. Structure Fold Des 7: 1325-32.
Rybczynski R., Reagan J., and Lerner M. R. (1989). A pheromone-degrading aldehyde oxidase in the antennae of the moth Manduca sexta. J Neurosci 9: 1341-53.
Sabatier L., Jouanguy E., Dostert C., Zachary D., Dimarcq J. L., Bulet P., and Imler J. L. (2003). Pherokine-2 and -3. Eur J Biochem 270: 3398-407.
Sanchez D., Ganfornina M. D., and Bastiani M. J. (2000a). Lazarillo, a neuronal lipocalin in grasshoppers with a role in axon guidance. Biochim Biophys Acta 1482: 102-9.
Sanchez D., Ganfornina M. D., Torres-Schumann S., Speese S. D., Lora J. M., and Bastiani M. J. (2000b). Characterization of two novel lipocalins expressed in the Drosophila embryonic nervous system. Int J Dev Biol 44: 349-59.
Sandler B. H., Nikonova L., Leal W. S., and Clardy J. (2000).
Sexual attraction in the silkworm moth: structure of the pheromone-binding-protein-bombykol complex. Chem Biol 7: 143-51.
Shanbhag S. R., Hekmat-Scafe D., Kim M. S., Park S. K., Carlson J. R., Pikielny C., Smith D. P., and Steinbrecht R. A. (2001). Expression mosaic of odorant-binding proteins in Drosophila olfactory organs. Microsc Res Tech 55: 297-306.
Slavik J. (1982). Anilinonaphthalene sulfonate as a probe of membrane composition and function. Biochim Biophys Acta 694: 1-25.
Steinbrecht R. A., Ozaki M., and Ziegelberger G. (1992). Immunocytochemical localization of pheromone-binding protein in moth antennae. Cell Tissue Res 270: 287-302.
Storch J., Bass N. M., and Kleinfeld A. M. (1989). Studies of the fatty acid-binding site of rat liver fatty acid-binding protein using fluorescent fatty acids. J Biol Chem 264: 8708-13.
Sugiyama Y., Yamada T., and Kaplowitz N. (1982). Newly identified organic anion-binding proteins in rat liver cytosol. Biochim Biophys Acta 709: 342-52.
Tegoni M., Campanacci V., and Cambillau C. (2004). Structural aspects of sexual attraction and chemical communication in insects. Trends Biochem Sci 29: 257-64.
Tegoni M., Pelosi P., Vincent F., Spinelli S., Campanacci V., Grolli S., Ramoni R., and Cambillau C. (2000). Mammalian odorant binding proteins. Biochim Biophys Acta 1482: 229-40.
Troemel E. R., Chou J. H., Dwyer N. D., Colbert H. A., and Bargmann C. I. (1995). Divergent seven transmembrane receptors are candidate chemosensory receptors in C. elegans. Cell 83: 207-18.
Vierstraete E., Cerstiaens A., Baggerman G., Van den Bergh G., De Loof A., and Schoofs L. (2003). Proteomics in Drosophila melanogaster: first 2D database of larval hemolymph proteins. Biochem Biophys Res Commun 304: 831-8.
Vogt R. G. (2002). Odorant binding protein homologues of the malaria mosquito Anopheles gambiae; possible orthologues of the OS-E and OS-F OBPs of Drosophila melanogaster. J Chem Ecol 28: 2371-6.
Vogt R. G., Callahan F. E., Rogers M.
E., and Dickens J. C. (1999). Odorant binding protein diversity and distribution among the insect orders, as indicated by LAP, an OBP-related protein of the true bug Lygus lineolaris (Hemiptera, Heteroptera). Chem Senses 24: 481-95.
Vogt R. G., Prestwich G. D., and Lerner M. R. (1991). Odorant-binding-protein subfamilies associate with distinct classes of olfactory receptor neurons in insects. J Neurobiol 22: 74-84.
Vogt R. G., and Riddiford L. M. (1981). Pheromone binding and inactivation by moth antennae. Nature 293: 161-3.
Vogt R. G., Rogers M. E., Franco M. D., and Sun M. (2002). A comparative study of odorant binding protein genes: differential expression of the PBP1-GOBP2 gene cluster in Manduca sexta (Lepidoptera) and the organization of OBP genes in Drosophila melanogaster (Diptera). J Exp Biol 205: 719-44.
Vosshall L. B., Wong A. M., and Axel R. (2000). An olfactory sensory map in the fly brain. Cell 102: 147-59.
Wang Q., Hasan G., and Pikielny C. W. (1999). Preferential expression of biotransformation enzymes in the olfactory organs of Drosophila melanogaster, the antennae. J Biol Chem 274: 10309-15.
Wanner K. W., Willis L. G., Theilmann D. A., Isman M. B., Feng Q., and Plettner E. (2004). Analysis of the insect os-d-like gene family. J Chem Ecol 30: 889-911.
Willett C. S. (2000). Evidence for directional selection acting on pheromone-binding proteins in the genus Choristoneura. Mol Biol Evol 17: 553-62.
Willett C. S., and Harrison R. G. (1999). Pheromone binding proteins in the European and Asian corn borers: no protein change associated with pheromone differences. Insect Biochem Mol Biol 29: 277-84.
Xu P. X., Zwiebel L. J., and Smith D. P. (2003). Identification of a distinct family of genes encoding atypical odorant-binding proteins in the malaria vector mosquito, Anopheles gambiae. Insect Mol Biol 12: 549-60.
Zhou J. J., Zhang G. A., Huang W., Birkett M. A., Field L. M., Pickett J. A., and Pelosi P. (2004).
Revisiting the odorant-binding protein LUSH of Drosophila melanogaster: evidence for odor recognition and discrimination. FEBS Lett 558: 23-6.

CHAPTER TWO

Analysis of the insect chemosensory protein gene family*

* A version of this chapter has been published in the Journal of Chemical Ecology: Wanner, K.W., Willis, L.G., Theilmann, D.A., Isman, M.B., Feng, Q. and Plettner, E. (2004). Analysis of the insect os-d-like gene family. Journal of Chemical Ecology 30: 889-911.

2.1 Introduction

The insect chemosensory protein (CSP) gene family is represented by a group of small, highly soluble proteins with hydrophobic binding pockets. Named after the founding member that was cloned from the olfactory segment of D. melanogaster antennae, they have also been referred to as OS-D-like and sensory appendage proteins (SAPs) based upon their association with sensory organs (McKenna et al., 1994; Robertson et al., 1999). The crystal and NMR structures of MbraCSPA6 from the moth M. brassicae reveal a globular structure composed of six amphipathic α-helices that surround a hydrophobic binding pocket, and two disulfide bonds that form α-α loops (Lartigue et al., 2002; Mosbah et al., 2003). CSP proteins have been compared to insect odorant binding proteins (OBPs) that are similar in size, solubility and overall structure (OBPs have six amphipathic α-helices, joined by three disulfide bonds, that surround a hydrophobic binding pocket) (Rothemund et al., 1999; Sandler et al., 2000; Lee et al., 2002). In many (but not all) cases, OBPs are specifically expressed in the hydrophilic sensillum lymph that surrounds olfactory neurons (for example, Galindo and Smith, 2001; Shanbhag et al., 2001; Vogt et al., 2002) where they are involved in the transport of hydrophobic odorants. In contrast, CSP proteins are broadly expressed in various tissues (Kitabayashi et al., 1998; Picimbon et al., 2000; Jacquin-Joly et al., 2001;
Picimbon et al., 2001), including sensillum lymph in some cases (Angeli et al., 1999; Nagnan-Le Meillour et al., 2000; Monteforti et al., 2002). Some CSPs bind short to medium chain-length fatty acid derivatives with low specificity (Nagnan-Le Meillour et al., 2000; Jacquin-Joly et al., 2001; Briand et al., 2002; Lartigue et al., 2002; Campanacci et al., 2003), and a more general physiological function relating to the transport/solubility of hydrophobic ligands in various tissues has been proposed. In this chapter I report four new CSP sequences, three from the Eastern spruce budworm, C. fumiferana, and one from the cabbage looper, Trichoplusia ni. To provide a theoretical framework for further studies, the protein sequences were analyzed within the context of the insect CSP family as a whole. To accomplish this, I have expanded the number of known CSP-like sequences by identifying nine new members from the D. pseudoobscura and A. mellifera genome sequencing projects, and two from a B. mori expressed sequence tag (EST) database. Combined with GenBank sequences from the D. melanogaster and A. gambiae genomes, and cDNA from the insect orders Dictyoptera, Hymenoptera, Lepidoptera, Orthoptera and Phasmatodea, a total of 70 CSPs are analyzed in this chapter.

2.2 Results

2.2.1 Cloning CSP cDNAs

Four new CSP complementary DNAs (cDNAs) were cloned. I have simply named them according to their GenBank accession numbers, since a system of nomenclature has not yet been established for this gene family. Two unique sequences, CfumAY426538 & CfumAY426539 (Figure 2.1), were identified from a C. fumiferana larval cDNA library by a tBlastx (translated query versus a translated database) search of random sequences; both deduced amino acid sequences have four invariant Cys residues (CX6CX18CX2C) consistent with the CSP family (Figure 2.2).
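The four-Cys spacing described above can be checked directly against a deduced protein sequence. The sketch below is illustrative only: it expresses the spacing CX6CX18CX2C, and the broader CX6-8CX16-19CX2C spacing used for the database searches in this chapter, as regular expressions. The function and constant names are hypothetical, and the sequences are the conceptual translations as they appear in Figure 2.1.

```python
import re

# Strict spacing reported for the two library clones: C-X6-C-X18-C-X2-C
STRICT_CSP_CYS = re.compile(r"C.{6}C.{18}C.{2}C")
# Broader spacing used for the database searches: C-X(6-8)-C-X(16-19)-C-X2-C
BROAD_CSP_CYS = re.compile(r"C.{6,8}C.{16,19}C.{2}C")


def find_cys_motif(protein, pattern=STRICT_CSP_CYS):
    """Return (start, end, matched_text) for the first motif hit, or None."""
    match = pattern.search(protein)
    if match is None:
        return None
    return match.start(), match.end(), match.group(0)


# Conceptual translations copied from Figure 2.1 (stop codons omitted).
CFUM_AY426538 = (
    "LSCQLWWSWAVGQGTYTAENDDLDIDGIV"
    "KDPKKLQEWFGCFVDKSPCDNVQLSFKAD"
    "MPEAIREACAKCTTAQKGILKKFLVGLEE"
    "KAPADYEVFKKKYDSENKYIEPLKKAIA"
)
CFUM_AY426539 = (
    "MKSILYLVLTVVVTCSA"
    "QQQYYNNRYDNLNADSIVQNERVLLAYYK"
    "CVMDKGPCTKDGKNFKRVLPETLSTACAR"
    "CSPKQKGLVRTLLLGIRVKSEPRFNELLD"
    "KYDPDRSNRDDLYKFLVTGN"
)
# C-terminal fragment only, so the full four-Cys motif is not expected.
CFUM_AY426540_FRAGMENT = (
    "DTLPDALENECNKCTEKQKSGSDKVIRHL"
    "VNKRPEMWKELSVKYDPDHIYEGRYKDQI"
    "EKIKA"
)

for name, seq in [
    ("CfumAY426538", CFUM_AY426538),
    ("CfumAY426539", CFUM_AY426539),
    ("CfumAY426540 (fragment)", CFUM_AY426540_FRAGMENT),
]:
    print(name, find_cys_motif(seq), find_cys_motif(seq, BROAD_CSP_CYS))
```

A pattern match is only a screening step in this sketch; it does not replace the similarity searches and alignment used to assign sequences to the family.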
CfumAY426538
TTATCCTGTCAACTCTGGTGGTCCTGGGCCGTTGGGCAAGGGACGTATACGGCTGAGAATGACGACTTGGACATCGATGGCATCGTG
L S C Q L W W S W A V G Q G T Y T A E N D D L D I D G I V
AAAGATCCGAAGAAGCTGCAAGAATGGTTCGGCTGCTTCGTTGACAAGAGCCCTTGCGATAACGTGCAGCTTAGCTTCAAAGCTGAC
K D P K K L Q E W F G C F V D K S P C D N V Q L S F K A D
ATGCCAGAAGCAATCCGGGAAGCTTGCGCGAAATGCACCACGGCACAGAAGGGAATCTTGAAGAAATTCCTCGTAGGCCTCGAGGAG
M P E A I R E A C A K C T T A Q K G I L K K F L V G L E E
AAGGCCCCAGCTGATTACGAAGTGTTCAAGAAGAAATACGACTCTGAAAACAAATACATTGAGCCGCTTAAAAAAGCTATTGCATAA
K A P A D Y E V F K K K Y D S E N K Y I E P L K K A I A *
AATCGGGGTTAATACATGTACTAATAATACTCTCCCTGCCAACAATATAAGGCTTTCCAAGTTCAAGTAGCCATGAGAGTTATGAGA
GCGCCGTGGATCACAAAAATAAATAAAAATGGGCACAGCCTGAATATAAAGAACTTCGAAGATGCTTGTGTCTGT

CfumAY426539
GACTCGCCCGAAACCCGTACCAGGCAATCATAACCATGAAGAGTATACTCTATCTCGTGCTGACGGTGGTGGTGACCTGCTCGGCG
M K S I L Y L V L T V V V T C S A
CAGCAACAGTACTACAACAACCGATATGACAACCTCAACGCTGACTCTATCGTGCAGAACGAACGAGTTCTGCTCGCCTACTACAAG
Q Q Q Y Y N N R Y D N L N A D S I V Q N E R V L L A Y Y K
TGCGTTATGGACAAAGGACCTTGCACTAAGGACGGAAAAAACTTCAAACGCGTACTACCAGAGACGCTGTCAACAGCCTGCGCCCGC
C V M D K G P C T K D G K N F K R V L P E T L S T A C A R
TGCTCCCCGAAGCAGAAAGGACTCGTACGCACCCTCCTTCTCGGCATCAGGGTCAAAAGCGAGCCGCGTTTCAACGAACTTCTGGAC
C S P K Q K G L V R T L L L G I R V K S E P R F N E L L D
AAATACGACCCCGACCGCTCCAACCGGGATGACCTGTACAAGTTCTTAGTCACCGGCAACTAAATTTAAATACCACCGCGAACCGGA
K Y D P D R S N R D D L Y K F L V T G N *
ATAGTTGAGCCACTCGCGCTATTCATAATCCAGCCGGGTTATATACTCGTAGTAAAATATCGTCATTAATAATACCCGAAAAGCATA
CGGCCAAATTACGAACATTGGCATGTGGCCCAGCAATCCCGGTGCGCGACTGTA

CfumAY426540
GATACATTGCCTGATGCTTTGGAAAATGAGTGCAACAAATGCACGGAAAAACAGAAGTCGGGATCAGATAAGGTTATCAGGCATCTC
D T L P D A L E N E C N K C T E K Q K S G S D K V I R H L
GTTAACAAACGTCCCGAAATGTGGAAGGAGCTGTCGGTGAAGTACGACCCTGATCATATCTATGAAGGCAGGTACAAGGACCAGATT
V N K R P E M W K E L S V K Y D P D H I Y E G R Y K D Q I
GAGAAGATCAAGGCGTAGGAAGGGGAACTGATGTCTCCAAGGAGCACTGAAACTGTTGGACGTTTTGAGATTGGTTTTCCTTGTTTG
E K I K A *
TAATAAACGTCCTTTTACCAT

TniAY456191
GAAACTCTTCCCGATGCTCTCGAACACGAGTGCGTCAAGTGTACGGAGAAACAGAAGTCTGGCTCAGAGAAGGTGATCAGACACTTG
E T L P D A L E H E C V K C T E K Q K S G S E K V I R H L
GTGAACAGGCGTCCGGACTTGTGGAAGGAGCTGGCGACCAAGTATGACCCTGACAACATCTACCAAGACAGATACAAGGACAAGATC
V N R R P D L W K E L A T K Y D P D N I Y Q D R Y K D K I
CAAGCTGCCAAGGGCCAGTAACGTAGCAAGACTTAGCTGTGAACGTTCCACAGCTAGTGAATAGGTGATGGAAGTGTAAATTGTGAT
Q A A K G Q *
TAAAAACGCAGTTACCAAAGCTTTAATTGGAGTTTATTTCAATAAATTAATTTGTATCC

Figure 2.1. Nucleic acid sequence and conceptual translation of cDNA isolated from a C. fumiferana larval cDNA library (CfumAY426538 & CfumAY426539) and cloned using RT-PCR products amplified from C. fumiferana and T. ni using a redundant primer based on the conserved amino acid sequence C(T/S/A)(P/A/D/V)(D/E)(G/A)KELK (CfumAY426540 and TniAY456191). Start and stop codons are underlined; conserved Cys residues are bolded and underlined.

CfumAY426539 contains the complete open reading frame (ORF), while the amino acid sequence of CfumAY426538 begins within the putative signal peptide. Both sequences are unique, sharing more homology with A. gambiae CSPs than with other lepidopteran members (Blastp [protein-protein Blast] scores, e-10 and e-14, respectively). A redundant primer corresponding to the conserved amino acid sequence C(T/S/A)(P/A/D/V)(D/E)(G/A)KELK was used to amplify cDNA from a specific lepidopteran subclass. Two cDNA fragments, approximately 300 bp in length, were cloned from C. fumiferana and T. ni and found to encode peptides of 63 and 64 amino acids that are most similar by Blastp search to MsexSAP4 and HvirCSP2 (Figures 2.1 & 2.2, Table 2.1). The clones, termed CfumAY426540 and TniAY456191, encode the C-terminal halves of the protein.
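The conceptual translations shown in Figure 2.1 can be reproduced with the standard genetic code. The following is a minimal sketch under two stated assumptions: the standard (nuclear) codon table applies, and the open reading frame begins at the first ATG in the clone. The function names are hypothetical and this is not a description of the software actually used for the figure.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, to_stop=True):
    """Translate a DNA string in frame 1, optionally halting at the first stop."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*" and to_stop:
            break
        protein.append(amino_acid)
    return "".join(protein)


def translate_from_first_atg(dna):
    """Translate starting at the first ATG (a simplifying assumption)."""
    start = dna.upper().find("ATG")
    return "" if start == -1 else translate(dna[start:])


# First line of the CfumAY426539 cDNA as shown in Figure 2.1
# (5' untranslated region followed by the start of the ORF).
CFUM_AY426539_5PRIME = (
    "GACTCGCCCGAAACCCGTACCAGGCAATCATAACC"
    "ATGAAGAGTATACTCTATCTCGTGCTGACGGTGGTGGTGACCTGCTCGGCG"
)
print(translate_from_first_atg(CFUM_AY426539_5PRIME))
```

Taking the first ATG as the start codon is a shortcut that happens to agree with the figure for this clone; partial clones such as CfumAY426538, which begins within the signal peptide, would instead need to be translated in the reading frame established by alignment.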
2.2.2 Identification of CSPs from sequence databases

In total, 66 CSP sequences representing the insect orders Diptera, Dictyoptera, Hymenoptera, Lepidoptera, Orthoptera, and Phasmatodea were identified from sequence databases: 55 from GenBank (identified by a PSI-blast [position specific iterated Blast] search using all known CSP sequences, and a Blastp using the conserved Cys spacing motif CX6-8CX16-19CX2C as a PHI [pattern hit initiated] pattern), nine from the ongoing D. pseudoobscura and A. mellifera genome sequencing projects (constructed herein from trace files), and two from a B. mori EST database (Silkbase, WWW.ab.a.u-tokyo.ac.jp/silkbase/) (Table 2.1). Several A. gambiae CSP sequences retrieved from GenBank (Table 2.1) contained errors. AgamEAA12601, AgamEAA12322, AgamEAA12353, AgamEAA12338 and AgamEAA12591 all have additional peptide sequence at the N-terminus, and AgamEAA12702 is missing the C-terminal half. Genomic contigs were downloaded from the

Figure 2.2. Alignment of the CSP family. Shaded regions indicate conserved motifs, with conservative substitutions highlighted as white text on a black background. Aromatic residues at positions 27, 85 and 98 are marked with an *; the four conserved Cys residues are in bold text and are marked by an U. The sequence of MbraCSPA6, for which the crystalline and NMR structures have been solved (Lartigue et al., 2002; Mosbah et al., 2003), is italicized. Underlined sections of the ruler correspond to the six helices of MbraCSPA6. An -> signifies sequences translated from cDNA clones described herein; the names of new sequences identified herein from sequence databases are bolded.
[Figure 2.2: the alignment panels preceding the motif C panel are not recoverable from the extracted text.]
SgreCSP1 SgreCSP2 SgreCSP3 SgreCSP4 LmigCSPI-5 LmigCSPI-1 LmigOS-D3 LmigOS-D5 SgreAAP57461 LmigOS-D1 LmigOS-D2 LmigCSPII-10 EcalCSP3 EcalCSP1 EcalCSP2 AgamEAA12703 LmadCSP PameP10 AmelASP3C AgamSAP1 AgamIR7 AgamEAA12591 DmelPebIII DpseCSP3 DmelOS-D DpseCG4 AgamEAA12702 AgamEAA12322 DmelPHK3 DpseCG2 MsexSAP4 HvirCSP2 ->CfumAY426540 ->TniAY456191 HvirCSP1 MbraCSPB2 MbraCSPB3 MsexCSP3 BmorceN1900 BmorAV406169 MsexSAP5 BmorAV406021 MsexSAP7 HvirCSP3 HarmCSP1 MbraCSPA6 AipsCSP BmorCSP1 CcacCLP1 MsexSAP2 AmelCG5 AmelCG4 AmelCG6 PdomCSP LhumCSP1 AmelCG1 AmelCG2 DmelAAM68292 DpseCG1 AgamEAA12601 MsexSAP1 Bmorce2366 BmorAU004850 ->CfumAY426539 BmorAU000875 MsexSAP6 ->CfumAY426538 BmorCSP2 MsexSAP8
Ruler * Motif C *
NDKQKEGTKKVLKHLINHKPDIWAQLKAKYDPD3TYSKKYEDKEKELHE NDKQKEGTKKVLKHLINHKPDIWAQLKAKYDPD3TYSKKYEDKEKELHE NEKQKEGTKKVLKHLINHKPDIWAQLKAKYDPDGTYSKKYEDREKELHQ NEKQKEGTKKVLKHLINHKPDVWAQLKAKYDPDGTYSKKYEDREKELHQ NDKQKEGTKKVLRHFINNKPDVWQQLKAKYDPD3TYTKKYEDREKELHQ NEKQKEGTKKVLKHLINHKPDVWQKLKAKYDPD3TYSKKYEDREKELHQ NEKQKNGAEKVIRFLIKEKPDLWTPLEKKYDPN3TYRQKYGEELKKVSS NEKQKAGAEKVIRFLIKEKPDLWTPLEKKYDPTGSJRQKYDQELKRVSA NEKQKAGAEKVIRFLIKEKPDLWTPLESKYDPTGSYRQKYG*(PARTIAL ORF)* NEKQKAGAEKVIKFLVKEKPDLWEPLEKKYDPSGSFLRQKYGPELKKVSA NEKQKEGSNKVIRFLIQKKEDLWKPLQAKYDPEGTYLKKH-PEL--LSA NEKQKAGAEKVIKFLIKEKPDLWEPLEKKYDPTGSJRQKYDQELKRVSA SEKQKAGVETTIVFLIKNKPEVWESFKKKYDPTHKYQTFYDNLLKQAEEKAKSS SEKQKAGVETTIVFLIKNKPEIWESFKKKYDPTHKYEKIYERYIKQAEEKARKS SDRQKAIVKAIVEFLKKNKPDDLQKLVNKBDPDGSYRAKYGDSLEKIYS
SPIQKENALKIITRLYYDYPDQHRALRE^DPSGEYHP^JEEYLRGLQFNQI DRJ SDKQKNGTRRVLKFLIDNEPDRHKELENK|JDPEGTYRKKYEKEAKEYLS SDKQBAGAEKVINFLYNKKKPMWESLQKKYDPENTYVTKYADRLKELHD TDK^EVIKKVIKFLVENKPELWDSLANKYDPDKKYRVKBEEEAKKLGING SEKQBDGAIKVINYLIQNRKDQWDVLQKK||DPENKYLEKYRGQAQKEGIKLD SEKOJSGAIKVINYVIENRKEQWDALQKKYDPENLYVEKYREEAKKEGIKLE SEKQKSGTEKVINYLIDNRKDQWENLQKKYDPENIYVNKYREDAKKKGINL S E K Q f f l Q N T D K V I R Y I I E N K P E E W K Q L Q A K Y D P D E I Y I K H Y R A T A E A S G I K V SER3SNTDKVIRFIIDNKPEEWKQLQTK||DPEDIYIK[YRAQATNAGIKI TEKQGYGAEKVTRHLIDNRPTDWERLEKIYDPEGTYRIKYQEMKSKANEEP TEKQKIGAEKVTRHLIDNRPNDWERLEKIYDPEGTYRFKYLKSKANGNKSL SEKQBIGSDKVIKFIVANRPDDJAILEQLYDPTGEYRRKYMQSDALAEHVKQEDRDLS SEKQHTSSRKVIAHLEERKPQEWKKLLDKYDPEGIYKSKJJEKINKRS TEVQBKNSQKVINYLRANKAGEWKLLLNKYDPQGIYRAKHEGH TAAQJRNSEKVINILRSKYPGEWKQLLDKYDSKGIYRSKYEAAAKKQH TEKQKVGSEKVIRNLVNKRPALWKELSAKYDPNNLYQEKYKDKIDSIKGQ TEKQKAGSDKVIRYLVNKRQDLWKELSAKYDPNNIYQD TEKQKSGSDKVIRHLVNKRPEMWKELSVKYDPDHIYEG TEKQKSGSEKVIRHLVNRRPDLWKELATKYDPDNICQD TEAQKKGTRRVIGHLINNEADYWNELTA: TDAQKKGTRRVIAHLINHEEDFWNELTA IKDKIEAVKGQ IKDQIEKIKA YKDKIQAAKGQ IDPEKKYVQKYEKELKEVKA IDPERKHTAKYEKELKDIKE TETQKNGTRRVIGHLINHEDAYWKELTAKYDPQSKJTAKYEKELKEIKH TNAQKNGTRRVIQHLINHEPEYWQELGDKYDPERKYTVKYEKELREIKA TEAQKKGTRRVIGHLINNESKSWNELTAKYDPENKBTAKYEKELREIKA TKAQKGGTEKMIGHLINHEAEFWEELKAKYDPTNEJTKKYETELKRVTA TDAQKKAIRHVIKHLIEHEHDFWALLVEKYD|HRIYTTRYEAEMKRTMRSKEQMSE T D K Q K V S A R K I V K H I K Q H E A D Y W E Q M K A K Y D P K D E H K E I Y E G F L A G Q N S D K Q K Q G A R D V I Q H L E K H E P E Y ^ E L R A K Y D P N N E J TEAQEKGAYKVIEHLIKNELDIWRELTAKYDPKGDJ TEAQEKGAYKVIEHLIKNELDIWRELAAKYDPKGDJ TENQEKGA YRVIEHLIKNEIEIWREL TAKYDPTGH TENQEKGSYRVIEHLIKNELDLWRELCAKBDPTGEJ TEAQEKGAETSIDYLIKNELEIWKELTAHBDPDGK! 
TEAQEKGAYTVIEHLIKNEIEIWRQLADF TKPQEEGATKVIDFLIKNKLEVWRELVAL •ESTMRDFLAGKI IR-KYEDRARANGIQIPE JRKKYEDRARANGIQIPE MKKYEDRAKAAGIVIPEE JRQKYEDRARANGIEIPKD JRKKYEDRAKAKGIVIPE IDPERKYRKKYEDRARAKGIEIPE DPEGKHRKKYEDRARANGIVIPE NEKQKHTANKVVNYLKTKRPKDWERLSAKYDSTGEYKKRYEHGLQFAKNN SEKQKKIADKVVQFLIDNKPEIWVLLEAKYDPTGAYKQHYLSESS TSRQIGIANTLIPFMQQNYPYEWQLILRIYKIMKYY TEIQKTNFEKLAIWYNENRPDEWTALIKKFLMEDAKKQNS TERQKDGLEKVVVWYTENRPEEWSALVVHLIEEAKKQNITPVSGGFI TEIQKQNLDKLAEWFTTNEPEKWNHFVEIMIKKKDEGA SPEETRQIKKVLSHIQRTYPKEWSKIVQQYAGVS S PQQAQKAQKLTT FLQTRYPDVWAMLLRKYDSA S PQQAQKAQKLTT FLQTRYPDVWAMLIRKYQSV SPQQAQNAQKLTNFLQTRYPEVWAMLIRKYGAV TPEQKAVFEESMKILEEKFNND|JKEIIAKYA TEKQKANVRKVIKVIQQKHSTEWEKLVKKHDPSGKHRAD|DKFLLGS SNKQKAAFRTLLLAIRARSEPSBLELLDKYDPSRSNRELLYTFLATGL SPKQKGLVRTLLLGIRVKSEPRHNELLDKYDPDRSNRDDLYKFLVTGN TPAQKHLFKRFLEVVKDKLPQEHEAFKTKYDPQGKHFDALLSAVANS TDKQKHITKRYFEGLEEKYPELBQAFKNKYDPENKYFAALKAAIAKF TTAQKGILKKFLVGLEEKAPADFLEVFKKKYDSENKYIEPLKKAIA TDKQKQMAKQLAQGIKKTHPELWDEFITFYDPQGKYQTSHKDFLES THRQKENADLMIQYMEENRPADWNKLELKYDANETYGTILLDGDKKVTNGNHTSAEV 70 H4 80 H5 90 100 H6 11 42 Table 2.1. Summary list of all CSP sequences identified from the Insecta. OrdenFamily Species Protein Name1'2 Accession # Expression3 Binding Studies O. Diptera DmelPEBIII AAF47140 F. Drosophilidae Drosophila DmelOS-D * AAA21358 melanogaster DmelAAM68292 AAM68292 DmelPHK3 AAF47307 D. pseudoobscura DpseCGl to 4 F. Culicidae AgamEAA 12601 EAA12601 Anopheles gambiae AgamEAA 12322 EAA12322 AgamSAPl EAA12353 AgamIR7 EAA12338 AgamEAA12591 EAA12591 AgamEAA 12703 EAA 12703 AgamEAA 12702 EAA 12702 O. Hymenoptera Apis mellifera AmelCGl & 2 F. Apidae and 4 to 6 AmelASP3C AAN59784 F. Formicidae Linepithema humile LhumCSP AAN01363 F. Vespidae Polistes dominulus PdomCSPl AAP55719 O. Lepidoptera BmorCSPl AAM34276 F. Bombycidae Bombyx mori BmorCSP2 AAM34275 BmorAV406169 AV406169 Bmorce2366 ce2366 BmorceN1900 ceN1900 BmorAU004850 AU004850 BmorAV406021 AV406021 BmorAU000875 AU000875 F. 
Noctuidae Agrotis ipsilon AipsCSP AAP57460 Helicoverpa armigera HarmCSP AAK53762 Heliothis virescens HvirCSPl AAM77040 HvirCSP2 AAM77041 HvirCSP3 AAM77042 Mamestra brassicae MbraCSPA6** AAF71289 MbraCSPB2 AAF19653 MbraCSPB3 AAF71290 L La P W (Sabatier et al, 2003) A (McKenna et al., 1994) E L P (Sabatier et al, 2003) A B (Biessmann et al., 2002) B (Oduol et ai, 2000) > A (Briand et ai, 2002) A (Ishida et al, 2002) A A b H L T h (Picimbon et ai, 2000) Briand et al., 2002 A L T h (Picimbon et al., 2001) A H L T h (Picimbon et al., 2001) A L (Picimbon et al., 2001) A Pr Pg (Nagnan-Le Meillour et al., 2000) Pr (Nagnan-Le Meillour et al., 2000) A P g (Jacquin-Joly et al, 2001) see footnote Trichoplusia ni ->TniAY456191 AY456191 F. Pyralidae Cactoblastis cactorum CcacCLPl AAC47827 F. Sphingidae Manduca sexta MsexSAPl AF117574" MsexSAP2 AF117592 MsexSAP3 AF117585 MsexSAP4 AF117599-> MsexSAP5 AF117594 MsexSAP6 BEO15509 MsexSAP7 CA798851 MsexSAP8 CA798912 F. Tortricidae Choristoneura ->CfumAY426538 AY426538 fumiferana -»CfumAY426539 AY426539 -•CfumAY426540 AY426540 O. Dictyoptera F. Blaberidae Leucophaea maderae LmadCSP AAM77025 F. Blattidae Periplaneta americana PameP 10 AAB84283 O. Orthoptera Locusta migratoria LmigOS-Dl CAB65177 F. Acrididae LmigOS-D2 CAB65178 LmigOS-D3 CAB65179 LmigOS-D5 CAB65181 LmigCSPI-1 AAO16783 LmigCSPI-5 AAO16787 LmigCSPII-10 AAO 16793 Schistocerca gregaria SgreCSPl** AAC25399 SgreCSP2** AAC25400 SgreCSP3** AAC25401 SgreCSP4** AAC25402 SgreCSP5** AAC25403 SgreAAP57461 AAP57461 O. Phasmatodea EcalCSPl AAD30550 F. Phasmatidae Eurycantha calcarata EcalCSP2 AAD30551 EcalCSP3 AAD30552 L a (Maleszka & Strange, 1997) A (Robertson et al., 1999) Ban et al, 2003 Ban et al., 2002 1 -> signifies sequences translated from cDNA clones described herein, the names of new sequences identified herein from sequence databases are bolded. 2 * indicates expression associated with sensilla,** protein has been found in sensillum lymph. 
3 A=antennae, Ab=abdomen, B=body, C=cuticle, E=embryo, H=head, L=legs, La=labrum or labial palp, P=pupae, Pg=pheromone gland, Pr=proboscis, T=tarsi, Th=thorax, W=wings.
4 MbraCSPA6: the 3-D structure has been solved, and several binding studies have been conducted (Bohbot et al., 1998; Campanacci et al., 2003; Lartigue et al., 2002; Mosbah et al., 2003; Nagnan-Le Meillour et al., 2000).

Ensembl database (WWW.ensembl.org/Anopheles_gambiae/), and the genes and their translation products were predicted manually as outlined in the Materials and Methods section. EST sequences that code for AgamEAA12353 & AgamEAA12338 have been reported: AgamSAP1 (AAL84186; Biessmann et al., 2002) and AgamIR7 (AF283263; Oduol et al., 2000), respectively. The sequence of Agameaa12703 is anomalous. The coding region of the first exon (bordered at the 5' end by a stop codon) is incomplete and does not begin with an AUG/GUG start codon. The coding region of the second exon (terminated by a stop codon) is unusually long. Translation of Agameaa12703 results in a protein of 335 residues, the first 112 of which are homologous to CSPs (beginning within the putative signal peptide) (Figure 2.2). The additional protein sequence is not homologous to any sequence in GenBank, and it contains an unusually long stretch of threonine (Thr) residues. No sequence errors could be detected. MsexSAP5 (Table 2.1), translated from cDNA, is similarly anomalous. It has an extended C-terminus consisting of six imperfect repeats of a 14 amino acid motif, giving a total protein length of 231 residues (Robertson et al., 1999). Four CSP genes were identified from the unassembled D. pseudoobscura genome; the translated proteins are referred to as DpseCG1-4 (Figure 2.2).
Each gene was constructed using the following NCBI (National Center for Biotechnology Information) trace archive files: Dpsecg1 (149079898, 154972476, 155144441, 155211255 & 155212923); Dpsecg2 (151302570, 153344964, 155268885, 156510153 & 158760462); Dpsecg3 (153386764, 155134102, 167465427, 168214922 & 168250005); and Dpsecg4 (149219044, 159241862, 168274023, 169327986 & 182677706). Six genes were identified by tBlastn searches of the unassembled A. mellifera genome; the translated proteins are referred to as AmelCG1-6 (Figure 2.2). Each gene was constructed using the following trace files: Amelcg1 (165820474, 166380969 & 166555471); Amelcg2 (165857634, 165954160, 171052956, 173768694 & 173904508); Amelcg3 (159575920, 165888936, 166568511, 173284880 & 174014423); Amelcg4 (161249969, 166197062, 173383798, 173396536 & 173485622); Amelcg5 (160832583, 161246793, 165954381 & 174042462); and Amelcg6 (166299781, 166383082 & 174048292). Amelcg1 & 4 contain complete ORFs. Due to incomplete sequence coverage, Amelcg3 begins within the N-terminal coding region and Amelcg5 & 6 begin within the C-terminal coding region. A CSP protein identified from the antennae of A. mellifera (AmelASP3C, Briand et al., 2002, Table 2.1) is almost identical to the partial sequence of AmelCG3 (83/85 identities); for further analysis, AmelASP3C was used in place of AmelCG3. A start codon could not be identified for Amelcg2; however, sequence coverage in this region was poor.

2.2.3 Genomic organization and intron structure

All A. gambiae CSP genes are located within a 120 kilobase (kb) section of chromosome 3R, and four are clustered within a 20 kb region (Figure 2.3, Ensembl database, WWW.ensembl.org/Anopheles_gambiae/). DmelAAM68292 and DmelPhk3 are located within 5 kb of each other, and approximately 900 kb from DmelPebIII on chromosome 2R. Dmelos-d is located on chromosome 3L (Figure 2.3, Flybase, WWW.flybase.bio.indiana.edu/). An alignment of the D. melanogaster and D.
pseudoobscura genomes (WWW.pipeline.lbl.gov/) places Dpsecg1-4 in the same genomic locations as DmelAAM68292, DmelPhk3, DmelPebIII and Dmelos-d, respectively. Three A. gambiae and four Drosophila genes lack introns. All other known CSP genes from A. gambiae, A. mellifera, D. melanogaster, and D. pseudoobscura have single, typically small introns (Figure 2.3). The introns of Agameaa12703 and Agameaa12702 are significantly larger, approximately 7.0 and 19.6 kb, respectively.

[Figure 2.3 graphic not recoverable. Gene labels: A) A. gambiae, chromosome 3R: Agamsap1, Agamir7, Agameaa12591, Agameaa12702, Agameaa12703, Agameaa12322, Agameaa12601; B) D. melanogaster: DmelpebIII, Dmelaam68292, Dmelphk3, Dmelos-d; C) A. mellifera: Amelcg1.]

Figure 2.3. Genomic organization and intron structure of CSP genes. A) A. gambiae (Ensembl database, WWW.ensembl.org/Anopheles_gambiae/), B) D. melanogaster (Flybase, WWW.flybase.bio.indiana.edu/), and C) A. mellifera. A solid block ( __|) denotes coding regions with homology to CSP proteins; an ->? indicates that a start and/or stop codon was not identified; and / • • indicates incomplete sequence resulting in translation of a partial coding region. Introns are represented by a line joining two adjacent coding regions. Codons flanking the conserved intron splice site (located one nucleotide past a conserved Lys codon, AAA/G), and the equivalent position in genes lacking introns, are listed below the coding region/intron boundary. The location and intron structure of D. pseudoobscura genes are virtually identical to those of D. melanogaster based on an alignment of the two genomes (www.pipeline.lbl.gov/); therefore, only the differences have been noted (bracketed values following intron sizes and/or intron boundary sequence).
The intron splice site is conserved, always located one nucleotide past a conserved lysine (Lys) codon (Figure 2.3) that corresponds to amino acid position 48 in Figure 2.2. This Lys residue is conserved in all members of the CSP protein family identified to date, with only two exceptions: LmadCSP and MsexSAP1 have a conservative substitution of an arginine (Arg) for the Lys (Figure 2.2). The conserved splice site, therefore, may be a general characteristic of the CSP gene family. The genes without introns have likely lost them secondarily, since all retain the conserved sequence associated with the splice site (Figures 4.2 & 4.3).

2.2.4 Protein similarity groups

The protein alignment in Figure 2.2 clearly indicates that many members of the CSP protein family, representing at least six insect orders, have conserved several sequence motifs, including: A) N-terminal YTTKYDN(V/I)(N/D)(L/V)DEIL, B) central DGKELKXX(I/L)PDAL, and C) C-terminal KYDP. In addition, aromatic residues at positions 27, 85 and 98 that may be functionally important (see discussion) are also highly conserved, along with residues at positions 66/67 (glutamine/lysine) and 101/102 (lysine/tyrosine). In contrast, the sequences of 17 CSP proteins from the orders Diptera, Hymenoptera and Lepidoptera (but not the Dictyoptera, Orthoptera, and Phasmatodea) clearly diverge from the conserved motifs A - C identified in Figure 2.2. In addition, they fall into three general categories according to whether the two aromatic residues at positions 27 and 98 are retained: A) both are retained, B) position 98 has diverged, and C) both positions have diverged and the proteins are truncated at the C-terminus.

Figure 2.4. Neighbor joining distance phenogram of all known CSP protein sequences, collapsed to nodes with 50% or greater bootstrap support, n = 1000 replicates. Branch lengths are proportional and the scale represents percent sequence distance. An -> indicates sequences translated from cDNA reported herein; new sequences identified herein from sequence databases are bolded; and an * is used to label sequences that diverge from conserved motifs A - C identified in Figure 2.2.

[Figure 2.4 graphic not recoverable. Scale bar: 10% distance. Class labels: Lepidoptera Class 1, 2, 3 and 4; Orthoptera Class 1 (CX8CX18CX2C) and Class 2 (CX8CX18CX2C); Diptera Class 1 (lack introns) and Class 2; Diptera/Hymenoptera Class 1; Hymenoptera Class 1 (CX6CX19CX2C).]

Ten protein similarity classes were identified using the neighbor joining method to construct an unrooted distance phenogram representing all known CSP sequences (Figure 2.4). With the exception of two similarity classes from the Orthoptera, all are represented at, or higher than, the family level of taxonomy. Seven are characterized by retention of the conserved motifs A - C (Diptera Class 1 & 2, Lepidoptera Class 1-3 and Orthoptera Class 1 & 2) and three are characterized by divergence from these conserved motifs (Diptera/Hymenoptera Class 1, Hymenoptera Class 1 and Lepidoptera Class 4) (Figure 2.4).
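A distance phenogram of this kind starts from pairwise distances between aligned sequences. The sketch below shows the simplest such measure, the proportion of aligned positions that differ (p-distance). It is an illustration only: the distance measure used by the tree-building software is not stated here, and the function names `identity_counts` and `p_distance` are hypothetical. The example compares the first 58 residues of the CfumAY426540 and TniAY456191 peptides from Figure 2.1, which align without gaps.

```python
def identity_counts(seq_a, seq_b, gap="-"):
    """Return (identical, compared) over aligned positions without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    return identical, compared


def p_distance(seq_a, seq_b):
    """Proportion of compared positions that differ (0.0 to 1.0)."""
    identical, compared = identity_counts(seq_a, seq_b)
    if compared == 0:
        raise ValueError("no comparable positions")
    return 1.0 - identical / compared


# First 58 residues of the two C-terminal fragments in Figure 2.1.
cfum = "DTLPDALENECNKCTEKQKSGSDKVIRHL" + "VNKRPEMWKELSVKYDPDHIYEGRYKDQI"
tni = "ETLPDALEHECVKCTEKQKSGSEKVIRHL" + "VNRRPDLWKELATKYDPDNIYQDRYKDKI"

identical, compared = identity_counts(cfum, tni)
print(f"{identical}/{compared} identical, p-distance = {p_distance(cfum, tni):.3f}")
```

A full analysis would compute this value for every pair of sequences in the alignment and pass the resulting matrix to a neighbor joining implementation; that step is left out of the sketch.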
Of significance, each similarity class identified is homogeneous with respect to retention of, or divergence from, the conserved motifs. Diptera/Hymenoptera Class 1 (Figure 2.4), a group that diverges from the conserved motifs, is represented by two different insect orders. The grouping of Diptera Class 1 (Figure 2.4) is further supported by the fact that all the members (Agamsap1, Agamir7, Agameaa12591, DmelpebIII and Dpsecg3) lack introns (Figure 2.3). The fact that Agamsap1, Agamir7 and Agameaa12591 are clustered closely together, share 75-79% amino acid identity, and group together with only one homologous Drosophila member, may indicate that they resulted from gene duplication that occurred within the suborder Nematocera (A. gambiae), but not the suborder Brachycera (Drosophila), a taxonomic division that occurred early within the ancestral Diptera (Figure 2.5). However, it should be noted that members of Diptera Class 2 are mixed with regard to intron retention (Dmelphk3/Dpsecg2 lack introns while Agameaa12322 has an intron), and further data are required to establish evolutionary relationships. Divergence from the conserved Cys spacing pattern CX6CX18CX2C is uncommon, and when it does occur, it is associated with specific similarity classes. All orthopteran CSP proteins identified to date (representing a single taxonomic family) are characterized by the insertion of two additional residues between the first and second conserved Cys residues (CX8CX18CX2C) (Orthoptera Class 1 & 2, Figures 2.2 & 2.4). Hymenoptera Class 1, represented by three different taxonomic families, has an additional residue located between the second and third conserved Cys residues (CX6CX19CX2C). MsexSAP1 represents a single example of a deletion, between the second and third conserved Cys residues (CX6CX17CX2C) (Figure 2.2).
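The Cys spacing variants described above can be recognized programmatically. The following is a minimal sketch, assuming the general pattern CX6-8CX16-19CX2C is expressed as a regular expression; the names `CYS_MOTIF`, `cys_spacing` and `synthetic` are hypothetical, and the test peptides are made-up strings rather than real CSP sequences.

```python
import re

# Four conserved Cys separated by 6-8, 16-19 and 2 residues (CX6-8CX16-19CX2C).
CYS_MOTIF = re.compile(r"C(.{6,8})C(.{16,19})C(.{2})C")


def cys_spacing(protein):
    """Return the spacing label (e.g. 'CX6CX18CX2C'), or None if no match."""
    match = CYS_MOTIF.search(protein.upper())
    if match is None:
        return None
    first, second, third = (len(group) for group in match.groups())
    return f"CX{first}CX{second}CX{third}C"


def synthetic(first, second, third=2, filler="A"):
    """Build a made-up peptide with the requested Cys spacing (illustration only)."""
    return (
        "MK" + "C" + filler * first + "C" + filler * second
        + "C" + filler * third + "C" + "KYDP"
    )


for spacing in [(6, 18), (8, 18), (6, 19), (6, 17)]:
    print(spacing, cys_spacing(synthetic(*spacing)))
```

Real proteins may contain additional Cys residues within the spacers, in which case more than one spacing can satisfy the pattern; the sketch reports only the first match found by the regular expression engine.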
2.3 Discussion

CSP proteins share some features with insect OBPs (both are small, highly soluble proteins with hydrophobic binding pockets), but they do not share sequence homology and they represent two distinct classes. An association with sensory organs, including the sensillum lymph in some cases, led to the hypothesis that CSP proteins may be a new and different type of OBP (Bohbot et al., 1998; Angeli et al., 1999; Marchese et al., 2000; Nagnan-Le Meillour et al., 2000; Monteforti et al., 2002). Dmelos-d, the first to be discovered, was cloned from the olfactory segment of adult D. melanogaster antennae using subtractive cDNA methods (McKenna et al., 1994). Subsequently, CSP proteins were identified from the chemosensory organs (such as the antennae, labial palps, and proboscis) of several insect species (Table 2.1), including Cactoblastis cactorum (Maleszka and Stange, 1997), Periplaneta americana, P. fuliginosa, Blattella germanica (Picimbon and Leal, 1999), Schistocerca gregaria (Angeli et al., 1999), Locusta migratoria (Picimbon et al., 2000), M. brassicae (Nagnan-Le Meillour et al., 2000), Manduca sexta (Robertson et al., 1999), Eurycantha calcarata (Marchese et al., 2000), A. mellifera (Briand et al., 2002), A. gambiae (Biessmann et al., 2002) and Polistes dominulus (Ishida et al., 2002). The immuno-histological localization of CSP proteins to the sensillum lymph that surrounds sensory neurons (Angeli et al., 1999; Nagnan-Le Meillour et al., 2000; Monteforti et al., 2002) supported an olfactory function for these proteins, along with the demonstration that a CSP protein from antennae was able to bind pheromone components (Bohbot et al., 1998; Jacquin-Joly et al., 2001).
However, in contrast to a purely olfactory function, CSP proteins have also been isolated from non-chemosensory organs (Table 2.1) (Kitabayashi et al., 1998; Picimbon et al., 2000; Jacquin-Joly et al., 2001; Picimbon et al., 2001), and in conjunction with evidence indicating a broad ligand binding specificity (Nagnan-Le Meillour et al., 2000; Jacquin-Joly et al., 2001; Briand et al., 2002; Lartigue et al., 2002; Campanacci et al., 2003), a more general physiological function relating to the transport of hydrophobic molecules in various tissues has been proposed. PameP10 was isolated from regenerating cockroach legs at a concentration 30 times greater than in mature legs, and was associated with the developing epidermis (Kitabayashi et al., 1998). CSP genes are expressed generally in the thoraces, abdomen, legs and heads of two lepidopteran species (Picimbon et al., 2000; Picimbon et al., 2001) and MbraCSPA6 is expressed in the pheromone gland (Jacquin-Joly et al., 2001). Ligand binding studies indicate that AmelASP3C and MbraCSPA6 bind short to medium chain length (14-18 carbon) fatty acids and their derivatives with dissociation constants in the µM range, in a non-specific manner (Briand et al., 2002; Campanacci et al., 2003; Jacquin-Joly et al., 2001; Lartigue et al., 2002; Nagnan-Le Meillour et al., 2000). However, two orthopteran proteins, S. gregaria CSP4 and L. migratoria CSPII-10, are able to bind larger ligands, such as the fluorescent reporter 1-NPN (Ban et al., 2002; Ban et al., 2003). In contrast, PBPs, one type of insect OBP, also bind fatty acid derivatives with Kd values in the µM range, but are able to distinguish between different fatty acids (Maibeche-Coisne et al., 1997; Maida et al., 2000; Plettner et al., 2000). Recent screening-based studies that have detected differential expression of D. melanogaster CSP genes raise more questions about the function of this family.
DmelpebIII and Dmelphk3 were identified as putative targets of the clock transcription factor that regulates circadian rhythms (Claridge-Chang et al., 2001; McDonald and Rosbash, 2001) and Dmelphk3 as a target of the dorsal transcription factor, involved in embryo and tissue development (Stathopoulos et al., 2002). Additionally, DmelpebIII and Dmelphk3 were found to be immune responsive when challenged with virus and bacteria, respectively; functions ranging from tissue repair to the recognition of invading pathogens were suggested (Sabatier et al., 2003). Similarly, transcription of Agamir7 increased six hours after adult mosquitoes were challenged with bacterial lipopolysaccharide (Oduol et al., 2000). Interestingly, DmelpebIII and Agamir7 are both members of Diptera Class 1 identified in Figure 2.4. Finally, DmelpebIII was also identified as a candidate gene responsible for a smell impaired mutant phenotype (Anholt and Mackay, 2001). Clearly, further research is required to determine the specific functions of CSP proteins.

Herein I have cloned four new members of the insect CSP gene family, and identified 11 more from sequence databases, allowing the identification of four new protein similarity classes (Lepidoptera Class 1 & 4, Hymenoptera Class 1 and Diptera/Hymenoptera Class 1) (Figure 2.4). Orthoptera Class 1 and 2, and Lepidoptera Class 2 and 3, have been described previously (Ban et al., 2003; Jacquin-Joly et al., 2001; Picimbon et al., 2001). Each similarity class was characterized by the retention of, or divergence from, several highly conserved motifs. Although even a single residue substitution can have profound effects on protein function, the conserved motifs must impart some level of common function to members that retain them. The recently solved crystalline and NMR structure of MbraCSPA6 from the moth M.
brassicae (Campanacci et al., 2003; Lartigue et al., 2002; Mosbah et al., 2003) provides an insight into functional constraints that may be contributing to the conservation of some residues. MbraCSPA6 is a small globular protein composed of six amphipathic alpha helices that surround an internal hydrophobic pocket; the four conserved Cys residues form two disulphide bonds that create α-α loops (Figure 2.6, Lartigue et al., 2002; Mosbah et al., 2003). Three amino acid positions (27, 85 and 98, Figure 2.2) are highly conserved as aromatic residues within the CSP family. In MbraCSPA6, a tyrosine (Tyr) residue located at position 27, and a tryptophan (Trp) residue at position 98, may act as gates to the hydrophobic pocket (Lartigue et al., 2002; Mosbah et al., 2003) (Figures 2.2 and 2.6). Evidence indicates that the Trp residue at position 85 of AmelASP3C faces the binding pocket and may interact with the ligand (Briand et al., 2002). Several residues that are located at either mouth of the hydrophobic pocket of MbraCSPA6 are also conserved within the OS-D family. These include residues at positions 12, 14 and 17 of Figure 2.2 (N-terminal mouth) and 66/67 and 101 (C-terminal mouth) (Lartigue et al., 2002; Mosbah et al., 2003). Thus, residues involved in ligand binding may be functionally conserved. Conserved motifs exposed on the surface of the protein could be involved in protein regulation and/or interactions. The N-terminal sequence of MbraCSPA6, where the conserved motif A, YTTKYDN(V/I)(N/D)(L/V), is located, is predicted to form an extended region (although less ordered) when in solution (Figure 2.6A, Mosbah et al., 2003). Motif B, DGKELKXX(I/L)PDAL, spans the third amphipathic α-helix; within this region, the aspartic acid (D), glutamic acid (E) and lysine (K) residues are exposed at the surface of the protein, whereas the remaining hydrophobic residues contribute to the hydrophobic pocket.
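The bracket notation used for motifs A to C maps directly onto a regular expression, which makes it easy to test whether a given sequence retains a motif. The sketch below is an illustration under stated assumptions: the helper `motif_to_regex` is hypothetical, it treats X as any residue and (V/I) as a choice between single residues, and it was not part of the original analysis.

```python
import re


def motif_to_regex(motif):
    """Convert notation such as 'DGKELKXX(I/L)PDAL' into a compiled regex.

    'X' matches any residue; '(A/B)' matches one residue from the listed set.
    """
    parts = []
    i = 0
    while i < len(motif):
        char = motif[i]
        if char == "(":
            close = motif.index(")", i)
            options = motif[i + 1:close].split("/")
            parts.append("[" + "".join(options) + "]")
            i = close + 1
        elif char == "X":
            parts.append(".")
            i += 1
        else:
            parts.append(re.escape(char))
            i += 1
    return re.compile("".join(parts))


motif_b = motif_to_regex("DGKELKXX(I/L)PDAL")
motif_c = motif_to_regex("KYDP")

# C-terminal residues of the CfumAY426540 peptide (Figure 2.1).
cfum_c_terminus = "VNKRPEMWKELSVKYDPDHIYEGRYKDQIEKIKA"

print(motif_b.pattern)
print(motif_c.search(cfum_c_terminus) is not None)
```

Exact matching of this kind is strict: a sequence with a single conservative substitution inside a motif will not match, so in practice a tolerance for mismatches or a profile-based search may be more appropriate.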
CSP proteins that have diverged from the conserved motifs share less sequence identity with MbraCSPA6; homology modeling, however, predicts that they form similar 3-D structures. For example, MsexSAP1 and Diptera/Hymenoptera Class 1 (Figure 2.4) lack both putative aromatic gate residues and are truncated at the C-terminal end (Figure 2.2). The homology model of MsexSAP1, however, indicates that the general 3-D structure of the protein is maintained, even with the loss of the 6th α-helix (Figure 2.6C and E). Interestingly, the conserved aromatic residue at position 85 (which faces the internal pocket of AmelASP3C) appears to be able to take a position where it could act as a gate, similar to the aromatic group at position 98 of proteins that retain the 6th α-helix (Figure 2.6B). Thus, the basic 3-D structure of MbraCSPA6 may be conserved within all CSP classes identified herein, including those that have diverged from the conserved motifs. Circular dichroism (CD) spectrum and NMR data from AmelASP3C and SgreCSP4 indicate that they have the same general fold and disulphide bonding pattern (Briand et al., 2002; Picone et al., 2001).

Figure 2.5. Evolutionary tree of the Insecta (from The Tree of Life Web Project, WWW.tolweb.org/tree/phylogeny.html). Orders from which CSP proteins have been identified are represented in bolded text.
[Tree diagram. Groupings shown: Insecta, Pterygota, Neoptera, Hemipteroid Complex, Endopterygota. Taxa shown: Archaeognatha, Thysanura, Ephemeroptera, Odonata, Plecoptera, Embiptera, Phasmatodea, Orthoptera, Mantophasmatodea, Zoroptera, Dictyoptera, Dermaptera, Grylloblattodea, Megaloptera, Raphidioptera, Neuroptera, Coleoptera, Diptera (Nematocera, Brachycera), Mecoptera, Siphonaptera, Lepidoptera, Trichoptera, Hymenoptera.]

Figure 2.6. Ribbon drawings of: A) One of 20 energy minimized NMR structures of MbraCSPA6, Protein Data Bank (PDB) (Berman et al., 2000) # 1K19 (Mosbah et al., 2003), B) The crystal structure of MbraCSPA6, PDB # 1KX9 (Lartigue et al., 2002), and C) MsexSAP1 as determined by homology modeling with the crystal structure of MbraCSPA6 (PDB # 1KX9). α-Helices 1 through 6 are coloured in succession from the N-terminus: dark blue, light blue, dark green, light green, yellow and red, respectively. Disulphide bonds formed by Cys residues at positions 30 & 39 and 59 & 62 (Figure 2.2) are displayed in pink. The aromatic side chains of residues 27 & 98 are coloured gold (in the case of Figure 2.6C, the aromatic side chain at position 85 is displayed; position 98 and the 6th α-helix are missing). D) Cross section view of a space filling model of MbraCSPA6 in complex with 12-bromododecanol (pink) (PDB # 1N8V, Lartigue et al., 2002). Residues coloured blue are predicted to have more than 30% of their surface area exposed (accessible) to the surrounding solvent. The aromatic residues at positions 27 & 98 are coloured green and gold, respectively. E) Cross section view of MsexSAP1, as determined by homology modeling with the crystal structure of MbraCSPA6 (PDB # 1KX9). The images and initial models were created using Deep View / Swiss-PdbViewer 3.7 software (Guex and Peitsch, 1997); models were optimized by Swiss-Model software on the ExPASy molecular biology server (WWW.us.expasy.org).

The presence of highly conserved motifs among members representing the orders Diptera, Hymenoptera and Lepidoptera, as well as the Dictyoptera, Orthoptera and Phasmatodea, indicates that a common ancestral CSP gene predated the ancestral Neoptera (Figure 2.5). Therefore, similar genes should be discovered in other neopteran orders, such as the Coleoptera and Hemiptera. The degree of diversification that may have occurred prior to the ancestral Neoptera is not clear; many of the CSP similarity classes may have diverged within specific orders, but at least one class appears to have diverged prior to the Diptera/Hymenoptera taxonomic division.
While most classes are represented at the family or higher level of taxonomy, Orthoptera Class 1 & 2, characterized by high levels of sequence identity, provide some evidence for more recent diversification; if this is the case, then these genes may be clustered in the genome.

The D. melanogaster genome has 51 different OBP genes, but their deduced amino acid sequences share very little homology (Graham and Davies, 2002; Hekmat-Scafe et al., 2002; Vogt et al., 2002). A large family of divergent proteins may be an adaptation required for the binding of diverse odorant ligands (Vogt et al., 1999). By comparison, the genomes of D. melanogaster, D. pseudoobscura, A. gambiae and A. mellifera have fewer CSP genes (4, 4, 7 and 6, respectively). Also, it has been noted that CSP proteins from diverse taxa share 40 to 60% identity or more (Angeli et al., 1999; Marchese et al., 2000; Nagnan-Le Meillour et al., 2000), as compared to OBPs, which share 10 to 38% identity (Vogt et al., 1999). Sequence identities of 40% can be largely accounted for by the conserved A-C motifs described herein (for example, up to 40 conserved residues between proteins with about 110 total residues). It should be noted that protein classes identified herein that diverge from the conserved A-C motifs share less homology with other family members (Figure 2.4). For example, the median sequence identity (range in parentheses; variable signal peptide removed) of AmelCG2, CfumAY426538 and MsexSAP1 with other family members (outside of their similarity class) is 18 (14-27), 25 (18-30) and 21 (16-27)%, respectively. Whether CSP proteins have functions similar to that of OBPs remains to be determined. Regardless, it will be interesting to investigate the function of proteins that have been found to occur both in sensory (sensillum lymph) and non-sensory tissues, and that share amino acid motifs (and presumably functional constraints) that have been conserved across diverse insect orders.
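The identity figures discussed above can be reproduced with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the choice to score identity over aligned, non-gap columns is an assumption, since the exact denominator used in this chapter is not stated.

```python
from statistics import median

def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither aligned sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)

def summarize(identities):
    """Median and range of a list of pairwise identities, as reported in the text."""
    return median(identities), min(identities), max(identities)

# About 40 conserved motif residues in a protein of about 110 residues
# account for roughly 36% identity on their own.
print(round(100.0 * 40 / 110, 1))

# Toy alignment (hypothetical sequences): 4 identical residues over 5 gap-free columns.
print(percent_identity("ACDE-F", "ACDQGF"))
```

Under this convention, a class whose members have lost the A-C motifs would be expected to fall well below the 36% contributed by the motifs alone, which is in line with the medians of 18 to 25% reported above.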
2.4 Materials and methods

2.4.1 Nomenclature

New protein sequences were initially reported as OS-D-like, due to homology with the founding member isolated from the antennae of D. melanogaster (McKenna et al., 1994). Subsequently, they have been named chemosensory proteins (Angeli et al., 1999), sensory appendage proteins (SAP) (Robertson et al., 1999) and, most recently, pherokines (Sabatier et al., 2003). Because nomenclature standards have not yet been established for this gene family, I have used the nomenclature associated with each published sequence where available; otherwise, the GenBank accession number is used to refer to unnamed sequences. Genes that we have identified herein from ongoing genome sequencing projects, and that have not yet been annotated, are referred to as conceptual genes (CG).

2.4.2 Molecular cloning of CSP cDNAs

Second instar C. fumiferana larvae were supplied from a colony continuously maintained by the Insect Production Unit at the Great Lakes Forestry Centre, Sault Ste. Marie, Ontario, Canada. Larvae were reared on artificial diet at 24°C and a 16 hr light: 8 hr dark photoperiod, through to pupation and adult emergence. A cDNA library of C. fumiferana was constructed using the Uni-ZAP XR vector (Stratagene, La Jolla, CA, USA) with mRNA isolated from larvae that were molting from the fifth to the sixth instar. cDNA clones were randomly selected and sequenced from the 5' ends to generate ESTs. The ESTs were annotated based on Blastn and Blastx searches against the non-redundant GenBank database. Clones identified as CSP were further sequenced using forward and reverse sequencing primers (Barr Pharmaceuticals Inc.), the Taq Dye Deoxy Terminator Cycle Sequencing kit (Applied Biosystems) and an automatic DNA sequencer (model 310, Applied Biosystems Inc).

Total RNA was extracted in guanidinium isothiocyanate, from the head (including antennae, proboscis and labial palps) and front tarsi of adult C.
fumiferana (mixed male and female), and purified by acid phenol/chloroform extraction (Sambrook et al., 1989). First strand cDNA was synthesized from 5 micrograms (µg) of total RNA using an anchored oligo deoxynucleotide (d) T primer: 5' GCGCCGCGGCCGCT(n)(CT)(ACG) 3'. Reaction conditions were: 200 units (U) Superscript II reverse transcriptase in 1x first strand buffer (Gibco), 0.5 millimolar (mM) deoxy-nucleotide triphosphate (dNTP) mix, 10 U cloned RNase inhibitor (Invitrogen) and 500 nanograms (ng) oligo dT primer in 20 µl total volume, with a 50 minute (min) incubation at 42 degrees Celsius (°C). For the reverse transcriptase - polymerase chain reaction (RT-PCR), a truncated dT primer (5' GCGCCGCGGCCGCTT 3') was used in combination with a redundant primer designed to match a conserved amino acid sequence. The CSP redundant primer A (5' TGC A[CG]T CCT GA[CG] GG[ACGT] AAA GA[AG] CT[CT] AA 3') was designed to match the conserved amino acid sequence C(T/S/A)(P/A/D/V)(D/E)(G/A)KELK. An alignment of cDNA sequences was used to estimate codon preference and to reduce the redundancy of the primer. PCR reaction conditions were: 2 µl of first strand cDNA, 1.25 U Taq DNA polymerase in 1x Taq buffer (Invitrogen), 2.0 mM MgCl2, 0.4 mM truncated oligo dT primer and 4 mM redundant primer, in 50 µl total volume. PCR reactions were amplified using a Geneamp 2400 thermo-cycler (Perkin Elmer) as follows: one cycle of 2 min at 94°C; 30 cycles of 30 seconds (s) at 94°C, 40 s at 50°C and 1 min at 72°C; and finally one cycle of 7 min at 72°C. PCR products were gel purified using Qiex beads (Qiagen) and blunt-end cloned into pBluescript II KS (GenBank # ARBL2KSM) cut with the EcoRV restriction enzyme (Invitrogen). Plasmid clones were sequenced in both directions. A PCR product was also cloned from T. ni as outlined above (insects were supplied from a colony continuously maintained on artificial diet at the laboratory of Dr. M. Isman).
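The relationship between redundant primer A and the amino acid sequence it targets can be checked with a short script. This is a minimal sketch under stated assumptions: the primer is read from its 5' end in the coding frame, the standard genetic code is used, and the helper names are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Redundant primer A from the text, with spaces removed.
PRIMER_A = "TGCA[CG]TCCTGA[CG]GG[ACGT]AAAGA[AG]CT[CT]AA"

def parse_degenerate(primer):
    """Split the primer into per-position base choices, e.g. 'A', 'CG', 'T'."""
    return [tok.strip("[]") for tok in re.findall(r"\[[ACGT]+\]|[ACGT]", primer)]

def expand(primer):
    """List every non-degenerate oligonucleotide present in the primer mix."""
    return ["".join(combo) for combo in product(*parse_degenerate(primer))]

def translate(dna):
    """Translate complete codons from the first base; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

variants = expand(PRIMER_A)
peptides = sorted({translate(v) for v in variants})
print(len(variants), peptides)
```

The primer mix contains 64 distinct oligonucleotides, yet they encode only four peptides (C, T or S, P, D or E, G, K, E, L). This illustrates how codon preference was used to cover only part of the full C(T/S/A)(P/A/D/V)(D/E)(G/A)KELK consensus in exchange for lower redundancy.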
2.4.3 Identification of CSP proteins from sequence databases

GenBank non-redundant and EST databases were searched using PSI Blast, Blastp with a PHI pattern for conserved Cys residues (CX6CX16-19CX2C) and Blastn (Altschul et al., 1997). All known CSP sequences were used as query sequences. EST databases available on the World Wide Web (WWW) were also searched: SilkBase (WWW.ab.a.u-tokyo.ac.jp/silkbase/) and the Honey Bee Brain EST Project (WWW.titan.biotec.uiuc.edu/bee/honeybee_project.htm). D. pseudoobscura and A. mellifera genome trace files were downloaded from the NCBI trace archive and searched for nucleotide sequences that coded for CSP protein sequences using a stand-alone implementation of tBlastn. Individual trace files were assembled (minimum 5x coverage where possible) into contiguous sequences containing the complete ORF.

2.4.4 Gene predictions and conceptual translation

Contiguous sequences were used as tBlastn queries against the GenBank non-redundant database to identify general exon/intron boundaries by homology with known CSP ORFs. Intron splice sites were identified by the conserved intron start (GT) and end (AG) sequences. Coding regions were combined to form conceptual ORFs that were translated using the standard genetic code. In the case of cDNA sequences, and genes without introns, the ORF was identified by translating a continuous sequence between a start and stop codon. Conceptual protein sequences were then assessed for the characteristic features of the CSP protein family: four invariant Cys residues, sequence homology with known members, and a length of 110 - 160 amino acid residues.

2.4.5 Protein similarity groups

Putative signal peptides were identified by comparison with mature CSP proteins reported in the literature. For analysis (and calculation of protein sequence identities), proteins were truncated at a point 28 residues prior to the first conserved Cys residue to eliminate the highly variable signal peptides.
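The feature checks and the truncation rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure actually run for this chapter: the Cys spacing is written as a regular expression with 16 to 19 residues between the second and third Cys, the function name is hypothetical, and the example sequence is the conceptual translation of CfumAY701858 copied from Figure 3.1A.

```python
import re

# Spacing of the four conserved Cys residues (assumed reading of the PHI pattern).
CYS_PATTERN = re.compile(r"C.{6}C.{16,19}C.{2}C")

# Conceptual translation of CfumAY701858 (Figure 3.1A), one string per figure row.
CFUM_AY701858 = (
    "MRSCIVFACLLVS"
    "VFAAEKYNSKYDNFDVETLISND"
    "RLLKSYVNCFLDKGRCTPEGTDF"
    "KKTLPDAVETTCAKCTDKQKTNI"
    "KKVIKAIQTRHPRQWDELVKKND"
    "PTGKHIVNFNKFIES"
)

def assess_csp(protein, min_len=110, max_len=160, upstream=28):
    """Check CSP features; return None if the Cys spacing pattern is absent."""
    match = CYS_PATTERN.search(protein)
    if match is None:
        return None
    first_cys = match.start()           # 0-based index of the first conserved Cys
    cut = max(0, first_cys - upstream)  # keep 28 residues upstream of that Cys
    return {
        "length_ok": min_len <= len(protein) <= max_len,
        "first_conserved_cys": first_cys + 1,  # 1-based residue number
        "truncated": protein[cut:],
    }

result = assess_csp(CFUM_AY701858)
print(result["length_ok"], result["first_conserved_cys"], len(result["truncated"]))
```

Note that the two Cys residues in the signal peptide of this sequence do not satisfy the spacing pattern, so the search correctly anchors on the first conserved Cys at residue 45 and the truncated sequence begins 28 residues upstream of it.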
Truncated protein sequences were aligned using CLUSTAL_X (Thompson et al., 1997); an unrooted neighbor joining distance phenogram was constructed with bootstrap support using PAUP* 4.0 Beta Version 10 Windows interface (Swofford, 2002).

2.5 References

Altschul S. F., Madden T. L., Schaffer A. A., Zhang J., Zhang Z., Miller W., and Lipman D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389-402.
Angeli S., Ceron F., Scaloni A., Monti M., Monteforti G., Minnocci A., Petacchi R., and Pelosi P. (1999). Purification, structural characterization, cloning and immunocytochemical localization of chemoreception proteins from Schistocerca gregaria. Eur J Biochem 262: 745-54.
Anholt R. R., and Mackay T. F. (2001). The genetic architecture of odor-guided behavior in Drosophila melanogaster. Behav Genet 31: 17-27.
Ban L., Scaloni A., Brandazza A., Angeli S., Zhang L., Yan Y., and Pelosi P. (2003). Chemosensory proteins of Locusta migratoria. Insect Mol Biol 12: 125-34.
Ban L., Zhang L., Yan Y., and Pelosi P. (2002). Binding properties of a locust's chemosensory protein. Biochem Biophys Res Commun 293: 50-4.
Berman H. M., Westbrook J., Feng Z., Gilliland G., Bhat T. N., Weissig H., Shindyalov I. N., and Bourne P. E. (2000). The protein data bank. Nucleic Acids Res 28: 235-242.
Biessmann H., Walter M. F., Dimitratos S., and Woods D. (2002). Isolation of cDNA clones encoding putative odourant binding proteins from the antennae of the malaria-transmitting mosquito, Anopheles gambiae. Insect Mol Biol 11: 123-32.
Bohbot J., Sobrio F., Lucas P., and Nagnan-Le Meillour P. (1998). Functional characterization of a new class of odorant-binding proteins in the moth Mamestra brassicae. Biochem Biophys Res Commun 253: 489-94.
Briand L., Swasdipan N., Nespoulous C., Bezirard V., Blon F., Huet J. C., Ebert P., and Pernollet J. C. (2002). Characterization of a chemosensory protein (ASP3c) from honeybee (Apis mellifera L.) as a brood pheromone carrier. Eur J Biochem 269: 4586-96.
Campanacci V., Lartigue A., Hallberg B. M., Jones T. A., Giudici-Orticoni M. T., Tegoni M., and Cambillau C. (2003). Moth chemosensory protein exhibits drastic conformational changes and cooperativity on ligand binding. Proc Natl Acad Sci USA 100: 5069-74.
Claridge-Chang A., Wijnen H., Naef F., Boothroyd C., Rajewsky N., and Young M. W. (2001). Circadian regulation of gene expression systems in the Drosophila head. Neuron 32: 657-71.
Galindo K., and Smith D. P. (2001). A large family of divergent Drosophila odorant-binding proteins expressed in gustatory and olfactory sensilla. Genetics 159: 1059-72.
Graham L. A., and Davies P. L. (2002). The odorant-binding proteins of Drosophila melanogaster: annotation and characterization of a divergent gene family. Gene 292: 43-55.
Guex N., and Peitsch M. C. (1997). SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 18: 2714-2723.
Hekmat-Scafe D. S., Scafe C. R., McKinney A. J., and Tanouye M. A. (2002). Genome-wide analysis of the odorant-binding protein gene family in Drosophila melanogaster. Genome Res 12: 1357-69.
Ishida Y., Chiang V., and Leal W. S. (2002). Protein that makes sense in the Argentine ant. Naturwissenschaften 89: 505-7.
Jacquin-Joly E., Vogt R. G., Francois M. C., and Nagnan-Le Meillour P. (2001). Functional and expression pattern analysis of chemosensory proteins expressed in antennae and pheromonal gland of Mamestra brassicae. Chem Senses 26: 833-44.
Kitabayashi A. N., Arai T., Kubo T., and Natori S. (1998). Molecular cloning of cDNA for p10, a novel protein that increases in the regenerating legs of Periplaneta americana (American cockroach). Insect Biochem Mol Biol 28: 785-90.
Lartigue A., Campanacci V., Roussel A., Larsson A. M., Jones T. A., Tegoni M., and Cambillau C. (2002). X-ray structure and ligand binding study of a moth chemosensory protein. J Biol Chem 277: 32094-8.
Lee D., Damberger F. F., Peng G., Horst R., Guntert P., Nikonova L., Leal W. S., and Wuthrich K. (2002). NMR structure of the unliganded Bombyx mori pheromone-binding protein at physiological pH. FEBS Lett 531: 314-8.
Maibeche-Coisne M., Sobrio F., Delaunay T., Lettere M., Dubroca J., Jacquin-Joly E., and Nagnan-Le Meillour P. (1997). Pheromone binding proteins of the moth Mamestra brassicae: specificity of ligand binding. Insect Biochem Mol Biol 27: 213-221.
Maida R., Krieger J., Gebauer T., Lange U., and Ziegelberger G. (2000). Three pheromone-binding proteins in olfactory sensilla of the two silkmoth species Antheraea polyphemus and Antheraea pernyi. Eur J Biochem 267: 2899-908.
Maleszka R., and Stange G. (1997). Molecular cloning, by a novel approach, of a cDNA encoding a putative olfactory protein in the labial palps of the moth Cactoblastis cactorum. Gene 202: 39-43.
Marchese S., Angeli S., Andolfo A., Scaloni A., Brandazza A., Mazza M., Picimbon J., Leal W. S., and Pelosi P. (2000). Soluble proteins from chemosensory organs of Eurycantha calcarata (Insects, Phasmatodea). Insect Biochem Mol Biol 30: 1091-8.
McDonald M. J., and Rosbash M. (2001). Microarray analysis and organization of circadian gene expression in Drosophila. Cell 107: 567-78.
McKenna M. P., Hekmat-Scafe D. S., Gaines P., and Carlson J. R. (1994). Putative Drosophila pheromone-binding proteins expressed in a subregion of the olfactory system. J Biol Chem 269: 16340-7.
Monteforti G., Angeli S., Petacchi R., and Minnocci A. (2002). Ultrastructural characterization of antennal sensilla and immunocytochemical localization of a chemosensory protein in Carausius morosus Brunner (Phasmida: Phasmatidae). Arthropod Structure & Development 30: 195-205.
Mosbah A., Campanacci V., Lartigue A., Tegoni M., Cambillau C., and Darbon H. (2003). Solution structure of a chemosensory protein from the moth Mamestra brassicae. Biochem J 369: 39-44.
Nagnan-Le Meillour P., Cain A. H., Jacquin-Joly E., Francois M. C., Ramachandran S., Maida R., and Steinbrecht R. A. (2000). Chemosensory proteins from the proboscis of Mamestra brassicae. Chem Senses 25: 541-53.
Oduol F., Xu J., Niare O., Natarajan R., and Vernick K. D. (2000). Genes identified by an expression screen of the vector mosquito Anopheles gambiae display differential molecular immune response to malaria parasites and bacteria. Proc Natl Acad Sci USA 97: 11397-402.
Picimbon J., and Leal W. S. (1999). Olfactory soluble proteins of cockroaches. Insect Biochem Mol Biol 29: 973-978.
Picimbon J. F., Dietrich K., Breer H., and Krieger J. (2000). Chemosensory proteins of Locusta migratoria (Orthoptera: Acrididae). Insect Biochem Mol Biol 30: 233-41.
Picimbon J. F., Dietrich K., Krieger J., and Breer H. (2001). Identity and expression pattern of chemosensory proteins in Heliothis virescens (Lepidoptera, Noctuidae). Insect Biochem Mol Biol 31: 1173-81.
Picone D., Crescenzi O., Angeli S., Marchese S., Brandazza A., Ferrara L., Pelosi P., and Scaloni A. (2001). Bacterial expression and conformational analysis of a chemosensory protein from Schistocerca gregaria. Eur J Biochem 268: 4794-801.
Plettner E., Lazar J., Prestwich E. G., and Prestwich G. D. (2000). Discrimination of pheromone enantiomers by two pheromone binding proteins from the gypsy moth Lymantria dispar. Biochemistry 39: 8953-62.
Robertson H. M., Martos R., Sears C. R., Todres E. Z., Walden K. K., and Nardi J. B. (1999). Diversity of odourant binding proteins revealed by an expressed sequence tag project on male Manduca sexta moth antennae. Insect Mol Biol 8: 501-18.
Rothemund S., Liou Y. C., Davies P. L., Krause E., and Sonnichsen F. D. (1999). A new class of hexahelical insect proteins revealed as putative carriers of small hydrophobic ligands. Structure Fold Des 7: 1325-32.
Sabatier L., Jouanguy E., Dostert C., Zachary D., Dimarcq J. L., Bulet P., and Imler J. L. (2003). Pherokine-2 and -3. Eur J Biochem 270: 3398-407.
Sambrook J., Fritsch E. F., and Maniatis T. (1989). Molecular cloning: A laboratory manual, 2nd edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.
Sandler B. H., Nikonova L., Leal W. S., and Clardy J. (2000). Sexual attraction in the silkworm moth: structure of the pheromone-binding-protein-bombykol complex. Chem Biol 7: 143-51.
Shanbhag S. R., Hekmat-Scafe D., Kim M. S., Park S. K., Carlson J. R., Pikielny C., Smith D. P., and Steinbrecht R. A. (2001). Expression mosaic of odorant-binding proteins in Drosophila olfactory organs. Microsc Res Tech 55: 297-306.
Stathopoulos A., Van Drenth M., Erives A., Markstein M., and Levine M. (2002). Whole-genome analysis of dorsal-ventral patterning in the Drosophila embryo. Cell 111: 687-701.
Swofford D. L. (2002). PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer Associates, Sunderland, Massachusetts.
Thompson J. D., Gibson T. J., Plewniak F., Jeanmougin F., and Higgins D. G. (1997). The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25: 4876-82.
Vogt R. G., Callahan F. E., Rogers M. E., and Dickens J. C. (1999). Odorant binding protein diversity and distribution among the insect orders, as indicated by LAP, an OBP-related protein of the true bug Lygus lineolaris (Hemiptera, Heteroptera). Chem Senses 24: 481-95.
Vogt R. G., Rogers M. E., Franco M. D., and Sun M. (2002). A comparative study of odorant binding protein genes: differential expression of the PBP1-GOBP2 gene cluster in Manduca sexta (Lepidoptera) and the organization of OBP genes in Drosophila melanogaster (Diptera). J Exp Biol 205: 719-44.
CHAPTER THREE

Developmental expression patterns of four chemosensory protein genes from the Eastern spruce budworm, Choristoneura fumiferana*

* A version of this chapter has been accepted by the Journal of Insect Molecular Biology: Wanner, K.W., Theilmann, D.A., Isman, M.B., Feng, Q. and Plettner, E. (2004). Developmental expression patterns of four chemosensory protein genes from the Eastern spruce budworm, Choristoneura fumiferana. Journal of Insect Molecular Biology. Accepted.

3.1 Introduction

Two CSPs from orthopteroid species were localized specifically to the lymph that surrounds sensory neurons housed within sensilla (Angeli et al., 1999; Monteforti et al., 2002). Based upon their specific localization, 3-D structures, and in vitro binding studies, chemosensory and odorant binding proteins are thought to transport hydrophobic stimuli across the hydrophilic sensillum lymph to sensory neurons (Bohbot et al., 1998; Angeli et al., 1999; Nagnan-Le Meillour et al., 2000; Jacquin-Joly et al., 2001; Briand et al., 2002; Lartigue et al., 2002; Monteforti et al., 2002; Campanacci et al., 2003). However, specificity of gene expression is one key difference that distinguishes OBPs from CSPs. In contrast to the sensillum specific expression of most OBPs, the majority of CSP genes are expressed broadly in tissues that do not contain olfactory or gustatory neurons. CSPs are expressed generally in the thorax, abdomen, leg and head tissues of two lepidopteran species (Picimbon et al., 2000; Picimbon et al., 2001), the pheromone gland of female moths (Jacquin-Joly et al., 2001) and in regenerating cockroach legs (Kitabayashi et al., 1998). Recent studies indicate that CSPs from the Diptera may be immune responsive; DmelpebIII and Dmelphk3 expression increased when adult flies were challenged with virus and bacteria, respectively (Sabatier et al., 2003). Similarly, transcription of Agamir7 increased six hours after adult mosquitoes were challenged with lipopolysaccharide, a bacterial compound known to induce the immune system (Oduol et al., 2000). Therefore, the function of CSPs does not appear to be restricted to gustatory and olfactory sensilla; determining the relationship between their function in sensillum lymph and non-olfactory tissues is an interesting challenge that remains.

Within the Diptera, the CSP gene family is relatively small as compared to the OBP gene family. The D. melanogaster genome has four CSP genes, and the A. gambiae genome has seven (Chapter 2, Wanner et al., 2004). In contrast, 51 D. melanogaster and 57 A. gambiae genes have been identified with amino acid homology to OBPs (Graham and Davies, 2002; Hekmat-Scafe et al., 2002; Xu et al., 2003). Additional CSP genes may be found in the Lepidoptera; to date, eight cDNAs encoding CSPs have been identified from both M. sexta and B. mori via EST projects (Robertson et al., 1999; SilkBase, WWW.ab.a.u-tokyo.ac.jp/silkbase/). In this chapter, we characterize the non-sensory expression of four members of the CSP gene family from the Eastern spruce budworm (ESB). Specifically, the sex, tissue and development specific expression of each gene was determined by Northern blotting.

3.2 Results

3.2.1 Both conserved and divergent members of the CSP gene family are found in the Eastern spruce budworm

The nomenclature used for this gene family is varied; therefore, each sequence will be referred to as named in its original publication, or, where none exists, by its GenBank accession number. A list of the accession numbers for sequences used in Figure 3.2 can be found in Table 2.1. A unique CSP clone, CfumAY701858 (Figure 3.1A), was identified after a tblastn search of random sequences from a C. fumiferana larval cDNA library. A total of four CSP sequences have now been cloned from the ESB; three others were reported in Chapter 2. The complete sequence of CfumAY426540.2 was determined herein by 5' RACE (Figure 3.1A).
CfumAY701858 and CfumAY426540.2 encode ORFs that conceptually translate into proteins with 120 and 127 amino acid residues, respectively. Both protein sequences have features characteristic of CSPs, including four conserved Cys residues with a CX6CX18CX2C spacing pattern and putative N-terminal signal peptides (approximately the first 15 amino acid residues). CfumAY426540.2 has 78% amino acid identity with HvirCSP2 (GenBank# AAM77041), while CfumAY701858 has 65% identity with a CSP translated from ce2366 (Silkbase, www.ab.a.u-tokyo.ac.jp/silkbase), a clone from B. mori.

A total of 30 CSP sequences have been identified from lepidopteran species, most of which resolve into at least four homologous subgroups when analyzed using an un-rooted neighbor joining tree (Figure 3.2; Jacquin-Joly et al., 2001; Picimbon et al., 2001; Chapter 2). In addition, CSP sequences from each subgroup can be characterized by the retention of, or divergence from, several amino acid motifs that are highly conserved across diverse insect orders (Figure 3.1B; Chapter 2). CfumAY426540.2 forms a subgroup with three other lepidopteran CSPs (Figure 3.2), all of which maintain the conserved motifs illustrated in Figure 3.1B. CfumAY701858, CfumAY426538 and CfumAY426539 all exhibit varying levels of divergence from the highly conserved amino acid motifs (Figure 3.1B). CfumAY426538 forms a subgroup with MsexSAP6 and BmorAU000875, both of which diverge from the conserved motifs (Chapter 2). CfumAY701858 and CfumAY426539 do not fall within defined subgroups, but each pairs most closely with a divergent CSP from B. mori (Figure 3.2) (CfumAY701858 with Bmorce2366; CfumAY426539 with BmorAU004850). As more CSP sequences are reported from the Lepidoptera, they may resolve into additional subgroups. It is worth noting that all but one of the CSPs characterized as divergent were randomly identified by EST projects; many of the CSPs that maintain the conserved motifs were identified by [...]

CfumAY701858
CGGGCTGCAGGAATTCGGCACGAGAACAAAATGAGGAGCTGTATTGTGTTCGCTTGCCTGCTCGTTTCG
M R S C I V F A C L L V S
GTGTTTGCTGCTGAGAAATACAATTCCAAATATGACAACTTTGATGTGGAGACCCTGATCTCAAACGAC
V F A A E K Y N S K Y D N F D V E T L I S N D
AGGCTACTGAAGTCGTATGTCAACTGCTTCTTGGACAAAGGCCGGTGCACTCCAGAAGGCACAGATTTT
R L L K S Y V N C F L D K G R C T P E G T D F
AAAAAAACCCTTCCCGACGCCGTCGAGACCACTTGTGCGAAGTGCACAGACAAACAGAAGACAAACATC
K K T L P D A V E T T C A K C T D K Q K T N I
AAAAAAGTGATCAAGGCCATCCAGACCAGACATCCTAGGCAATGGGACGAACTCGTGAAGAAGAACGAT
K K V I K A I Q T R H P R Q W D E L V K K N D
CCTACCGGCAAACACATCGTGAATTTCAATAAGTTCATTGAGAGCTAAATTAGGAAACAATAAATTAAG
P T G K H I V N F N K F I E S *
TTAGGCTAAGTATCTGTTGTGCAAACAATCGAAAAAAAAACAAAATTTGCTTTTTAAACAGACGAGGGA
ATTGTCGTTAATGGCAGTTTAGACCTATTCGAGACGGTTACAAGCGTAATCCCGGCACAACTAAATTCT
ATGTCGACCTATTCGTTTGAATACCTAATAGGTACTTTACGCTTGTTATTGTCTTGTCTGGAACCACTG
TACGGAGAGTCTTATTTATTGTAACTCCATGACATAGACTGATAATGTAATAAAAGTATCCGTACTAA

CfumAY426540.2
GATCACAGCATAAACAAACGAGCTATACAACGATAATCATGAAACTCCTACTGGTCTCACTTCTGGCG
M K L L L V S L L A
TGCCTGGTGGCCGTAGCTTTTGGACGTCCGCAGAACAAATACACTGACAAATGGGACAACATCAACATC
C L V A V A F G R P Q N K Y T D K W D N I N I
GACGAAATCCTGGAATCCCAGCGNCTCCTCAAAGCCTACATAGACTGCCTTCTGGACAAAGGCCGTTGT
D E I L E S Q R L L K A Y I D C L L D K G R C
ACTCCTGACGCGAAAACTCTTAAAGATACATTGCCTGATGCTTTGGAAAATGAGTGCAACAAATGCACG
T P D A K T L K D T L P D A L E N E C N K C T
GAAAAACAGAAGTCGGGATCAGATAAGGTTATCAGGCATCTTGTTAACAAACGTCCTGAAATGTGGAAG
E K Q K S G S D K V I R H L V N K R P E M W K
GAGCTGTCGGTGAAGTACGACCCTGATCATATCTATGAAGGCAGGTACAAGGACCAGATTGAGAAGATC
E L S V K Y D P D H I Y E G R Y K D Q I E K I
AAGGCGTAGGAAGGGGAACATTCCAAGGAGATTTTGAGATTGGTTTTCCTTGTTTGTAATAAACGTCAT
K A *
TTTACCATCGAAAAAAAAAAA

Figure 3.1. A) Nucleotide sequence of two cDNAs cloned from the Eastern spruce budworm, and the conceptually translated amino acid sequence. Start and stop codons, and the four conserved Cys residues, are underlined.

[Figure 3.1B: alignment of the conserved motif regions (including YTTKYDNVDLDEIL and DGKELK) for representative CSPs from the Orthoptera, Phasmatodea, Dictyoptera, Hymenoptera, Diptera (Nematocera and Brachycera) and Lepidoptera, together with CfumAY426540.2, CfumAY426539, CfumAY426538 and CfumAY701858.]

[Northern blot panel labels: CfumAY426540, 0.55 kb; CfumAY426538, 0.8 kb; CfumAY701858, 0.7 kb; CfumAY426539, 2.2 kb.]

Figure 3.5. CSP gene expression, as determined by Northern blotting, during the development of the Eastern spruce budworm from early 5th instar larvae through to pupation. Each lane was loaded with 5 µg of total RNA. Ribosomal bands are illustrated at the bottom of the figure (in the same order as the Northern blot) as an indication of total RNA loaded. Head capsule is abbreviated HC.

Figure 3.6. Expression of four CSP genes at 8 hour intervals during the development of the last instar of the Eastern spruce budworm, determined by Northern blotting. The 120 h time point was the last point for male larvae, which developed more quickly than the females. Thus, the male 128 h and 144 h time points are blank. Each lane was loaded with 5 µg of total RNA; CfumAY426539 was repeated using 15 µg of total RNA per lane, and the male 16 h, 40 h, 72 h and 96 h time points are blank because RNA samples were not available. Ribosomal bands are illustrated at the bottom of the figure (in the same order as the Northern blot) as an indication of total RNA loaded.

[...] hybridized to new CSP sequences that have not yet been cloned and sequenced.
However, the secondary signals could also represent alternative splice products.

3.2.3 CSP transcripts can be positively or negatively regulated by the ecdysteroid cascade

The results from Northern blots used to characterize the developmental expression of CSP genes in larvae indicated that some were either more or less abundant in stages undergoing a molt, indicating they may be regulated by ecdysteroids. To assess this observation, we induced new 6th instar ESB larvae into a premature molt. Within 16 h after treatment with an ecdysteroid agonist (100 ng oral dose of tebufenozide), larvae stopped feeding and head capsule slippage (HCS) was apparent. By 24 h after treatment, the old head capsule had slipped approximately halfway off, and the larvae remained in an arrested state of apolysis, through to 72 h post treatment, by which time the larvae had become moribund. These symptoms are typical of ecdysteroid agonists that induce lethal molts (Retnakaran et al., 1997; Dhadialla et al., 1998; Retnakaran et al., 2003). Normally, a peak in ecdysteroid levels triggers the beginning of a larval molt (apolysis), and the decline to basal levels initiates the release of eclosion hormone and the completion of the molt (ecdysis) (Retnakaran et al., 2003; Riddiford et al., 2003). Tebufenozide binds to the ecdysteroid receptor complex and initiates the molting cascade, but because the tebufenozide is not cleared, and the titer remains high, the molting larvae remain in an arrested state of apolysis (Retnakaran et al., 2003).

CSP transcripts that were either more, or less, prevalent in molting stages were also up- or down-regulated during an induced premature molt. CfumAY426539 transcripts were induced 12 to 18 h after treatment with tebufenozide, but were then absent 36-72 h after treatment; no transcript was detected in control larvae (Figure 3.7).
CfumAY426539 was detected at very specific stages such as HCS, and in the very late stages of the last instar just prior to prepupation (which may correspond temporally to an ecdysteroid peak that triggers pupation). CfumAY701858 transcripts were not induced until 18 h after treatment, but then continued to be expressed 72 h post treatment (Figure 3.7). This is consistent with the developmental profile, in which peak expression occurred either during HCS or at the completion of ecdysis (white head capsule, WHC) and continued at lower levels during subsequent 6th instar larval development. While CfumAY426539 and CfumAY701858 transcripts were up-regulated during the induced molt, CfumAY426538 appeared to be down-regulated at the 18 h time point, as compared to control larvae (Figure 3.7). In vivo, CfumAY426538 transcripts were less prevalent during 5th instar HCS and, after peaking during late 6th instar stages, declined again immediately prior to prepupation, a time point that may correspond temporally to an ecdysteroid peak that induces pupation.

Figure 3.7. CSP gene expression during a premature molt. Newly molted 6th instar ESB larvae were induced to molt again by an oral dose of the ecdysteroid agonist tebufenozide. Transcript levels were assessed by Northern blotting 12, 18, 36 and 72 h after treatment; 5 µg of total RNA was loaded into each lane, with the exception of CfumAY426539, where 15 µg of total RNA was used. Ribosomal bands are illustrated at the bottom of the figure (in the same order as the Northern blot) as an indication of total RNA loaded.

3.3 Discussion

It has been hypothesized that chemosensory proteins transport hydrophobic stimuli across the sensillum lymph of chemosensory organs, a function analogous to that of odorant binding proteins.
This hypothesis was supported by several studies: 1) Native CSPs were selectively isolated and identified from insect sensory organs, and localized specifically to the sensillum lymph (Angeli et al., 1999; Nagnan-Le Meillour et al., 2000; Monteforti et al., 2002); 2) Ligand binding studies demonstrated that short to medium length fatty acid derivatives, including pheromone components, could bind to CSPs (Bohbot et al., 1998; Nagnan-Le Meillour et al., 2000; Jacquin-Joly et al., 2001; Briand et al., 2002; Lartigue et al., 2002; Campanacci et al., 2003); and 3) The crystal and NMR structures of MbraCSPA6 (Lartigue et al., 2002; Mosbah et al., 2003) revealed a small, soluble, globular protein composed of six amphipathic α-helices surrounding an internal hydrophobic binding pocket, structural features shared with OBPs (although the two families do not share amino acid sequence homology).

However, the common expression of CSP genes in many non-sensory tissues (classically defined as tissue lacking olfactory and gustatory neurons) indicates that they have additional, non-olfactory functions. CSPs or their transcripts have been identified in regenerating cockroach legs (Kitabayashi et al., 1998), moth pheromone glands (Jacquin-Joly et al., 2001), fly haemolymph (Sabatier et al., 2003), and generally in the thorax, abdomen, leg and head tissue of moths (Picimbon et al., 2000; Picimbon et al., 2001). cDNA libraries constructed from tissues such as the Apis mellifera brain (Honey Bee Brain EST Project, WWW.titan.biotec.uiuc.edu/bee/honeybee/_project.htm), B. mori imaginal disks (Silkbase, WWW.ab.a.u-tokyo.ac.jp/silkbase/) and D. melanogaster 0-24 hour old embryos (GenBank Accession # BI363615) have included CSPs. Sabatier et al. (2003) constructed transgenic D.
melanogaster that expressed green fluorescent protein (GFP) under the control of the DmelPebIII upstream regulatory sequence; GFP was detected in many tissues, including the larval gut, ring gland and testes, and the adult fly legs, wing veins and gut. It should be noted, however, that chemosensation and signal transduction can occur in these tissues, although not the classical senses of taste and smell associated with sensory neurons.

Each of the four CSP genes identified from the ESB exhibited a distinct pattern of expression during development, inconsistent with a classic sensory function associated with olfactory and gustatory neurons. However, we used whole insects to assess global levels of transcription; under these conditions, up-regulation in one tissue could be masked by simultaneous down-regulation in another. CfumAY426540.2 was detected only in adult moths; CfumAY426538 and CfumAY426539 were detected in all stages except the adults; and CfumAY701858 was detected in all stages. CfumAY426539 and CfumAY701858 transcripts were relatively more abundant in stages undergoing ecdysis, such as the 5th to 6th instar molt. In contrast to this trend, the expression of general odorant binding protein 2 (gobp2) in larval antennae was down-regulated during a molt, corresponding temporally to the rise in ecdysteroids (Vogt et al., 2002). Furthermore, CfumAY426538, CfumAY426540.2 and CfumAY701858 were generally expressed in all three segments: the head, thorax and abdomen. The thorax and abdomen of caterpillars lack any known olfactory or gustatory organs.
This contrasts with the highly specific expression of OBPs in the support cells of olfactory and gustatory sensilla (Vogt and Riddiford, 1981; Vogt et al., 1991; Steinbrecht et al., 1992; Galindo and Smith, 2001; Shanbhag et al., 2001; Vogt et al., 2002); in the most extreme case, β-galactosidase expression under the control of the upstream regulatory sequence of DmelOBP57d&e was detected in only four cells found on each of the six tarsi of adult transgenic flies (Galindo and Smith, 2001). The differential expression of the four ESB CSP genes may reflect the fact that each protein belongs to a different homologous subgroup within the Lepidoptera (Figure 3.2), and each is presumably under different functional constraints or selection pressures.

The transient up- or down-regulation of CSP transcripts during a natural molt (or a premature molt induced by an ecdysteroid agonist), in patterns that did not overlap, suggests that members of this gene family may each play a different role in development. An ecdysteroid peak in the presence of juvenile hormone initiates a larval molt (apolysis) through the temporal and transient regulation of several transcription factors in a cascading fashion (Riddiford et al., 2003). CfumAY426538 transcript appeared to be down-regulated 18 h post treatment with an ecdysteroid agonist; CfumAY426539 was transiently up-regulated within 12 h of treatment; and CfumAY701858 was up-regulated at 18 h post treatment, and its expression then continued at lower levels. Expression of a CSP gene (DmelPhk-3) in cultured S2 cells was repressed by treatment with 20-hydroxyecdysone (Sabatier et al., 2003). Supporting a role in development, CSP genes are commonly expressed in developing tissues, such as: the imaginal disks of B. mori (Silkbase, WWW.ab.a.u-tokyo.ac.jp/silkbase/) and D.
melanogaster (WWW.ncbi.nlm.nih.gov/geo/gds/gds_browse.cgi?gds=192); early pupal stages undergoing metamorphosis (Sabatier et al., 2003; Figures 3.3 and 3.5); stage 13-16 fruit fly embryos (RE09339, Berkeley Drosophila Genome Project, Patterns of gene expression in Drosophila embryogenesis, http://www.fruitfly.org/cgi-bin/ex/insitu.pl); and regenerating cockroach legs (Kitabayashi et al., 1998). In addition, DmelPhk-3 was identified as a target of the dorsal transcription factor, which is involved in embryo and tissue development (Stathopoulos et al., 2002).

The function of CSPs in developing insect tissues remains uncertain, and in vitro ligand binding studies indicate they can bind a variety of chemicals. CSPs from A. mellifera and M. brassicae generally bound 14-18 carbon fatty acid derivatives (Briand et al., 2002; Lartigue et al., 2002), while two CSPs from orthopteroid species were able to bind larger, bulkier ligands, such as the fluorescent reporter 1-NPN and cinnamaldehyde derivatives (Ban et al., 2002; 2003). The crystal structure of MbraCSPA6 was solved in complex with three molecules of 12-bromo-dodecanol, which resulted in significant expansion of the binding pocket (Campanacci et al., 2003). The configuration of the disulfide bonds of CSPs is thought to allow for greater flexibility (Tegoni et al., 2004). If the binding specificity of CSPs is similarly broad in vivo, a wide variety of physiologically relevant compounds could be candidate ligands.

Constitutive gene expression does not necessarily exclude expression in sensilla. DmelOBP19D (PBPRP2) was found to occur in the outer cavity of double walled sensilla, in epidermal cells, and in the corresponding subcuticular space (Park et al., 2000; Shanbhag et al., 2001). If individual CSPs are found to occur both in the sensillum lymph and constitutively in non-sensory tissues, determining the functional relationship in the different tissues will be an interesting challenge.
DmelOBP19D was not found throughout the epidermis; rather, it was associated with the antennae and maxillae (Park et al., 2000). It was suggested that DmelOBP19D may transport insoluble cuticle components or may be involved in cleaning deleterious compounds from the antennal surface. PameP10, isolated from regenerating cockroach legs, was associated with the developing epidermis (Kitabayashi et al., 1998), and EcalCSP1 was isolated from the subcuticular layer of a phasmid (Marchese et al., 2000). These results could help explain why CSPs are commonly found in organs such as the antennae, legs and wings.

Recent experimental results point towards a potential role for CSPs in the insect immune response. Levels of DmelPebIII were found to increase in the adult fly haemolymph 48 h post infection with Drosophila C virus; however, there was no detectable change in transcript level, and no induction was detected in flies treated with buffer or challenged with bacteria (Sabatier et al., 2003). In contrast, increased levels of DmelPhk-3 transcript were detected 3-6 h post infection with bacteria, but no change was observed for DmelOS-D or DmelPebIII. DmelPebIII was up-regulated in S2 cells treated with bacterial lipopolysaccharide, a compound known to stimulate the immune system, as was transcription of Agamirl six hours after adult mosquitoes were challenged with lipopolysaccharide (Oduol et al., 2000). Functions ranging from tissue repair to the recognition of invading pathogens were suggested (Sabatier et al., 2003). It is interesting to note that DmelOS-D was found to interact with D. melanogaster eukaryotic initiation factor 4E (DmeleIF-4E) with high confidence (Giot et al., 2003), and that DmeleIF-4E has been associated with the insect immune response.

Many CSP sequences from diverse insect orders, representing more than 300 million years of evolutionary divergence, maintain several highly conserved amino acid domains (Figure 3.1B; Chapter 2), suggesting some level of common function.
Emerging data that indicate a role for CSPs in development and/or the immune response could provide a common functional link. Some homologous CSP subgroups can be characterized by divergence from the highly conserved amino acid motifs (Figures 3.1B and 3.2; Chapter 2), and these may also diverge functionally. It is interesting to note that CfumAY426540.2 (conserved motifs) was detected only in adult moths, while the two most divergent ESB CSPs (CfumAY426538 and CfumAY426539) were detected in developing larvae and pupae, and not in adult moths. Comparative characterizations of conserved and divergent members of the CSP gene family may yield insights into their function.

3.4 Methods and Materials

3.4.1 Insect rearing, staging, and collections

Second instar C. fumiferana larvae were supplied from a non-diapausing colony continuously maintained by the Insect Production Unit at the Great Lakes Forestry Centre, Sault Ste. Marie, Ontario, Canada. Larvae were reared on artificial diet at 23°C and a 16:8 LD photoperiod, through to pupation and adult emergence. Insects were collected for three independent experiments.

First, 3rd and 6th instar larvae, and one to three day old pupae and adult moths, were collected to determine general developmental gene expression patterns in whole insects. Male larvae were identified by the presence of darkly pigmented testes that are clearly visible during the 4th instar, and were reared separately from female larvae. Two independent samples were collected from staged insects. Insects were staged by collecting 5th instar larvae (males and females reared separately) that were beginning to molt to 6th instars, as indicated by head capsule slip (HCS). Larvae that had molted to the 6th instar overnight were collected the next day (< 12 h old) and maintained in the growth chamber. Additional collections were made after 48 and 96 h in the growth chamber (between 1100 h and 1300 h, to control for variation in expression due to circadian rhythms).
Similarly, new pupae (< 12 h old) were staged, and collected again after 72 h. A subset of the 6th instar larvae collected after 48 h in the growth chamber, and 1 to 3-day-old adult moths, were retained and the head, thorax and abdomen segments separated by dissection. Larval heads were excised from the thorax between the head capsule and the prothoracic shield; the larval thorax was then cut after the last segment bearing true legs, leaving the remaining abdomen portion. Adult moth heads, bearing the antennae and gustatory organs, were removed from the thorax using tweezers, and the thorax bearing the legs and wings was cut from the abdomen.

In a second, more precise staging experiment, 6th instar larvae were collected immediately after completing the molt from 5th instar larvae, as indicated by white head capsules that had not started melanization. Three collections were made over a two-day period, and the male and female larvae were reared separately in a growth chamber, three individuals per diet cup. Subsequently, by collecting larvae each day between 1100 h and 1300 h, samples were obtained at 8 h time intervals, from newly molted 6th instars through to pupation. At 23°C and a 16:8 LD photoperiod, 6th instar male larvae began pupating 120-128 h after molting to the 6th instar; female larvae began pupating approximately 16-24 h after the males. In both sexes, pupation was complete within 16-24 h. Additional insect samples were collected at distinct developmental stages, including: 1) 5th instar HCS (approximately one-half the way over the new 6th instar head capsule), 2) prepupae, and 3) newly formed "green" pupae prior to melanization. In all cases, samples were immediately frozen in liquid nitrogen and stored at -80°C.

3.4.2 Induction of a premature molt

Fifth instar ESB larvae undergoing a molt to the final 6th instar (as indicated by HCS) were collected and transferred to cups without diet.
The next day, larvae that had completed the molt were placed singly into 0.5 ml microcentrifuge tubes containing a 1 µl aliquot of 100 mM sucrose and 100 mM Tris(hydroxymethyl)aminomethane (Tris) pH 8.0. The aliquot was colored with red food dye, and treatments were amended with 100 parts per million (ppm) tebufenozide (formulated as Mimic®). The microcentrifuge tubes, oriented upside down in a cardboard box, were placed in a dark room with an overhead light source. Within 0.5 to 1.0 h, 80% of the larvae crawled up to the sucrose droplet and consumed it; these larvae were then transferred to cups with artificial diet. Red food coloring visible in the larval gut and frass confirmed ingestion of the dose. This assay is essentially the same as that reported in Hughes et al. (1986) and van Frankenhuyzen et al. (1997). Control and treated larvae were collected 12, 18, 36 and 72 h post treatment, frozen in liquid nitrogen, and stored at -80°C for RNA isolation and Northern blot analysis.

3.4.3 Rapid amplification of cDNA ends (RACE)

CfumAY426540 was reported in Wanner et al. (2004) and is a partial cDNA clone. The complete cDNA sequence was obtained by 5' RACE. First strand cDNA was synthesized from 5 µg of total RNA (isolated from adult male moths) using Superscript II reverse transcriptase (Gibco) and an oligonucleotide primer specific to CfumAY426540 (5' CAGTTCCCCTTCCTACGCCTTGAT 3') following the manufacturer's protocol. Heat-inactivated 1st strand cDNA was purified using Microcon YM50 spin columns (Millipore) and 3' polyadenylated for one hour at 37°C with TdT (Invitrogen) following the manufacturer's protocol. An anchored oligo dT primer (5' GCGCCGCGGCCGCTi , [CT][AGC] 3') was used to synthesize 2nd strand cDNA by primer extension using a 4:1 mixture of Taq and Pfu polymerase (2.5 mM MgCl2, 0.125 mM dNTP mix and 1x Taq buffer [Gibco], in 10 µl).
The 2nd strand reaction mix was heated to 94°C for 3 min, 45°C for 3 min and 72°C for 20 min. Aliquots of double stranded (ds) cDNA (1.4 µl) were used as template in a 50 µl Taq PCR reaction using a truncated dT primer (5' GCGCCGCGGCCGCTT 3') combined with a second, internal, gene specific primer (5' GTCCTTGTACCTGCCTTCATAG 3') (2.5 mM MgCl2, 0.1 mM dNTP mix and 0.2 µM of each primer in 1x Taq buffer). PCR reactions were amplified using a Geneamp 2400 thermo-cycler (Perkin-Elmer) set at one cycle of 94°C for 60 s, 25 cycles of (94°C for 30 s; 50-68°C gradient for 30 s; and 72°C for 60 s), and one final cycle of 72°C for 7 min. The PCR product was sequenced using the gene specific primer 5' GTCCTTGTACCTGCCTTCATAG 3' and the Taq Dye Deoxy Terminator Cycle Sequencing Kit combined with an automatic DNA sequencer (model 310, Applied Biosystems).

3.4.4 Northern blot analysis

Total RNA was extracted from 50-100 mg insect samples in guanidinium isothiocyanate and purified by acid phenol/chloroform extraction (Sambrook et al., 1989), and then further purified using RNEasy mini columns (Qiagen). RNA concentrations were measured by absorption at a wavelength of 260 nm (A260) and RNA was analyzed on a 1% agarose gel. Five µg aliquots of total RNA, stored at -80°C, were mixed with 0.2 µg of ethidium bromide and 20 µl of loading buffer (50% formamide, 1x MOPS [40 mM morpholinopropanesulfonic acid, 10 mM sodium acetate and 1 mM ethylenediaminetetraacetic acid (EDTA); pH 7.2], 16% formaldehyde, 13.3% glycerol and bromophenol blue dye) and incubated at 65°C for 15 min. The RNA samples were resolved on a 1.3% agarose/2.2 M formaldehyde gel using 1x MOPS running buffer, transferred to Zeta-Probe GT nylon membranes (Bio-Rad), and fixed in an oven at 80°C for 2 h.
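The step from an A260 reading to the fixed 5 µg (or 15 µg) of total RNA loaded per lane is a short calculation. The sketch below assumes the standard conversion of about 40 µg/ml of RNA per A260 unit at a 1 cm path length, which the text does not state; the function names, the example reading and the dilution factor are hypothetical and chosen only for illustration.

```python
# Sketch: convert an A260 reading into an RNA concentration and the volume
# needed to load a fixed mass of total RNA per lane.
# Assumption (not from the text): 1.0 A260 unit ~ 40 ug/ml RNA at 1 cm path length.
RNA_UG_PER_ML_PER_A260 = 40.0


def rna_conc_ug_per_ul(a260: float, dilution_factor: float = 1.0) -> float:
    """Return the RNA concentration of the undiluted stock in ug/ul."""
    if a260 <= 0 or dilution_factor <= 0:
        raise ValueError("a260 and dilution_factor must be positive")
    ug_per_ml = a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor
    return ug_per_ml / 1000.0  # 1 ml = 1000 ul


def volume_for_mass_ul(target_ug: float, conc_ug_per_ul: float) -> float:
    """Return the stock volume (ul) that contains target_ug of RNA."""
    if target_ug <= 0 or conc_ug_per_ul <= 0:
        raise ValueError("target_ug and conc_ug_per_ul must be positive")
    return target_ug / conc_ug_per_ul


if __name__ == "__main__":
    # Hypothetical reading: A260 = 0.5 measured on a 1:100 dilution of the stock.
    conc = rna_conc_ug_per_ul(0.5, dilution_factor=100)
    print(f"stock concentration: {conc:.2f} ug/ul")
    print(f"volume for 5 ug:  {volume_for_mass_ul(5.0, conc):.2f} ul")
    print(f"volume for 15 ug: {volume_for_mass_ul(15.0, conc):.2f} ul")
```

Keeping the conversion factor as a named constant makes the assumption easy to change if a different extinction value or path length applies to the instrument used.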
Single stranded RNA probes labeled with 32P-UTP were prepared by a T3 or T7 RNA polymerase transcription reaction using linear cDNA clones as template: 2 µg linear plasmid, 0.2 mM ATP, CTP and GTP, 0.01 mM UTP, 1x transcription buffer (Fermentas), 20-50 µCi 32P-UTP, 10 U of RNAse inhibitor (Invitrogen), and 20-30 units of T3 or T7 RNA polymerase (Fermentas), incubated at 37°C for 2 h. Membranes were pre-hybridized for 2-4 h in 5-15 ml of formamide prehybridization buffer (5x SSC, 5x Denhardt's solution, 50% formamide, 0.5% SDS and 100 µg/ml denatured salmon sperm DNA) at 60°C. Single stranded (ss) RNA probes were hybridized to the membranes (> 10^6 counts per min [cpm] / ml) overnight at 60°C, and washed with SSC under stringent conditions (twice each with 2x SSC/0.1% SDS at room temperature, 0.2x SSC/0.1% SDS at room temperature, and 0.1x SSC/0.1% SDS at 70°C, 20-30 min at each wash). The membranes were exposed to a phosphorimager screen overnight and scanned with a Molecular Dynamics Storm. A 0.24 - 9.5 kb ssRNA ladder (Invitrogen) was used as a standard to calculate the size of each transcript, which was reported as an average of all replicated blots.

CfumAY426539 did not yield a signal at 5 µg of total RNA. The Northern blots were repeated using 15 µg per lane where samples were available. In these cases, a ds DNA probe labeled with 32P was prepared from a 292 base pair (bp) PCR product (using the Stratagene random primer labeling kit) amplified from the CfumAY426539 cDNA clone using forward (5' CTCGGCGCAGCAACAGTA 3') and reverse (5' TGGAGCGGTCGGGGTCGTATTT 3') primers. ds cDNA probes, made from 300-500 bp PCR products amplified from cDNA clones, were also used for the Northern blot data in Figure 3.6. The CfumAY701858 transcripts in Figure 3.7 were detected using 15 µg total RNA. In cases where ds cDNA probes were used, the final two washes were conducted at 60°C.

3.5 References

Angeli S., Ceron, F., Scaloni, A., Monti, M.
, Monteforti, G., Minnocci, A., Petacchi, R., and Pelosi P. (1999). Purification, structural characterization, cloning and immunocytochemical localization of chemoreception proteins from Schistocerca gregaria. Eur J Biochem 262: 745-54.
Ban L., Scaloni A., Brandazza A., Angeli S., Zhang L., Yan Y., and Pelosi P. (2003). Chemosensory proteins of Locusta migratoria. Insect Mol Biol 12: 125-34.
Ban L., Zhang, L., Yan, Y., and Pelosi, P. (2002). Binding properties of a locust's chemosensory protein. Biochem Biophys Res Commun 293: 50-4.
Bohbot J., Sobrio, F., Lucas, P., and Nagnan-Le Meillour, P. (1998). Functional characterization of a new class of odorant-binding proteins in the moth Mamestra brassicae. Biochem Biophys Res Commun 253: 489-94.
Briand L., Swasdipan, N., Nespoulous, C., Bezirard, V., Blon, F., Huet, J. C., Ebert, P., and Pernollet, J. C. (2002). Characterization of a chemosensory protein (ASP3c) from honeybee (Apis mellifera L.) as a brood pheromone carrier. Eur J Biochem 269: 4586-96.
Campanacci V., Lartigue, A., Hallberg, B. M., Jones, T. A., Giudici-Orticoni, M. T., Tegoni, M., and Cambillau, C. (2003). Moth chemosensory protein exhibits drastic conformational changes and cooperativity on ligand binding. Proc Natl Acad Sci USA 100: 5069-74.
Dhadialla T. S., Carlson G. R., and Le D. P. (1998). New insecticides with ecdysteroidal and juvenile hormone activity. Annu Rev Entomol 43: 545-69.
Galindo K., and Smith, D. P. (2001). A large family of divergent Drosophila odorant-binding proteins expressed in gustatory and olfactory sensilla. Genetics 159: 1059-72.
Giot L., Bader J. S., Brouwer C., Chaudhuri A., Kuang B., Li Y., Hao Y. L., Ooi C. E., Godwin B., Vitols E., Vijayadamodar G., Pochart P., Machineni H., Welsh M., Kong Y., Zerhusen B., Malcolm R., Varrone Z., Collis A., Minto M., Burgess S., McDaniel L., Stimpson E., Spriggs F., Williams J., Neurath K., Ioime N., Agee M., Voss E.
, Furtak K., Renzulli R., Aanensen N., Carrolla S., Bickelhaupt E., Lazovatsky Y., DaSilva A., Zhong J., Stanyon C. A., Finley R. L., Jr., White K. P., Braverman M., Jarvie T., Gold S., Leach M., Knight J., Shimkets R. A., McKenna M. P., Chant J., and Rothberg J. M. (2003). A protein interaction map of Drosophila melanogaster. Science 302: 1727-36.
Graham L. A., and Davies, P. L. (2002). The odorant-binding proteins of Drosophila melanogaster: annotation and characterization of a divergent gene family. Gene 292: 43-55.
Hekmat-Scafe D. S., Scafe, C. R., McKinney, A. J., and Tanouye, M. A. (2002). Genome-wide analysis of the odorant-binding protein gene family in Drosophila melanogaster. Genome Res 12: 1357-69.
Hughes, P. R., van Beek, N. A. M., and Wood, H. A. (1986). A modified droplet feeding method for rapid assay of Bacillus thuringiensis and baculoviruses in noctuid larvae. J Invert Pathol 48: 187-192.
Jacquin-Joly E., Vogt, R. G., Francois, M. C., and Nagnan-Le Meillour, P. (2001). Functional and expression pattern analysis of chemosensory proteins expressed in antennae and pheromonal gland of Mamestra brassicae. Chem Senses 26: 833-44.
Kitabayashi A. N., Arai, T., Kubo, T., and Natori, S. (1998). Molecular cloning of cDNA for p10, a novel protein that increases in the regenerating legs of Periplaneta americana (American cockroach). Insect Biochem Mol Biol 28: 785-90.
Lartigue A., Campanacci, V., Roussel, A., Larsson, A. M., Jones, T. A., Tegoni, M., and Cambillau, C. (2002). X-ray structure and ligand binding study of a moth chemosensory protein. J Biol Chem 277: 32094-8.
Marchese S., Angeli, S., Andolfo, A., Scaloni, A., Brandazza, A., Mazza, M., Picimbon, J., Leal, W. S., and Pelosi, P. (2000). Soluble proteins from chemosensory organs of Eurycantha calcarata (Insects, Phasmatodea). Insect Biochem Mol Biol 30: 1091-8.
Monteforti G., Angeli S., Petacchi R., and Minnocci A. (2002).
Ultrastructural characterization of antennal sensilla and immunocytochemical localization of a chemosensory protein in Carausius morosus Brunner (Phasmida: Phasmatidae). Arthropod Structure & Development 30: 195-205.
Mosbah A., Campanacci, V., Lartigue, A., Tegoni, M., Cambillau, C., and Darbon, H. (2003). Solution structure of a chemosensory protein from the moth Mamestra brassicae. Biochem J 369: 39-44.
Nagnan-Le Meillour P., Cain, A. H., Jacquin-Joly, E., Francois, M. C., Ramachandran, S., Maida, R., and Steinbrecht, R. A. (2000). Chemosensory proteins from the proboscis of Mamestra brassicae. Chem Senses 25: 541-53.
Oduol F., Xu, J., Niare, O., Natarajan, R., and Vernick, K. D. (2000). Genes identified by an expression screen of the vector mosquito Anopheles gambiae display differential molecular immune response to malaria parasites and bacteria. Proc Natl Acad Sci USA 97: 11397-402.
Park S. K., Shanbhag S. R., Wang Q., Hasan G., Steinbrecht R. A., and Pikielny C. W. (2000). Expression patterns of two putative odorant-binding proteins in the olfactory organs of Drosophila melanogaster have different implications for their functions. Cell Tissue Res 300: 181-92.
Picimbon J. F., Dietrich K., Krieger J., and Breer H. (2001). Identity and expression pattern of chemosensory proteins in Heliothis virescens (Lepidoptera, Noctuidae). Insect Biochem Mol Biol 31: 1173-81.
Picimbon J. F., Dietrich K., Breer H., and Krieger J. (2000). Chemosensory proteins of Locusta migratoria (Orthoptera: Acrididae). Insect Biochem Mol Biol 30: 233-41.
Retnakaran A., Krell P., Feng Q., and Arif B. (2003). Ecdysone agonists: mechanism and importance in controlling insect pests of agriculture and forestry. Arch Insect Biochem Physiol 54: 187-99.
Retnakaran A., Smith, L. F. R., Tomkins, W. L., Primavera, M. J., Palli, S. R., Payne, N., and Jobin, L. (1997).
Effect of RH-5992, a nonsteroidal ecdysone agonist, on the spruce budworm, Choristoneura fumiferana (Lepidoptera: Tortricidae): Laboratory, greenhouse and ground spray trials. Can Entomol 129: 871-885.
Riddiford L. M., Hiruma K., Zhou X., and Nelson C. A. (2003). Insights into the molecular basis of the hormonal control of molting and metamorphosis from Manduca sexta and Drosophila melanogaster. Insect Biochem Mol Biol 33: 1327-38.
Robertson H. M., Martos, R., Sears, C. R., Todres, E. Z., Walden, K. K., and Nardi, J. B. (1999). Diversity of odourant binding proteins revealed by an expressed sequence tag project on male Manduca sexta moth antennae. Insect Mol Biol 8: 501-18.
Sabatier L., Jouanguy E., Dostert C., Zachary D., Dimarcq J. L., Bulet P., and Imler J. L. (2003). Pherokine-2 and -3. Eur J Biochem 270: 3398-407.
Sambrook J., Fritsch, E. F., and Maniatis, T. (1989). "Molecular cloning: A laboratory manual, 2nd edition," Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.
Shanbhag S. R., Hekmat-Scafe, D., Kim, M. S., Park, S. K., Carlson, J. R., Pikielny, C., Smith, D. P., and Steinbrecht, R. A. (2001). Expression mosaic of odorant-binding proteins in Drosophila olfactory organs. Microsc Res Tech 55: 297-306.
Stathopoulos A., Van Drenth, M., Erives, A., Markstein, M., and Levine, M. (2002). Whole-genome analysis of dorsal-ventral patterning in the Drosophila embryo. Cell 111: 687-701.
Steinbrecht, R. A., Ozaki, M., and Ziegelberger, G. (1992). Immunocytochemical localization of pheromone-binding protein in moth antennae. Cell Tissue Res 270: 287-302.
Tegoni M., Campanacci, V., and Cambillau, C. (2004). Structural aspects of sexual attraction and chemical communication in insects. Trends Biochem Sci 29: 257-64.
Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F., and Higgins, D. G. (1997). The ClustalX windows interface: Flexible strategies for multiple sequence alignment aided by quality analysis tools.
Nucleic Acids Res 25: 4876-4882.
van Frankenhuyzen, K., Gringorten, L., Dedes, J., and Gauthier, D. (1997). Susceptibility of different instars of the spruce budworm (Lepidoptera: Tortricidae) to Bacillus thuringiensis var. kurstaki estimated with a droplet-feeding method. J Econ Entomol 90: 560-565.
Vogt, R. G., Prestwich, G. D., and Lerner, M. R. (1991). Odorant-binding-protein subfamilies associate with distinct classes of olfactory receptor neurons in insects. J Neurobiol 22: 74-84.
Vogt, R. G., and Riddiford, L. M. (1981). Pheromone binding and inactivation by moth antennae. Nature 293: 161-163.
Vogt R. G., Rogers, M. E., Franco, M. D., and Sun, M. (2002). A comparative study of odorant binding protein genes: differential expression of the PBP1-GOBP2 gene cluster in Manduca sexta (Lepidoptera) and the organization of OBP genes in Drosophila melanogaster (Diptera). J Exp Biol 205: 719-44.
Wanner, K. W., Willis, L. G., Theilmann, D. A., Isman, M. B., Feng, Q., and Plettner, E. (2004). Analysis of the os-d-like gene family. J Chem Ecol 30: 889-911.
Xu P. X., Zwiebel, L. J., and Smith D. P. (2003). Identification of a distinct family of genes encoding atypical odorant-binding proteins in the malaria vector mosquito, Anopheles gambiae. Insect Mol Biol 12: 549-60.

CHAPTER FOUR
Functional characterization of a divergent chemosensory protein from the Eastern spruce budworm, Choristoneura fumiferana

4.1 Introduction

The crystal and NMR structures of M. brassicae CSPA6 have been solved (Lartigue et al., 2002; Mosbah et al., 2003), revealing a globular structure composed of six amphipathic α-helices that surround an internal, hydrophobic binding pocket (Figure 2.6). Competitive ligand binding assays using fluorescent reporters have been used to characterize the binding specificity of several CSPs (referenced in Table 1.2). CSPs from A. mellifera and M.
brassicae bound to C12-C18 chain length fatty acid derivatives with binding constants in the 0.2 to 1.2 µM range, with little discrimination between ligands (Briand et al., 2002; Campanacci et al., 2003). In contrast, two orthopteran CSPs preferentially bound bulkier compounds, such as cinnamaldehyde derivatives, over fatty acid derivatives, but the binding was weaker, with binding constants ranging from 9 to 40 µM (Ban et al., 2002; 2003). The crystal structure of MbraCSPA6 was solved with three molecules of 12-bromododecanol in the binding pocket, resulting in a tripling of the binding pocket volume and drastic conformational changes (Campanacci et al., 2003). If this phenomenon is not an artifact of crystallization, it may explain the broad binding specificity of CSPs, which could accommodate numerous classes of physiologically relevant chemicals.

Gene expression results presented in Chapter 3 indicate that CfumAY426538 may function in the development of the Eastern spruce budworm. The function of CSPs does not appear to be restricted to gustatory and olfactory sensilla; determining the relationship between their function in sensillum lymph and in non-olfactory tissues remains an interesting challenge. In this chapter I used three approaches to characterize the non-sensory function of CfumAY426538. First, protein expression patterns were characterized during the development of the ESB. Next, I used a technique of post-transcriptional gene silencing (PTGS), termed RNA interference (RNAi) (Fire et al., 1998), in an attempt to reduce in vivo transcript and protein levels of CfumAY426538. Finally, I used the fluorescent reporter 1-NPN to characterize the ligand binding specificity of CfumAY426538. Many binding studies have been designed from the paradigm of an olfactory function, where pheromones and plant volatiles were selected as putative ligands.
In this study, I screened several ligands of physiological relevance to the insect for binding.

4.2 Results

4.2.1 CfumAY426538 protein expression indicates a function in preparation for pupation

Western blot analysis detected CfumAY624538 protein during late stages of the last instar and at the prepupal stage (Figure 4.1). Signals were not detected during the 5th to 6th instar, and this corresponds well to the transcript levels characterized in Chapter 3. In both cases, gene and protein expression begin at low levels during early time points of developing 6th instar larvae, and peak during the later time points, just prior to pupation. CfumAY624538 transcripts decline at the last time points prior to the prepupal stage; this declining trend is reflected in the expression of CfumAY426538 protein in male larvae, but not in female larvae (Figure 4.1). In female larvae, protein levels remain high through to the prepupal and green pupal stages.

Figure 4.1. Expression of CfumAY426538 protein at 8 h time intervals during the development of the last instar of ESB larvae (male and female panels), determined by Western blot analysis. Males began pupating after 120 h; therefore there are no male samples at the 128 h and 144 h time points, and male prepupal samples were not available for analysis. Total protein concentrations were determined using the Bradford assay; 15 µg of total protein was loaded into each lane. HCS = head capsule slip, WHC = white head capsule.

4.2.2 CfumAY426538 binds to the fluorescent reporter 1-NPN, but not 1-AMA

Two fluorescent techniques have been commonly used to study the binding specificity of CSPs and OBPs; one technique uses fluorescent reporters such as 1-NPN and 1-AMA, while the other measures the intrinsic fluorescence of the conserved tryptophan residues.
When excited at wavelengths of 337 and 298 nm, respectively, 1-NPN and 1-AMA exhibit weak fluorescence in the 400-550 nm range (Campanacci et al., 2001; Ban et al., 2002; 2003). When the fluorophore is equilibrated with an equimolar concentration of binding protein, the emission spectrum of the bound fluorophore undergoes a blue shift and an increase in intensity, due to its more hydrophobic environment. When excited at 295 nm, Trp fluoresces in the 320-350 nm range, and this fluorescence can be quenched when the residue is in close proximity to a ligand (Briand et al., 2002; Lartigue et al., 2002). When 2 µM each of CfumAY624538 and 1-NPN were equilibrated and excited at 337 nm, the emission spectrum of 1-NPN was blue-shifted, peaking at 403 nm, and accompanied by an 8-10 fold increase in intensity (Figure 4.2). When the assay was repeated using 1-AMA, there was no change in the fluorescence spectrum, indicating the absence of binding to CfumAY624538.

Figure 4.2. Emission spectra of 1-NPN bound to CfumAY624538. The solid purple line represents CfumAY624538 (2 µM) equilibrated with 1-NPN (2 µM); the dashed purple line represents the spectrum of CfumAY624538 alone; the solid red line is 1-NPN alone (2 µM). All solutions were dissolved in 50 mM Tris buffer pH 7.6 (solid blue line) and excited at 337 nm. Fluorescent emission of 1-NPN bound to CfumAY624538 peaks at 403 nm; intensity is measured in arbitrary units.

Fluorescence measurements can be affected by a phenomenon termed photobleaching, whereby continued excitation of the fluorophore by light can cause its electrons to enter excited states from which they do not return to the ground state by fluorescence. This effect can be reversible when the fluorophore is removed from the excitation beam, or irreversible in cases where the excited fluorophore reacts with other chemicals, such as oxygen.
Under my experimental conditions, I found that 1-NPN was very susceptible to photobleaching, but intrinsic tryptophan fluorescence was not. For example, repeated measurements of the 1-NPN fluorescence spectrum resulted in successive reductions in the intensity at 403 nm (Figure 4.3); this effect was not observed for intrinsic Trp fluorescence.

Figure 4.3. Photobleaching of 1-NPN fluorescence after repeated measurements. The fluorescence spectrum of a 2.0 µM solution of CfumAY624538 and 1-NPN was recorded after excitation at 337 nm. Seven repeated measurements were recorded, one immediately after the other: 1st spectrum in solid purple; 3rd spectrum in solid red; 5th spectrum in solid green; and 7th spectrum in solid blue. After the 7th measurement, the cuvette was placed in the dark for 10 min, and a final spectrum was recorded (dashed purple line). The spectra peak at 403 nm; intensity is measured in arbitrary units.

To confirm that the fluorescence peak at 403 nm corresponds to 1-NPN bound within the internal pocket of CfumAY624538, fluorescence of the single Trp residue was measured. When excited at 295 nm, CfumAY624538 produces a fluorescent peak at about 335 nm, and when it is equilibrated with an equimolar concentration of 1-NPN (2 µM), the fluorescence at 335 nm is quenched (Figure 4.4). I used Trp quenching to estimate the binding curve of 1-NPN, since Trp was not susceptible to photobleaching (Figure 4.5).

Figure 4.4. Quenching of intrinsic Trp fluorescence by 1-NPN binding. A 2 µM solution of CfumAY624538 was excited at 295 nm in the presence of 1-NPN, added sequentially to final concentrations of 1.5, 3, 9, 15, 21 and 27 µM using 1 and 10 mM stock solutions dissolved in methanol. The emission spectra peaked at 335 nm; intensity is measured in arbitrary units.
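As an illustration of how a dissociation constant can be extracted from a quenching titration of this kind, the following is a minimal sketch of a Scatchard-type linearization. It assumes a single binding site and that fractional quenching is proportional to site occupancy; the intensity at saturation (`F_SAT`), the unquenched intensity (`F0`) and the "true" Kd used to generate the data are hypothetical values, and the intensities are synthetic rather than measured. Only the 1-NPN concentrations and the 2 µM protein concentration are taken from the text. Because the data are simulated from the same model, the recovered Kd is correct by construction and is not a reanalysis of the thesis data.

```python
import math


def simulate_quenched_intensity(ligand_um, protein_um, kd_um, f0, f_sat):
    """Synthetic Trp intensity for a one-site model with ligand depletion."""
    s = protein_um + ligand_um + kd_um
    # Smaller root of B^2 - (P + L + Kd) * B + P * L = 0 gives bound ligand.
    bound = (s - math.sqrt(s * s - 4.0 * protein_um * ligand_um)) / 2.0
    return f0 - (f0 - f_sat) * bound / protein_um


def bound_from_quench(f_obs, protein_um, f0, f_sat):
    """Bound ligand (uM), assuming quenching is proportional to occupancy."""
    return protein_um * (f0 - f_obs) / (f0 - f_sat)


def scatchard_fit(ligand_um, bound_um):
    """Least-squares line through (bound, bound/free); returns (Kd, Bmax)."""
    xs = list(bound_um)
    ys = [b / (total - b) for total, b in zip(ligand_um, bound_um)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # slope of a Scatchard plot is -1/Kd
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = intercept * kd              # x-intercept, the binding capacity
    return kd, bmax


# 1-NPN concentrations from the titration described in the text (uM).
doses = [1.5, 3.0, 9.0, 15.0, 21.0, 27.0]
PROTEIN_UM = 2.0
# Hypothetical values, used only to generate synthetic intensities.
TRUE_KD_UM, F0, F_SAT = 8.0, 400.0, 50.0

intensities = [
    simulate_quenched_intensity(d, PROTEIN_UM, TRUE_KD_UM, F0, F_SAT)
    for d in doses
]
bound = [bound_from_quench(f, PROTEIN_UM, F0, F_SAT) for f in intensities]
kd_est, bmax_est = scatchard_fit(doses, bound)
print(f"Kd = {kd_est:.2f} uM, Bmax = {bmax_est:.2f} uM")
```

With real data, the saturating intensity is rarely known exactly, so a nonlinear fit of the one-site model to the raw intensities would usually be a more robust choice than the linearization sketched here.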
Figure 4.5. A) Binding of 1-NPN to a 2 µM solution of CfumAY624538 as measured by quenching of intrinsic Trp fluorescence at 335 nm, after excitation at 295 nm. 1-NPN was added sequentially to final concentrations of 1.5, 3, 9, 15, 21 and 27 µM using 1 and 10 mM stock solutions of 1-NPN dissolved in methanol. B) Scatchard plot of the data plotted in panel A; Kd equals 8 µM.

4.2.3 Ligand competition assays

Due to the susceptibility of 1-NPN to photobleaching, limited quantities of protein, and no prior knowledge indicating what the endogenous ligand might be, I opted for a screening approach, in which various ligands were tested at a 1:1 ratio against the fluorescent reporter 1-NPN. A solution of 1-NPN and CfumAY624538 (2 µM each) was equilibrated and dispensed into 1.5 ml aliquots, and baseline, pretreatment fluorescence spectra were recorded. Various competitors were then added to each aliquot at an equimolar concentration (2 µM) and incubated in the dark for 10 min, after which the fluorescence spectra were recorded. The post-treatment decrease in fluorescence caused by each competitor was calculated and compared to that of control samples that received solvent only. In this way, the control samples accounted for decreases in fluorescence due to photobleaching and/or solvent effects.

Of a total of 11 chemicals tested (Figure 4.7), 2-decen-1-ol was the only ligand that significantly displaced 1-NPN (Figure 4.6). Medium to long chain-length fatty acids (C13, C15, C17, C19 and C22) and fatty acid methyl esters (C18 and C20) did not displace 1-NPN, nor did bulkier ligands such as 1-menthol and (+)-disparlure. Ligands of physiological relevance, such as the insect juvenile hormone 1 (JH1) and the ecdysteroid agonists halofenozide and tebufenozide, also failed to displace 1-NPN in competition at a 1:1 ratio.
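The comparison of competitor-treated aliquots against solvent-only controls can be sketched as follows. This is a minimal illustration, not the analysis code used for the thesis: the function names are hypothetical, the intensities in the example are invented, and the correction (dividing by the baseline intensity after subtracting the protein-alone signal) follows the description given in the Materials and methods of this chapter. The 22% threshold is the upper 95% confidence limit of the control samples reported in the caption of Figure 4.6.

```python
def percent_decrease(baseline, treated, protein_blank=0.0):
    """Percent drop in reporter fluorescence after adding a competitor.

    baseline      -- intensity at the emission peak before the competitor
    treated       -- intensity of the same aliquot after the competitor
    protein_blank -- intensity of protein alone in buffer (assumed 0 if unknown)
    """
    corrected_baseline = baseline - protein_blank
    return 100.0 * (baseline - treated) / corrected_baseline


def exceeds_control(decrease_pct, control_upper_pct):
    """True if the decrease is larger than the solvent-only control range."""
    return decrease_pct > control_upper_pct


# Upper 95% confidence limit of the methanol-only controls (Figure 4.6).
CONTROL_UPPER_PCT = 22.0

# Invented intensities (arbitrary units) for two hypothetical aliquots.
examples = {
    "hypothetical non-binder": (800.0, 700.0),
    "hypothetical competitor": (800.0, 440.0),
}
for name, (before, after) in examples.items():
    drop = percent_decrease(before, after)
    print(name, round(drop, 1), exceeds_control(drop, CONTROL_UPPER_PCT))
```

Expressing each competitor relative to the control range, rather than relative to zero, is what allows photobleaching and solvent quenching to be separated from displacement of the reporter.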
Figure 4.6. Binding of 11 different ligands in competition with 1-NPN. Each ligand was added at a molar concentration equal to that of 1-NPN (2 µM), and the percent decrease in fluorescence intensity was recorded at 403 nm. The control samples (methanol solvent only) averaged a 12.5% decrease in fluorescence (95% confidence limits, 5% to 22%, illustrated on the graph by horizontal lines).

Figure 4.7. Structure of chemicals used in binding assays (Figure 4.6). Compounds shown: 1-NPN (N-phenyl-1-naphthylamine); 1-AMA (1-aminoanthracene); E-2-decen-1-ol; tridecanoic acid; pentadecanoic acid; heptadecanoic acid; nonadecanoic acid; docosanoic acid; (7R,8S)-(+)-disparlure; juvenile hormone 1 (ethyl cis-10,11-epoxy-7-ethyl-3,11-dimethyl-trans,trans-2,6-tridecadienoate); 1-menthol.

4.2.4 RNAi is effective in situ, but does not reduce transcript levels in vivo

RNAi is a relatively new technique of post-transcriptional gene silencing that uses dsRNA, first demonstrated with the nematode C. elegans (Fire et al., 1998). I validated the ability of this technique to interfere with gene expression (and subsequent protein expression) in lepidopteran insects using cell cultures. The CAT (chloramphenicol acetyltransferase) reporter plasmid was transfected into Spodoptera frugiperda (Sf-9) cell cultures, and five different treatments were compared: 1) CAT reporter alone, 2) CAT reporter plus 15 µg of cotransfected ds CAT RNA, 3) CAT reporter, with the culture media amended with 15 µg of ds CAT RNA (termed RNAi soaking), 4) dsRNA transfected alone (no CAT reporter), and 5) mock cells with no treatment.
Cells transfected with the CAT reporter exhibited reaction rates of 80-100 cpm, which were reduced effectively to zero when dsRNA homologous to the CAT ORF was cotransfected (Experiment 1, Figure 4.8). Soaking the Sf-9 cells with ds CAT RNA after transfection with the CAT reporter also reduced the reaction rate, but only slightly. In a second experiment, treatments 1, 2 and 5 were repeated, and the cotransfected ds CAT RNA was effective at a dose of 5 µg per well (Experiment 2, Figure 4.8). The negative controls, ds CAT RNA alone and mock cells alone, did not exhibit any CAT activity.

Figure 4.8. A) CAT activity in extracts of Sf-9 cells after transfection with a CAT reporter vector, either alone, or in combination with 15 µg of ds CAT RNA cotransfected or amended into the media (soaking). B) The experiment was repeated using 5 µg of ds CAT RNA cotransfected with the CAT reporter.

Early 6th instar larvae were injected with three different treatments: 1) buffer control, 2) 5 µg of dsRNA complementary to the ORF of CfumAY426538, and 3) 2 µg of siRNA (short interfering RNA), a digestion product of the longer dsRNA. Short segments of dsRNA (21-23 nt) with two-nucleotide (nt) 3' overhangs (siRNA) are the natural digestion product of the RNAi pathway enzyme Dicer, and are the active principle in post-transcriptional targeting of messenger RNA (mRNA) for degradation (Elbashir et al., 2001). In some cases siRNA has proven to be more potent in PTGS experiments (Agrawal et al., 2003). However, larvae injected with either dsRNA homologous to the nucleotide sequence of CfumAY624538, or siRNA produced by its digestion, did not exhibit any obvious phenotypic changes during development to the pupal and adult stages. Northern blot analysis of a subset of larvae did not detect any reduction in CfumAY624538 transcript levels (data not shown).
4.3 Discussion

Levels of CfumAY624538 protein clearly increase during the 6th instar stage, peaking immediately prior to pupation. This pattern of protein expression corresponds well to the pattern of gene expression characterized in Chapter 3 of this thesis, and collectively, the two indicate a function in preparation for pupation. This expression profile has not been reported for CSPs as yet, and no previous studies have analyzed CSP expression in the larval stages. An attempt to "knock down" in vivo protein levels using RNAi failed. While RNAi works well in cell culture, injections into whole animals often fail due to problems associated with translocation to the target tissues expressing the gene of interest.

During the last larval instar, many physiological changes occur to set the stage for the development into pupae and the subsequent metamorphosis into the adult stage. For example, a decrease in the levels of juvenile hormone combined with a subsequent peak in ecdysteroid levels during the last instar serves to reprogram development from the juvenile larval stage to pupation and metamorphosis into the adult stage (reviewed in Riddiford et al., 2003). Many different physiological changes accompany this reprogramming, some of which may be related to the expression pattern or binding characteristics of some CSPs reported in the literature. Some CSP transcripts have been detected in developing imaginal disks (B. mori, Silkbase, WWW.ab.a.u-tokyo.ac.jp/silkbase/ and D. melanogaster, WWW.ncbi.nlm.nih.gov/geo/gds/gds_browse.cgi?gds=l92). Imaginal disks, which eventually transform into organs such as the adult wings and antennae, begin developing during the last larval instar, after juvenile hormone levels decrease (Riddiford et al., 2003). The potential expression of CfumAY456238 within developing imaginal disks could be one explanation for its high levels of expression during late stages of the 6th instar.
Alternatively, in vitro assays have demonstrated that CSPs from A. mellifera, M. brassicae and P. dominulus bind medium chain-length fatty acids without discrimination (Briand et al., 2002; Campanacci et al., 2003; Calvello et al., 2003). Fat metabolism and storage accelerate during the last larval instar stage; an involvement of CfumAY456238 in fat metabolism could also explain increasing expression during the 6th instar. Based on these observations, I used the fluorescent reporter 1-NPN in competitive assays with purified CfumAY426538 protein to test the binding of various physiological ligands, including medium chain-length fatty acids, the insect hormone JH1, and the ecdysteroid agonists halofenozide and tebufenozide.

CfumAY426538 bound 1-NPN, but not 1-AMA, although the chemical structures are similar (Figure 4.7). Both are composed of three benzene rings; 1-AMA is composed of anthracene (three benzene rings fused together) with an amino functional group at the number 1 position, while 1-NPN is composed of naphthalene (two benzene rings fused together) with an amino functional group at the number 1 position that connects to a phenyl group (Figure 4.7). Even these small structural differences likely result in incompatibilities with the structure of the binding pocket. For example, 1-AMA bound to MbraPBP1 and ApolPBP1, but not to BmorPBP1 (Campanacci et al., 2001). When the binding pocket of MbraPBP1 was mutated to more closely resemble that of BmorPBP1, it no longer bound 1-AMA (Campanacci et al., 2001). Diverse CSPs, however, appear to preferentially bind 1-NPN as compared to 1-AMA. 1-AMA did not bind to two different orthopteran CSPs (Ban et al., 2002; 2003) or to two different lepidopteran CSPs (MbraCSPA6 [Campanacci et al., 2003] and CfumAY426538 herein). At least four different CSPs bind 1-NPN (Table 1.2 and CfumAY426538 herein).
The fluorescence of an intrinsic Trp residue was used to confirm binding of 1-NPN in the internal pocket of CfumAY426538. Almost all CSPs identified to date retain two highly conserved aromatic amino acid positions (Figure 2.2, positions 27 and 85). The Trp residue at position 27 of CfumAY426538 is the only Trp residue in the mature protein (Figure 2.2); the three Trp residues at the N-terminus of CfumAY426538 (Figure 2.1) are part of the native signal peptide, which was not included in the expression vector. Based upon the 3-D crystal structure of MbraCSPA6 (Lartigue et al., 2002; Campanacci et al., 2003), the aromatic side chains of the residues at positions 27 and 85 (Figure 2.2) are predicted to face the internal binding pocket. When excited at a wavelength of 295 nm, Trp fluoresces between 320 and 350 nm, and this fluorescence can be quenched by interactions with a ligand (Briand et al., 2002; Lartigue et al., 2002). CfumAY426538 fluorescence at 335 nm was quenched by 1-NPN, and increasing concentrations of 1-NPN saturated the quenching effect, allowing the binding strength (Kd) to be estimated at approximately 8 µM.

Fatty acids with 13, 15, 17, 19 and 22 carbon chain lengths failed to displace 1-NPN from CfumAY426538, as did the methyl esters of stearic acid (C18) and arachidic acid (C20). These results contrast with those obtained using MbraCSPA6, AmelASP3C and PdomCSP, which bound various alkyl derivatives with 14-18 carbon chain lengths, but did not bind bulkier ligands (Table 1.2; Briand et al., 2002; Calvello et al., 2003; Campanacci et al., 2003). In contrast, two orthopteran CSPs, CSPLm-II-10 and CSP-Sg4, bound larger, bulkier ligands that resembled the fluorescent reporter 1-NPN (Table 1.2; Ban et al., 2002; 2003), and failed to bind 14-18 carbon fatty acids.
CfumAY426538 also failed to bind bulkier ligands such as 1-menthol and (+)-disparlure, and large physiologically relevant ligands such as a racemic mixture of JH1 and two ecdysteroid agonists, halofenozide and tebufenozide. Of 11 ligands tested, a 10-carbon unsaturated alcohol, 2-decen-1-ol, was the only ligand that significantly displaced 1-NPN at a 1:1 competitive ratio.

Two PBPs and one CSP have been crystallized with short to medium chain-length alcohols in their binding pockets. DmelLush was crystallized with ethanol, propanol and butanol in its binding pocket (Kruse et al., 2003), LmadPBP with hydroxy-butan-2-one (Lartigue et al., 2003) and MbraCSPA6 with three molecules of 12-bromododecanol (Lartigue et al., 2002; Campanacci et al., 2003). One might expect fatty acids and alcohols of similar chain length to behave similarly in binding assays, but there does not appear to be a clear trend. For example, MbraCSPA6, PdomCSP and AmelASP3C generally bind fatty acids, but only MbraCSPA6 and PdomCSP also bound small alcohol derivatives (such as dodecanol); AmelASP3C failed to bind small alcohol derivatives such as 2-nonanol and 3,7-dimethyl-2,6-octadiene-1-ol (Table 1.2; Briand et al., 2002; Lartigue et al., 2002; Calvello et al., 2003; Campanacci et al., 2003). The orthopteran CSPs that preferentially bound larger ligands did not bind short-chain alcohol derivatives such as 1-nonanol, decanol and 11-bromo-undecanol (Ban et al., 2002; 2003). In addition, it is not clear why PdomCSP binds dodecanol and methyl dodecanoate, but does not bind dodecanoic acid (Calvello et al., 2003).

It is not clear whether the in vitro data supporting the binding of short to medium chain-length alcohols and their derivatives can be correlated to their in vivo function.
In the case of LmadPBP, hydroxy-butan-2-one is a cockroach pheromone (Lartigue et al., 2003), and in the case of DmelLush, short chain alcohols are host volatiles that are attractive (Kim et al., 1998; Kim and Smith, 2001). However, ethanol did not bind to DmelLush in in vitro binding assays (Zhou et al., 2004). Several investigators have described a solvent effect when using fluorescent reporters in binding assays, where the ethanol or methanol used to dissolve the ligands being tested quenched fluorescence. For example, Briand et al. (2002) found that an ethanol concentration of 0.2% could quench fluorescence by as much as 10%. In light of these results, it might not be surprising that short chain-length alcohols can reduce the intensity of reporter fluorescence and crystallize in the binding pocket of CSPs and OBPs. However, binding assays typically use ligand concentrations in the µM range (I tested 2-decen-1-ol at a concentration of 2 µM), whereas 0.2% ethanol corresponds to a concentration in the mM range, higher by several orders of magnitude. Where possible, several different methods should be employed to measure ligand binding strength, as exemplified in Honson et al. (2003).

4.4 Materials and methods

4.4.1 CfumAY624538 expression and purification

CfumAY426538 was expressed in Escherichia coli BL21 (DE3) cells using the pET22B vector (Novagen). pET22B has the option of including a bacterial signal peptide (pelB leader) at the N-terminus, which targets the expressed protein for secretion into the bacterial periplasm. A 447 bp fragment containing the ORF was amplified from a cDNA clone using forward and reverse primers (5' ATGGCCATGGGGACGTATACGGCTGAGAAT 3' and 5' CAGGATCCGCCCATTTTTATTTATTTTTGTGAT 3', respectively). PCR conditions: 1X Pfu buffer (Stratagene), 2.5 mM MgCl2, 0.1 mM dNTPs, 0.2 µM primers, 1 U Pfu polymerase in 50 µl total volume.
PCR reactions were amplified using a GeneAmp 2400 thermocycler (Perkin-Elmer) set at one cycle of 94°C for 60 s, 25 cycles of (94°C for 30 s; 50-68°C gradient for 30 s; and 72°C for 45 s), and one final cycle of 72°C for 7 min. The forward primer contained the NcoI restriction site and the reverse primer contained the BamHI restriction site. The PCR product was digested with NcoI and BamHI, gel purified using the MinElute PCR Purification Kit (Qiagen) and cloned into the pET22B vector. The resulting vector produces a transcript that encodes the pelB leader fused in frame to the mature CfumAY426538 protein (minus the native signal peptide); the mature protein begins with the N-terminal amino acid sequence GTYTAECDDL (signal peptide prediction software was used to determine the boundary where the mature protein begins). Because the CfumAY426538 stop codon is used, vector sequence that encodes C-terminal fusions is not included. The pelB leader is cleaved upon secretion of the protein into the periplasm, leaving a single methionine residue as the only modification to the mature protein (1 + 102 residues in total).

The plasmid vector DNA from three positive clones was purified and each was screened for protein expression after transformation into BL21 (DE3) cells. Each clone was cultured at 37°C with shaking, in 5 ml aliquots of Luria-Bertani (LB) broth (Invitrogen) amended with 50 mg/L ampicillin (LB-Amp50), until the cultures reached an optical density at 595 nm (OD595) of 0.6. A 1 ml aliquot of each culture was mixed with glycerol and stored at -80°C; the remaining culture was then split into two equal fractions. One fraction was amended with 0.4 mM isopropyl β-D-thiogalactoside (IPTG) to induce protein expression. Both the control and induced fractions were incubated at 30°C for two hours, and then centrifuged at 4 000 × g for 5 min at 4°C.
After removing the supernatant, the cell pellets were suspended in 100 µl of 20 mM Tris buffer pH 7.5, sonicated (three intervals of 3 s each), and centrifuged at 20 000 × g for 10 min. The soluble fraction (supernatant) was mixed with an equal volume of 2X SDS loading buffer (4% SDS, 100 mM dithiothreitol [DTT]), boiled for 5 min, and the DNA sheared with a syringe. Samples of each clone, induced and control fractions, were resolved by 12% SDS-PAGE and the proteins visualized by Coomassie staining. Positive clones were sequenced in both directions using gene-specific primers and the Taq Dye Deoxy Terminator Cycle Sequencing Kit combined with an automatic DNA sequencer (model 310, Applied Biosystems).

A clone that expressed CfumAY426538 in the soluble fraction was cultured in 250 to 1000 ml volumes of LB-Amp50 broth to an OD595 of 0.6, and induced with 0.4 mM IPTG overnight at 27°C. The cells were harvested by centrifuging at 5 000 × g for 15 min; soluble protein in the periplasm was collected using a freeze/thaw protocol as described by Johnson and Hecht (1994). Briefly, the cellular pellet from 1 l of culture was suspended in 10 ml of Tris buffer pH 8.0 amended with protease inhibitor cocktail (Sigma), subjected to five freeze/thaw cycles in liquid nitrogen, and mixed for 1 h at 4°C; the soluble supernatant fraction was collected after centrifuging at 10 000 × g for 20 min at 4°C. The soluble protein was resolved by preparative 10% native PAGE and thirty fractions were collected by electroelution using a Bio-Rad whole gel eluter. Fractions containing the expressed protein were analyzed by 13.5% SDS-PAGE, and the molecular mass of purified CfumAY426538 was verified by MALDI-TOF (matrix assisted laser desorption ionization - time of flight) mass spectrometry.

4.4.2 Polyclonal antibody preparation

Polyclonal antiserum was produced by the Animal Care Centre, University of British Columbia, 6199 South Campus Road, Vancouver, B.C., Canada.
CfumAY624538 was shipped on ice by overnight courier to the Animal Care Centre, and a 0.5 ml sample containing approximately 0.250 mg was mixed with an equal volume of Freund's complete adjuvant (Sigma) and injected into a female New Zealand White rabbit. Two additional boosters (0.250 mg protein mixed with an equal volume of Freund's incomplete adjuvant [Sigma]) were injected at two to three week intervals. Ten days after the last booster shot, the animal was exsanguinated while under deep anesthesia. The blood was received after shipping on ice by overnight courier, incubated at 37°C, and allowed to coagulate overnight at 4°C. The blood serum was collected after centrifuging at 7 000 × g for 20 min, and stored in aliquots at -20°C.

4.4.2 Western blot analysis

Insect rearing, staging and collections are described in Chapter 3. Multiple samples of each developmental stage were stored at -80°C; thus, the samples used to assess protein expression in this chapter are parallel samples to those used to characterize gene expression in Chapter 3. Insects were homogenized in 0.5-1.5 ml aliquots of Tris buffer pH 8.0 amended with protease inhibitor cocktail (Sigma), briefly sonicated, and centrifuged at 12 000 × g for 20 min at 4°C. The soluble fraction (supernatant) was retained and the total protein concentration quantified using the Bio-Rad protein assay, based upon the Bradford dye-binding procedure (Bradford, 1976). Sample volumes containing 15 µg of total protein were mixed with an equal volume of 2X SDS loading buffer and heated in a boiling water bath for 5 min. Protein samples were separated by denaturing SDS-PAGE and transferred onto Immobilon-P polyvinylidene difluoride membranes (Millipore) using a mini-blotter (Bio-Rad) following the manufacturer's protocol.
The membranes were incubated in 5% blotto (5% non-fat dry milk powder dissolved in 1X phosphate buffered saline [PBS] and 0.02% Tween-20) for one hour, followed by a one hour incubation in 10 ml of 5% blotto amended with a 1:1000 dilution of CfumAY624538 rabbit polyclonal antiserum. The membranes were washed in 1X PBS, and incubated for one hour in 10 ml of 5% blotto amended with a 1:10 000 dilution of goat anti-mouse secondary antibody conjugated to horseradish peroxidase (Jackson Laboratories) to detect bound CfumAY624538 antibody. Secondary antibody was detected using ECL substrate chemiluminescence (Amersham Biosciences) and high-performance chemiluminescence film (Amersham Biosciences).

4.4.3 Fluorescence-based binding assays

Fluorescence emission spectra were measured using a Shimadzu RF 1501 spectrophotometer at room temperature. Clear, four-sided methacrylate cuvettes with a 10 mm path length were used, along with a 10 nm slit width for both excitation and emission. 1-NPN was excited at 337 nm, and 1-AMA at 298 nm. All solutions were dissolved in 50 mM Tris buffer pH 7.6. The intrinsic Trp fluorescence of CfumAY624538 (2 µM solution) was measured using an excitation wavelength of 295 nm.

A solution of 2 µM CfumAY624538 was equilibrated with 2 µM 1-NPN for 10 min, and was used as a stock solution for competitive binding assays. First, the baseline fluorescence spectrum of each 1.5 ml aliquot of the CfumAY624538/1-NPN stock was recorded, after which a competing ligand (dissolved in methanol) was added at an equal concentration (2 µM), and the fluorescence spectrum was recorded after 10 min incubation in the dark. In this way, no more than two spectral measurements were taken from any individual sample. The methanol concentration did not exceed 1% of the volume, and three samples amended with an equivalent volume of solvent were used as control samples.
To analyze the data, the intensity of fluorescence at the emission peak was corrected by subtracting the intensity of the protein alone in Tris buffer. The percentage difference in intensity due to amendment with competitor was calculated as: [(baseline, pre-amendment intensity) - (competitor-amended sample intensity)] / (corrected baseline intensity). Thus, the percent change in intensity after adding the competitor ligand was measured, and compared to parallel control samples that received only methanol solvent.

JH1, halofenozide and tebufenozide were kindly supplied by Bill Tomkins, Great Lakes Forestry Centre, Sault Ste. Marie, ON, Canada. Originally, JH1 was synthesized by Ayerst Research Laboratories (lot # AY-22 342), and it is a mixture of the eight possible geometric isomers, each of which possesses biological activity. Halofenozide and tebufenozide, technical grade material (RH-0345 and RH-5992, respectively), were originally supplied by Rohm and Haas Research Laboratories. (+)-Disparlure and all other chemicals used as competitors were kindly provided by Dr. E. Plettner, Chemistry Department, Simon Fraser University, Vancouver, BC, Canada.

4.4.4 RNA interference

Since the technique is relatively new, the functioning of RNAi in Lepidoptera was first validated using the CAT (chloramphenicol acetyltransferase) reporter transfected into Sf-9 cell cultures. Briefly, CAT catalyzes the acetylation of chloramphenicol; the use of radiolabeled substrate (acetyl-3H-coenzyme A) allows the reaction (which is directly proportional to the amount of CAT enzyme present) to be quantified (Gorman et al., 1982). The CAT assay protocol used herein is based on the method described by Neumann et al. (1987). The Op IE1-CAT reporter, and dsRNA complementary to the CAT ORF, were transfected into Sf-9 cells using lipofectin reagent (Gibco).
The following treatments were applied to 10^6 Sf-9 cells per well in a 6-well plate: 1) Op IE1-CAT alone, 2 µg, 2) Op IE1-CAT plus 15 µg ds CAT RNA cotransfected, 3) Op IE1-CAT plus 15 µg ds CAT RNA added to the cell culture 5 h post infection, termed RNAi soaking, 4) 15 µg of CAT dsRNA alone, and 5) mock, no treatment control. Each treatment was replicated in two wells. The transfected cells were harvested 48 h after infection and centrifuged at 10 600 × g, and the pellet was resuspended in 100 µl of 0.25 mM Tris-HCl, pH 7.8. After lysing by three freeze-thaw cycles (freeze at -80°C and thaw at 37°C), the lysate was incubated at 65°C for 15 min to inactivate cellular deacetylases. After pelleting the cell debris, the supernatant was added to a cocktail containing 9 mM chloramphenicol, 60 mM Tris pH 7.8, 188 µM acetyl-coenzyme A (Sigma), 0.025 µCi [3H]acetyl-coenzyme A (New England Nuclear, CAT Assay Grade), and distilled H2O up to a total volume of 85 µl. Each sample was overlaid with 3 ml of toluene-based scintillation fluor (Econofluor-2; Packard Bioscience Co.) and the enzymatic reaction was measured with a scintillation counter (Beckman; LS 6500).

dsRNA was synthesized using the Litmus vector (New England Biolabs) with a 300 bp section of the Op IE1-CAT ORF blunt-end cloned into the EcoRV site of the multiple cloning site (MCS), which has T7 RNA polymerase promoters located on each side. T7 RNA polymerase was used to simultaneously synthesize both sense and antisense RNA strands using the HiScribe transcription kit (New England Biolabs), following the manufacturer's recommended protocol.

4.5 References

Agrawal N., Dasaradhi P. V., Mohmmed A., Malhotra P., Bhatnagar R. K., and Mukherjee S. K. (2003). RNA interference: biology, mechanism, and applications. Microbiol Mol Biol Rev 67: 657-85.

Ban L., Scaloni A., Brandazza A., Angeli S., Zhang L., Yan Y., and Pelosi P. (2003). Chemosensory proteins of Locusta migratoria. Insect Mol Biol 12: 125-34.

Ban L., Zhang L.
, Yan Y., and Pelosi P. (2002). Binding properties of a locust's chemosensory protein. Biochem Biophys Res Commun 293: 50-4.
Bradford M. M. (1976). A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem 72: 248-54.
Briand L., Swasdipan N., Nespoulous C., Bezirard V., Blon F., Huet J. C., Ebert P., and Pernollet J. C. (2002). Characterization of a chemosensory protein (ASP3c) from honeybee (Apis mellifera L.) as a brood pheromone carrier. Eur J Biochem 269: 4586-96.
Calvello M., Guerra N., Brandazza A., D'Ambrosio C., Scaloni A., Dani F. R., Turillazzi S., and Pelosi P. (2003). Soluble proteins of chemical communication in the social wasp Polistes dominulus. Cell Mol Life Sci 60: 1933-43.
Campanacci V., Krieger J., Bette S., Sturgis J. N., Lartigue A., Cambillau C., Breer H., and Tegoni M. (2001). Revisiting the specificity of Mamestra brassicae and Antheraea polyphemus pheromone-binding proteins with a fluorescence binding assay. J Biol Chem 276: 20078-84.
Campanacci V., Lartigue A., Hallberg B. M., Jones T. A., Giudici-Orticoni M. T., Tegoni M., and Cambillau C. (2003). Moth chemosensory protein exhibits drastic conformational changes and cooperativity on ligand binding. Proc Natl Acad Sci USA 100: 5069-74.
Elbashir S. M., Martinez J., Patkaniowska A., Lendeckel W., and Tuschl T. (2001). Functional anatomy of siRNAs for mediating efficient RNAi in Drosophila melanogaster embryo lysate. EMBO J 20: 6877-88.
Fire A., Xu S., Montgomery M. K., Kostas S. A., Driver S. E., and Mello C. C. (1998). Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 391: 806-11.
Gorman C. M., Moffat L. F., and Howard B. H. (1982). Recombinant genomes which express chloramphenicol acetyltransferase in mammalian cells. Mol Cell Biol 2: 1044-51.
Honson N., Johnson M. A., Oliver J. E., Prestwich G. D., and Plettner E. (2003).
Structure-activity studies with pheromone-binding proteins of the gypsy moth, Lymantria dispar. Chem Senses 28: 479-89.
Johnson B. H., and Hecht M. H. (1994). Recombinant proteins can be isolated from E. coli cells by repeated cycles of freezing and thawing. Biotechnology (N Y) 12: 1357-60.
Kim M. S., Repp A., and Smith D. P. (1998). LUSH odorant-binding protein mediates chemosensory responses to alcohols in Drosophila melanogaster. Genetics 150: 711-21.
Kim M. S., and Smith D. P. (2001). The invertebrate odorant-binding protein LUSH is required for normal olfactory behavior in Drosophila. Chem Senses 26: 195-9.
Kruse S. W., Zhao R., Smith D. P., and Jones D. N. (2003). Structure of a specific alcohol-binding site defined by the odorant binding protein LUSH from Drosophila melanogaster. Nat Struct Biol 10: 694-700.
Lartigue A., Campanacci V., Roussel A., Larsson A. M., Jones T. A., Tegoni M., and Cambillau C. (2002). X-ray structure and ligand binding study of a moth chemosensory protein. J Biol Chem 277: 32094-8.
Lartigue A., Gruez A., Spinelli S., Riviere S., Brossut R., Tegoni M., and Cambillau C. (2003). The crystal structure of a cockroach pheromone-binding protein suggests a new ligand binding and release mechanism. J Biol Chem 278: 30213-8.
Mosbah A., Campanacci V., Lartigue A., Tegoni M., Cambillau C., and Darbon H. (2003). Solution structure of a chemosensory protein from the moth Mamestra brassicae. Biochem J 369: 39-44.
Neumann J. R., Morency C. A., and Russian K. O. (1987). A novel rapid assay for chloramphenicol acetyltransferase gene expression. Biotechniques 5: 444-447.
Riddiford L. M., Hiruma K., Zhou X., and Nelson C. A. (2003). Insights into the molecular basis of the hormonal control of molting and metamorphosis from Manduca sexta and Drosophila melanogaster. Insect Biochem Mol Biol 33: 1327-38.
Zhou J. J., Zhang G. A., Huang W., Birkett M. A., Field L. M., Pickett J. A., and Pelosi P. (2004).
Revisiting the odorant-binding protein LUSH of Drosophila melanogaster: evidence for odour recognition and discrimination. FEBS Lett 558: 23-6.

CHAPTER FIVE
Comparative characterization of a non-sensory odorant binding protein from the Eastern spruce budworm, Choristoneura fumiferana

5.1 Introduction

Odorant binding proteins, members of a large multigene family, have been identified from the insect orders Coleoptera, Diptera, Hemiptera, Hymenoptera and Lepidoptera (reviewed by Vogt et al., 2002), and more recently, from the Dictyoptera (Riviere et al., 2003), Isoptera (Ishida et al., 2002) and Orthoptera (Ban et al., 2003). Many studies conducted over the last two decades support the general hypothesis that OBPs transport hydrophobic stimuli in the hydrophilic lymph that surrounds sensory neurons housed within sensilla (reviewed recently by Vogt et al., 2002). Supporting evidence includes the solved 3-D structures of several OBPs, which reveal a small globular protein composed of six amphipathic α-helices that surround an internal hydrophobic binding pocket (as referenced in Table 1.1). Consistent with their structure, in vitro assays have demonstrated the ability of OBPs to bind to, and discriminate between, hydrophobic chemicals such as pheromones and general odorants (Du and Prestwich, 1995; Maibeche-Coisne et al., 1997; Plettner et al., 2000; Campanacci et al., 2001). Genetic studies have supported the importance of OBPs to olfactory function. A genetic locus responsible for a polymorphism in fire ants that determines the number of queens in a colony was found to encode an OBP; worker ants regulate queen numbers in response to chemical cues produced by the queens (Krieger and Ross, 2002). A fruit fly mutation that resulted in an inability to avoid high concentrations of ethanol, as compared to wild types, was traced to an OBP gene that was subsequently termed lush (Kim et al., 1998; Kim and Smith, 2001).
The lush protein has been crystallized with ethanol in its binding pocket (Kruse et al., 2003). While supporting an olfactory function, this result indicates that soluble ligands may also be transported by OBPs. The first OBP was identified from the antennae of male moths, and was assigned the function of pheromone binding protein based upon its specific expression in the lymph of sensilla that detect female-produced pheromone, and the in vitro ability of the protein to bind to the pheromone components (Vogt and Riddiford, 1981). The specific expression of OBP transcripts in the support cells of sensilla, and secretion of the protein into the lymph surrounding the sensory neurons, is a characteristic common to insects from diverse orders, and supports the idea that most members of the gene family function as transporters of hydrophobic odorants (Vogt and Riddiford, 1981; Vogt et al., 1991; Steinbrecht et al., 1992; Galindo and Smith, 2001; Shanbhag et al., 2001; Vogt et al., 2002). Two OBP genes from D. melanogaster were detected in only four sensilla found on each of the six tarsi of adult transgenic flies (Galindo and Smith, 2001). The specific expression of PBP genes in patterns that are consistent with their function can be illustrated by a recently discovered cockroach PBP; its transcripts are expressed specifically in the adult female antennae, which detect the sex pheromone produced by males (Riviere et al., 2003).

The fact that some OBPs are expressed in non-sensory tissues has received less attention, and most of the data have resulted incidentally from indirect sources. Studies designed to detect genes and proteins expressed specifically in the haemolymph (Graham et al., 2001; 2003; Vierstraete et al., 2003), accessory sex glands (Paesen and Happ, 1995), salivary glands (Arcà et al., 2002; Calvo et al., 2002; 2004) and the brain (Fujii and Amrein, 2002) have identified OBPs from coleopteran, dipteran and lepidopteran species.
OBPs are highly divergent; the six Cys residues that form three disulphide bridges are the only residues conserved in all "classical" OBPs (Vogt et al., 2002). Interestingly, most of the non-sensory OBPs identified to date lack two of the conserved Cys residues, and the proteins form only two disulphide bonds (4-Cys group). As many as ten different isoforms of the 4-Cys OBPs were isolated from T. molitor haemolymph, and they comprised as much as 5% of the total protein (Graham et al., 2003).

In this last chapter, a cDNA clone that encodes a 4-Cys OBP (named CfumSericotropin-like) was isolated from the larval stage of the Eastern spruce budworm, C. fumiferana. I have characterized its gene and protein expression pattern during development, and compared it to that of the CSPs characterized in Chapters 3 and 4. Since both protein families have a similar structure and function as it relates to the transport of hydrophobic ligands, a comparative characterization of their non-sensory expression pattern may provide useful insights into their relative non-sensory functions.

5.2 Results

5.2.1 A 4-Cys OBP cloned from Eastern spruce budworm larvae

A cDNA clone encoding a protein sequence with homology to the OBP family was identified from a larval ESB EST library by a tblastn search of GenBank. The complete ORF encodes a protein with 133 amino acid residues (Figure 5.1) that is 78-83% identical to OBPs from three other lepidopteran species representing the families Pyralidae (Galleria mellonella, sericotropin, GenBank# AAA85090), Plutellidae (Plutella xylostella, sericotropin-like, GenBank# BAD26681) and Sphingidae (M. sexta, ABP8, GenBank# AAL60426). Consistent with the nomenclature established in the literature, the clone was named CfumSericotropin-like. The first 14-16 residues are predicted to form a hydrophobic signal peptide (using SIGFIND version 2.10, http://139.91.72.10/sigfind/sigfind.html), a characteristic feature of the OBP family.
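The conceptual translation reported above can be checked with a short script. The following is only an illustrative sketch, not the procedure used in this thesis: the nucleotide string is transcribed from the coding rows of Figure 5.1, the standard genetic code is assumed, and the helper names (translate_orf, CYS_IN_MATURE) are hypothetical. It translates the ORF from the first ATG to the stop codon, then counts the Cys residues that remain once a putative signal peptide of 14, 15 or 16 residues is removed.

```python
# Illustrative sketch: conceptual translation of the CfumSericotropin-like ORF.
# The sequence is transcribed from Figure 5.1; all names here are hypothetical.

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, built in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Coding rows of the cDNA (5' leader "AAC" through the row containing the stop).
CDNA = (
    "AACATGAAGACTTTCATCGTATTAGCTATCTGCCTTGTCGCTGCTCAGGCCCTGACAGACGAG"
    "CAGAAAGAAAAACTGAAGAAACACCGCTCAGAATGCCTCACGGAAACCAAGCCTGACCAGCAA"
    "TTGGTAGAAAAACTTAAGACTGGAGACTTTAAGACGGACAATGAGGCACTGAAGAAGTATGTT"
    "CTCTGCATGATGATCAAATCAGAACTGATGACCAAGGATGGGAAATTCAAGAAGGACGTCGCT"
    "CTCGCCAAAGTTCCTAACAGTGCTGACAAGCCACTAGTTGAAAAGGTCATTGACAGCTGTCTG"
    "GCCAACAAGGGCAACACACCGCAACAGACCGCCTGGAATTACGTCAAATGCTACCACGAGAAA"
    "GACCCCAAGCACTCCATCATCGTCTAAAACCACCAAGCCACCTACGAACAGTCTGTTTCAATG"
)


def translate_orf(cdna: str) -> str:
    """Translate from the first ATG up to (not including) the first stop codon."""
    start = cdna.find("ATG")
    residues = []
    for pos in range(start, len(cdna) - 2, 3):
        residue = CODON_TABLE[cdna[pos:pos + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


PROTEIN = translate_orf(CDNA)

# Cys count in the mature protein for each candidate signal peptide length.
CYS_IN_MATURE = {n: PROTEIN[n:].count("C") for n in (14, 15, 16)}

if __name__ == "__main__":
    print(f"{len(PROTEIN)} residues: {PROTEIN}")
    for n, count in CYS_IN_MATURE.items():
        print(f"signal peptide of {n} residues -> {count} Cys in mature protein")
```

Under these assumptions the translation is 133 residues long, and the mature protein contains four Cys residues for every candidate signal peptide length; the fifth Cys of the precursor lies within the putative signal peptide.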
The molecular mass of the mature protein is predicted to be between 13.4 and 13.6 kDa, depending on the exact size of the putative signal peptide. Rather than the typical six conserved Cys residues that all classic OBPs retain, CfumSericotropin-like has only four (Figure 5.2), as do its homologues within the Lepidoptera. CfumSericotropin-like lacks the two Cys residues that normally form one of the conserved disulphide bridges (C2-C5) (Figure 5.2). CfumSericotropin-like and the homologous proteins mentioned above also group together with several other proteins from the Lepidoptera to form a class that is characterized by the lack of the same two Cys residues (Figures 5.2 & 5.3). Two other 4-Cys classes are formed by a large group of haemolymph proteins (TmolTHP12A-F & THP13A-D) from the beetle T. molitor (Graham et al., 2003) and by proteins from the fat bodies of the medfly C. capitata (suborder Brachycera) (Christophides et al., 2000). The lepidopteran and coleopteran 4-Cys classes have been linked by a phylogenetic analysis, but a common root with the dipteran class could not be established (Vogt, 2002; Figure 5.3).

5.2.2 CfumSericotropin-like transcripts are highest during metamorphosis

CfumSericotropin-like is expressed throughout the development of the Eastern spruce budworm, including larval, pupal and adult stages, but least so in the adults (Figure 5.4). CfumSericotropin-like transcripts were detected at higher relative levels during the 5th to 6th instar molt (head capsule slippage and white head capsule stages) as compared to one-day-old 5th or 6th instars (Figures 5.5 & 5.6). The highest levels of expression were detected during the last time points of the last instar, and continued through the prepupal, early pupal and four-day-old pupal stages.
The transcript size was estimated to be between 700 and 800 bases (the variation between the estimates is the result of variation between the replicated blots) and corresponds to the cDNA insert, which was estimated to be 700 bases. CfumSericotropin-like transcripts appeared to be specific to male 6th instar larvae in Figure 5.4, but this trend reversed in Figure 5.5, where transcripts were detected specifically in female 6th instar larvae. A more detailed time course analysis of expression in the last instar revealed asynchronous peaks in expression between male and female larvae. In each case, expression of CfumSericotropin-like peaked at the last larval time point, immediately prior to the prepupal stage (Figure 5.6). However, this peak is at 20 h for males, and at 144 h for females, since male larvae develop more quickly.

CfumSericotropin-like

AACATGAAGACTTTCATCGTATTAGCTATCTGCCTTGTCGCTGCTCAGGCCCTGACAGACGAG
    M  K  T  F  I  V  L  A  I  C  L  V  A  A  Q  A  L  T  D  E
CAGAAAGAAAAACTGAAGAAACACCGCTCAGAATGCCTCACGGAAACCAAGCCTGACCAGCAA
 Q  K  E  K  L  K  K  H  R  S  E  C  L  T  E  T  K  P  D  Q  Q
TTGGTAGAAAAACTTAAGACTGGAGACTTTAAGACGGACAATGAGGCACTGAAGAAGTATGTT
 L  V  E  K  L  K  T  G  D  F  K  T  D  N  E  A  L  K  K  Y  V
CTCTGCATGATGATCAAATCAGAACTGATGACCAAGGATGGGAAATTCAAGAAGGACGTCGCT
 L  C  M  M  I  K  S  E  L  M  T  K  D  G  K  F  K  K  D  V  A
CTCGCCAAAGTTCCTAACAGTGCTGACAAGCCACTAGTTGAAAAGGTCATTGACAGCTGTCTG
 L  A  K  V  P  N  S  A  D  K  P  L  V  E  K  V  I  D  S  C  L
GCCAACAAGGGCAACACACCGCAACAGACCGCCTGGAATTACGTCAAATGCTACCACGAGAAA
 A  N  K  G  N  T  P  Q  Q  T  A  W  N  Y  V  K  C  Y  H  E  K
GACCCCAAGCACTCCATCATCGTCTAAAACCACCAAGCCACCTACGAACAGTCTGTTTCAATG
 D  P  K  H  S  I  I  V  *
ACAACTCTGATCTTCGTTTGTTTCAAGACCGATTCAATTTGAATGCCGATCGTCCTCTGTTCG
TGTTTTTGGATTCTTCGCGATGTCATCTTGTTTAATTTTAAAACCGGCTTGCTTTTTTTTATG
TACCTAATTGCATTTTATTGGCGT

Figure 5.1. Nucleotide sequence and conceptually translated protein sequence of CfumSericotropin-like. Start and stop codons are underlined, as is the putative signal peptide. The four conserved Cys residues are highlighted in red font.

Figure 5.2. A) Protein alignment of 4-Cys OBPs representing the orders Coleoptera, Diptera and Lepidoptera, and B) a subset of classical OBPs with the typical six conserved Cys residues. Cys residues are highlighted in red, and are numbered 1 through 6 in their order of occurrence from the protein N-terminus to the C-terminus. Disulphide bridges are formed between C1-C3, C2-C5 and C4-C6. The 4-Cys OBPs lack the C2-C5 pair. GenBank Accession numbers for sequences not reported herein can be found in Vogt (2002).

[Figure 5.3 phenogram. Group labels: General Odourant Binding Proteins 1 & 2; Pheromone Binding Proteins; Antennal Binding Protein X; Coleoptera & Lepidoptera 4-Cys Odourant Binding Proteins; Brachycera 4-Cys Odourant Binding Proteins. Scale: % distance.]

Figure 5.3. Neighbor joining distance phenogram of the 4-Cys OBPs from the orders Coleoptera, Diptera (suborder Brachycera) and Lepidoptera, and a subset of classical lepidopteran OBPs that retain all six conserved Cys residues. The phenogram is collapsed to nodes with 50% or greater bootstrap support, n = 1000 replicates. Branch lengths are proportional and the scale represents percent sequence distance. GenBank Accession numbers for sequences not reported herein can be found in Vogt (2002).
Under the rearing conditions herein, male larvae began pupating approximately 20 h before female larvae. When only a few time points are sampled, asynchronous development between males and females could appear as sex-specific expression. However, expression levels in Figure 5.6 appear to be higher in male larvae across all time points. Interestingly, while CfumSericotropin-like is expressed in all three segments of both larval and adult stages, it is by far the highest in the heads of larvae and moths as compared to the thorax and abdomen segments (Figure 5.7).

5.2.3 CfumSericotropin-like protein also is most abundant during metamorphosis

Expression of CfumSericotropin-like protein during development, as determined by Western blotting, mirrors the gene expression pattern presented in section 5.2.2. Weak bands are evident during the 5th to 6th instar molt (5th instar HCS), and continue for the first 40 hours of the 6th instar larval stage (Figures 5.8 & 5.9). High levels of protein begin to appear at the last time points of the last instar, and continue through the prepupal and pupal stages (Figures 5.8 & 5.9). In adult insects, however, only a very weak band is visible (Figure 5.8). Clearly, expression of CfumSericotropin-like transcripts peaks immediately prior to prepupation, when high levels of protein are also first detected, and these high protein levels continue through the pupal stages. A second, faint band appears above the strong signals that correspond to the prepupa and green pupa samples (Figure 5.9). This band may represent a related isoform of CfumSericotropin-like, with similar epitopes that are recognized by the polyclonal antibody. At least ten isoforms of the T. molitor haemolymph protein (THP) have been isolated (Graham et al., 2003).

Figure 5.4. Expression of CfumSericotropin-like, determined by Northern blot analysis, at different developmental stages of the Eastern spruce budworm.
Each lane was loaded with 5 µg total RNA. Ribosomal bands are illustrated at the bottom of the figure as an indication of total RNA loaded.

Figure 5.7. Expression of CfumSericotropin-like within the head, thorax and abdomen segments of 6th instar larvae (48-60 h old), and adults (24-72 h old), of the Eastern spruce budworm. Each lane of the Northern blot was loaded with 5 µg of total RNA. Ribosomal bands are illustrated at the bottom of the figure as an indication of total RNA loaded.

Figure 5.8. CfumSericotropin-like protein expression during developmental stages of the Eastern spruce budworm, determined by Western blot analysis. Total protein concentrations were determined using the Bradford assay, and 15 µg of total protein was loaded into each lane; male and female samples were combined. HCS = head capsule slip, WHC = white head capsule.

Figure 5.9. CfumSericotropin-like protein expression at eight-hour intervals during the development of the last instar of the Eastern spruce budworm (male and female panels), determined by Western blot analysis. Males began pupating after 120 h; therefore there are no male samples at the 128 h and 144 h time points, and male prepupal samples were not available for analysis. Total protein concentrations were determined using the Bradford assay, and 15 µg of total protein was loaded into each lane. HCS = head capsule slip, WHC = white head capsule.

5.3 Discussion

The expression of CfumSericotropin-like transcripts and protein is consistent with a function in pupation and metamorphosis, and to a lesser degree, larval molting. CfumSericotropin-like is expressed throughout the development of the ESB, but the highest levels were observed in pupae. Transcript and protein levels peaked at the latest larval time points, and continued through the prepupal and pupal stages.
A transient, less intense peak in gene and protein expression was also observed during the 5th to 6th instar molt. These patterns of gene and protein expression are inconsistent with the sensory function that most OBPs are thought to have, and resemble those of some of the CSPs characterized in Chapters 3 and 4. CfumAY426538 was expressed throughout development, peaking in late 6th instar larvae and in pupae, but was not detected in the adult stages; and CfumAY426539 and CfumAY701858 were up-regulated during the 5th to 6th instar molt (Chapter 3). Therefore, both CSPs and OBPs can be expressed in similar patterns that are inconsistent with a sensory function, but consistent with a role in development.

CfumSericotropin-like is a member of a larger group of proteins from the Coleoptera, Diptera and Lepidoptera that have been termed 4-Cys OBPs since they lack two of the typically six conserved Cys residues (Figures 5.2 & 5.3). The coleopteran and lepidopteran 4-Cys proteins have a common phylogenetic root (Vogt, 2002), which cannot, as yet, be linked to that of the dipteran 4-Cys group. The coleopteran 4-Cys proteins are represented by TmolB1 and TmolB2, isolated from male tubular accessory sex glands, where they are secreted into the seminal fluid (Paesen and Happ, 1995), and by as many as 10 different isoforms of T. molitor haemolymph protein (THP) (Graham et al., 2001; 2003). TmolTHPs were detected in the haemolymph of all developmental stages of both sexes, but were most abundant in larval and adult stages, comprising as much as 5% of the total protein (Graham et al., 2001). Similarly, a group of 4-Cys OBP genes was identified from C. capitata (dipteran 4-Cys subgroup), some of which were expressed specifically in male fat bodies, from which they are presumably secreted into the haemolymph (Christophides et al., 2000).
Another group of OBPs is expressed in female mosquito salivary glands, where they are secreted into the saliva that is injected into the host during blood feeding. Four have been identified from A. gambiae (Arcà et al., 2002), and nine from Anopheles darlingi (Calvo et al., 2002; Calvo et al., 2004), where they have been termed D7 related protein 1-n (D7RP1-n), named after the founding member, D7, cloned from Aedes aegypti (James et al., 1991). This group is distinct; it lacks the same two Cys residues that the 4-Cys OBPs lack, but as a group, they also have two additional Cys residues, one on each end of the protein, that other OBPs do not have (Arcà et al., 2002). It is not known whether these additional Cys residues form a disulphide bond. TmolB1 and TmolB2, and the mosquito D7RPs, are thought to bind hydrophobic ligands in secreted fluids, keeping them soluble. Chemicals that are active physiologically can be transferred by insect seminal fluid during mating (Paesen and Happ, 1995) and by mosquito saliva during blood feeding (Arcà et al., 2002).

Both the CSP and OBP gene families appear to have evolved into functions that require soluble, extracellular transport proteins that occur in multiple isoforms. The isoforms may share a common overall function that also requires variable binding specificity. In the peripheral sensory system, the need for diverse pattern recognition is clear, given the molecular diversity of odorants. The fact that as many as 10 different OBPs have been isolated from mosquito saliva and beetle haemolymph raises a question: what physiological processes require this degree of pattern recognition? In this thesis, three CSP genes were expressed in patterns consistent with a role in development; one was down-regulated, and two up-regulated, during a molt, in temporal patterns that did not overlap. Thus, we might suspect that each binds a different class of ligand that is present at specific developmental time points.
While some similarities exist, the 4-Cys OBPs appear to have greater specificity than the CSPs in terms of tissue- and sex-specific expression patterns. All four CSP genes characterized in this thesis were expressed generally in the head, thorax and abdomen segments of the ESB. By comparison, CfumSericotropin-like was almost exclusively expressed in the head segment of both larvae and adults. In fact, three of the four lepidopteran proteins, which were isolated from brains, form a homologous subgroup (Figure 5.3; BmorAV400108, GmelSericotropin and PxylSericotropin-like). DmelOBP99B transcripts were localized to the fat cells surrounding male fly brains, and expression was dependent on the known sex determination pathway genes transformer and doublesex (Fujii and Amrein, 2002). Interestingly, DmelOBP99B protein was also isolated from larval haemolymph (Vierstraete et al., 2003). Sericotropin-like proteins may transport bioactive ligands, including some involved in sex-specific physiology, from the brain region to the rest of the body.

The disulphide bridge structure of CSPs (two disulphide bonds that form α-α loops) is thought to make their binding pocket (and specificity) more flexible as compared to that of the classical OBPs (which have three disulphide bonds that connect the α-helices) (Tegoni et al., 2004). Noteworthy is the fact that 4-Cys OBPs, along with CSPs, are characterized by broad non-sensory expression patterns. However, to date, in vitro binding assays with 4-Cys OBPs have not been reported; therefore the binding specificity of CSPs, 4-Cys OBPs and classic OBPs cannot be compared.

5.4 Materials and Methods

5.4.1 Insect rearing, sampling and staging: As described in section 3.4.1, Chapter 3.

5.4.2 RNA isolation and Northern blotting: As described in section 3.4.4, Chapter 3.

5.4.3 CfumSericotropin-like expression and purification: As described in section 4.4.1, Chapter 4.
A 498 bp fragment containing the ORF of CfumSericotropin-like was amplified from a cDNA clone using forward and reverse primers. The forward primer contained the NcoI restriction site and the reverse primer contained the HindIII restriction site. CfumSericotropin-like was electroeluted from the insoluble pellet fraction that was resolved by SDS-PAGE, in addition to native protein from the soluble fraction, where the yield was relatively lower.

5.4.4 Polyclonal antibody preparation: As described in section 4.4.2, Chapter 4.

5.4.5 Western blot analysis: As described in section 4.4.3, Chapter 4.

5.5 References

Arcà B., Lombardo F., Lanfrancotti A., Spanos L., Veneri M., Louis C., and Coluzzi M. (2002). A cluster of four D7-related genes is expressed in the salivary glands of the African malaria vector Anopheles gambiae. Insect Mol Biol 11: 47-55.
Ban L., Scaloni A., D'Ambrosio C., Zhang L., Yan Y., and Pelosi P. (2003). Biochemical characterization and bacterial expression of an odorant-binding protein from Locusta migratoria. Cell Mol Life Sci 60: 390-400.
Calvo E., Andersen J., Francischetti I. M., de L. C. M., deBianchi A. G., James A. A., Ribeiro J. M., and Marinotti O. (2004). The transcriptome of adult female Anopheles darlingi salivary glands. Insect Mol Biol 13: 73-88.
Calvo E., deBianchi A. G., James A. A., and Marinotti O. (2002). The major acid soluble proteins of adult female Anopheles darlingi salivary glands include a member of the D7-related family of proteins. Insect Biochem Mol Biol 32: 1419-27.
Campanacci V., Krieger J., Bette S., Sturgis J. N., Lartigue A., Cambillau C., Breer H., and Tegoni M. (2001). Revisiting the specificity of Mamestra brassicae and Antheraea polyphemus pheromone-binding proteins with a fluorescence binding assay. J Biol Chem 276: 20078-84.
Christophides G. K., Mintzas A. C., and Komitopoulou K. (2000).
Organization, evolution and expression of a multigene family encoding putative members of the odourant binding protein family in the medfly Ceratitis capitata. Insect Mol Biol 9: 185-95.
Du G., and Prestwich G. D. (1995). Protein structure encodes the ligand binding specificity in pheromone binding proteins. Biochemistry 34: 8726-32.
Fujii S., and Amrein H. (2002). Genes expressed in the Drosophila head reveal a role for fat cells in sex-specific physiology. EMBO J 21: 5353-63.
Galindo K., and Smith D. P. (2001). A large family of divergent Drosophila odorant-binding proteins expressed in gustatory and olfactory sensilla. Genetics 159: 1059-72.
Graham L. A., Brewer D., Lajoie G., and Davies P. L. (2003). Characterization of a subfamily of beetle odorant-binding proteins found in hemolymph. Mol Cell Proteomics 2: 541-9.
Graham L. A., Tang W., Baust J. G., Liou Y. C., Reid T. S., and Davies P. L. (2001). Characterization and cloning of a Tenebrio molitor hemolymph protein with sequence similarity to insect odorant-binding proteins. Insect Biochem Mol Biol 31: 691-702.
Ishida Y., Chiang V. P., Haverty M. I., and Leal W. S. (2002). Odorant-binding proteins from a primitive termite. J Chem Ecol 28: 1887-93.
James A. A., Blackmer K., Marinotti O., Ghosn C. R., and Racioppi J. V. (1991). Isolation and characterization of the gene expressing the major salivary gland protein of the female mosquito, Aedes aegypti. Mol Biochem Parasitol 44: 245-53.
Kim M. S., Repp A., and Smith D. P. (1998). LUSH odorant-binding protein mediates chemosensory responses to alcohols in Drosophila melanogaster. Genetics 150: 711-21.
Kim M. S., and Smith D. P. (2001). The invertebrate odorant-binding protein LUSH is required for normal olfactory behavior in Drosophila. Chem Senses 26: 195-9.
Krieger M. J., and Ross K. G. (2002). Identification of a major gene regulating complex social behavior. Science 295: 328-32.
Kruse S. W., Zhao R., Smith D. P., and Jones D. N. (2003).
Structure of a specific alcohol-binding site defined by the odorant binding protein LUSH from Drosophila melanogaster. Nat Struct Biol 10: 694-700.
Maibeche-Coisne M., Sobrio F., Delaunay T., Lettere M., Dubroca J., Jacquin-Joly E., and Nagnan-Lemeillour P. (1997). Pheromone binding proteins of the moth Mamestra brassicae: specificity of ligand binding. Insect Biochem Mol Biol 27: 213-221.
Paesen G. C., and Happ G. M. (1995). The B proteins secreted by the tubular accessory sex glands of the male mealworm beetle, Tenebrio molitor, have sequence similarity to moth pheromone-binding proteins. Insect Biochem Mol Biol 25: 401-8.
Plettner E., Lazar J., Prestwich E. G., and Prestwich G. D. (2000). Discrimination of pheromone enantiomers by two pheromone binding proteins from the gypsy moth Lymantria dispar. Biochemistry 39: 8953-62.
Riviere S., Lartigue A., Quennedey B., Campanacci V., Farine J. P., Tegoni M., Cambillau C., and Brossut R. (2003). A pheromone-binding protein from the cockroach Leucophaea maderae: cloning, expression and pheromone binding. Biochem J 371: 573-9.
Shanbhag S. R., Hekmat-Scafe D., Kim M. S., Park S. K., Carlson J. R., Pikielny C., Smith D. P., and Steinbrecht R. A. (2001). Expression mosaic of odorant-binding proteins in Drosophila olfactory organs. Microsc Res Tech 55: 297-306.
Steinbrecht R. A., Ozaki M., and Ziegelberger G. (1992). Immunocytochemical localization of pheromone-binding protein in moth antennae. Cell Tissue Res 270: 287-302.
Tegoni M., Campanacci V., and Cambillau C. (2004). Structural aspects of sexual attraction and chemical communication in insects. Trends Biochem Sci 29: 257-64.
Vierstraete E., Cerstiaens A., Baggerman G., Van den Bergh G., De Loof A., and Schoofs L. (2003). Proteomics in Drosophila melanogaster: first 2D database of larval hemolymph proteins. Biochem Biophys Res Commun 304: 831-8.
Vogt R. G. (2002).
Odorant binding protein homologues of the malaria mosquito Anopheles gambiae; possible orthologues of the OS-E and OS-F OBPs of Drosophila melanogaster. J Chem Ecol 28: 2371-6.
Vogt R. G., and Riddiford L. M. (1981). Pheromone binding and inactivation by moth antennae. Nature 293: 161-163.
Vogt R. G., Rogers M. E., Franco M. D., and Sun M. (2002). A comparative study of odorant binding protein genes: differential expression of the PBP1-GOBP2 gene cluster in Manduca sexta (Lepidoptera) and the organization of OBP genes in Drosophila melanogaster (Diptera). J Exp Biol 205: 719-44.
Vogt R. G., Prestwich G. D., and Lerner M. R. (1991). Odorant-binding-protein subfamilies associate with distinct classes of olfactory receptor neurons in insects. J Neurobiol 22: 74-84.

CHAPTER SIX
Summary

Insect odorant binding proteins are characterized by tissue-, sex- and development-specific expression patterns consistent with their ligand transport function in the peripheral sensory system. Pheromone binding proteins, first discovered by Vogt and Riddiford (1981), are the most studied example; they are expressed in pheromone-sensitive sensilla located on the antennae of the adult stage, specific to the sex that responds to the pheromone produced by the opposite sex. Chemosensory protein genes were first cloned by McKenna et al. (1994), thirteen years after the discovery of OBPs. Soon after their discovery, CSPs were isolated from orthopteran species, and Angeli et al. (1999) demonstrated that they were localized specifically to the lymph that surrounds sensory neurons housed in the contact chemosensilla of S. gregaria. Similarly, a phasmid CSP was localized to the lymph surrounding olfactory neurons (Monteforti et al., 2002).
An in vitro ability to bind hydrophobic ligands, including pheromones (Bohbot et al., 1998), combined with the 3-D crystal structure of MbraCSPA6 (Lartigue et al., 2002), which revealed a globular, helical protein with an internal hydrophobic pocket, pointed towards a function analogous to that of the OBPs. At the same time, however, several characteristics that distinguished CSPs from OBPs were noted, and these were not consistent with a classical sensory function related to olfactory and gustatory neurons. First and foremost, it became clear that CSPs were abundantly expressed in tissue devoid of sensory neurons. Second, CSPs representing insect orders that diverged more than 300 million years ago share at least 40% amino acid identity, while OBPs from the same species share as little as 10% identity (Vogt et al., 2002).

The fact that OBPs form a large and highly divergent protein family is thought to be an adaptation required to discriminate between thousands of structurally diverse chemicals. I found that several highly conserved amino acid motifs could account for a minimal amino acid identity of approximately 40%. It is important to note, however, that variation among the remaining 60% of the residues could be sufficient to create a large diversity of ligand binding pockets. Rather than focusing on the fact that OBP sequences tend to be more diverse, I feel it is more productive to focus on the reasons why CSPs have conserved several amino acid motifs for more than 300 million years. OBP sequences do not have any such conserved motifs. The conserved domains suggest that a functional constraint has persisted during the evolution of diverse insects, from the ancestral orthopteroids to the more modern neopterans. While it is possible that some sensory functions have been conserved, it is more difficult to reconcile this with the fact that these proteins are also expressed broadly in non-sensory tissues.
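The idea that conserved motifs set a floor on pairwise identity can be illustrated with a small calculation. The following is a minimal sketch in Python, not the procedure used in this thesis: it computes percent identity between two pre-aligned sequences and the minimum identity guaranteed when every column inside the motif spans is identical. The function names, the two 20-residue toy sequences and the motif coordinates are hypothetical values chosen for illustration; they are not CSP sequences from this work.

```python
from __future__ import annotations


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def motif_floor(alignment_length: int, motif_spans: list[tuple[int, int]]) -> float:
    """Minimum percent identity if every column inside the motif spans is identical.

    Spans are half-open (start, end) column indices and must not overlap.
    """
    conserved = sum(end - start for start, end in motif_spans)
    return 100.0 * conserved / alignment_length


# Hypothetical toy alignment (20 columns); NOT real CSP data.
SEQ_A = "MKYDPAAGLTDEILQWERTY"
SEQ_B = "MKYDPSSCVTDEILHNMKLY"
# Hypothetical motif columns: "KYDP" at 1-4 and "DEIL" at 10-13.
MOTIF_SPANS = [(1, 5), (10, 14)]

if __name__ == "__main__":
    print(f"identity floor from motifs: {motif_floor(len(SEQ_A), MOTIF_SPANS):.1f}%")
    print(f"observed identity: {percent_identity(SEQ_A, SEQ_B):.1f}%")
```

In this toy case the two motifs cover 8 of 20 columns, so any pair of sequences that retains them is at least 40% identical regardless of how freely the other 12 columns vary; the observed identity is higher only because a few non-motif columns also happen to match.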
The functional constraints that maintain the conserved domains must be common to both sensillum and non-sensillum tissue; although it is clear that CSPs are secreted into the sensillum lymph in at least two orthopteroid species, their function is likely quite distinct from that of OBPs.

A detailed review of the current literature, and experimental results presented within this thesis, clearly indicate that in addition to most CSPs, some OBPs are also broadly expressed in non-sensory tissue. Both are multigene families, and in many cases several CSPs or OBPs are expressed in the same tissue. This implies that some level of pattern recognition may be required in those tissues, raising the question: what physiological functions (other than the classical olfactory and gustatory senses) require the transport of hydrophobic chemicals, including the ability to recognize and discriminate between different ligands? Four possible functions include:

A) Transporting hydrophobic ligands in the haemolymph. This could include a role in scavenging toxic plant metabolites or transporting physiologically active ligands such as hormones.

B) Transporting physiologically active ligands in secreted fluids. These could include chemicals that modulate a host response (such as immune suppressors in female mosquito saliva) or act in mating behavior (hormones transferred in seminal fluid).

C) Microbial pattern recognition in insect innate immunity. Lipids, including the Lipid A component of bacterial lipopolysaccharide, are potent elicitors of the immune response. In fact, the activity of the Lipid A component depends upon its structure, which varies between microbial species; different Lipid A structures are recognized by different receptors. CSPs and OBPs in the haemolymph may transport microbial lipids to membrane-bound receptors that then signal specific immune responses.

D) Transport of hydrophobic ligands involved in development.
I found that three CSP genes and one OBP from the ESB are expressed in patterns that suggest a role in development, including larval molting and pupal metamorphosis. Ligands could include physiologically active compounds that regulate development, or the actual chemicals used in tissue reconstruction (such as cuticle synthesis during larval molting).

There are several avenues of research that will help to elucidate the non-sensory function of CSPs and OBPs. First, continuing to characterize their non-sensory expression pattern will provide clues as to their function. Second, endogenous ligands need to be identified, so that the in vivo function of CSPs can be related to in vitro ligand binding assays. This may require the use and/or development of new approaches and experimental tools. Third, productive insights into the function of CSPs will be gained by studying the role of the highly conserved amino acid motifs. These motifs may interact with other highly conserved proteins. A CSP from D. melanogaster interacted, with high confidence, with a member of a highly conserved family of transcription factors (Giot et al., 2003). This type of interaction could explain the conservation of several amino acid motifs during 300 million years of evolutionary diversification.

It is now becoming clear that CSPs and some OBPs can be expressed both in the sensillum lymph and more broadly in tissues that do not have olfactory and gustatory neurons. Determining the link between their sensory and non-sensory functions is an interesting problem that remains to be solved.

References

Angeli S., Ceron F., Scaloni A., Monti M., Monteforti G., Minnocci A., Petacchi R., and Pelosi P. (1999). Purification, structural characterization, cloning and immunocytochemical localization of chemoreception proteins from Schistocerca gregaria. Eur J Biochem 262: 745-54.
Bohbot J., Sobrio F., Lucas P., and Nagnan-Le Meillour P. (1998). Functional characterization of a new class of odorant-binding proteins in the moth Mamestra brassicae. Biochem Biophys Res Commun 253: 489-94.
Giot L., Bader J. S., Brouwer C., Chaudhuri A., Kuang B., Li Y., Hao Y. L., Ooi C. E., Godwin B., Vitols E., Vijayadamodar G., Pochart P., Machineni H., Welsh M., Kong Y., Zerhusen B., Malcolm R., Varrone Z., Collis A., Minto M., Burgess S., McDaniel L., Stimpson E., Spriggs F., Williams J., Neurath K., Ioime N., Agee M., Voss E., Furtak K., Renzulli R., Aanensen N., Carrolla S., Bickelhaupt E., Lazovatsky Y., DaSilva A., Zhong J., Stanyon C. A., Finley R. L., Jr., White K. P., Braverman M., Jarvie T., Gold S., Leach M., Knight J., Shimkets R. A., McKenna M. P., Chant J., and Rothberg J. M. (2003). A protein interaction map of Drosophila melanogaster. Science 302: 1727-36.
Lartigue A., Campanacci V., Roussel A., Larsson A. M., Jones T. A., Tegoni M., and Cambillau C. (2002). X-ray structure and ligand binding study of a moth chemosensory protein. J Biol Chem 277: 32094-8.
McKenna M. P., Hekmat-Scafe D. S., Gaines P., and Carlson J. R. (1994). Putative Drosophila pheromone-binding proteins expressed in a subregion of the olfactory system. J Biol Chem 269: 16340-7.
Monteforti G., Angeli S., Petacchi R., and Minnocci A. (2002). Ultrastructural characterization of antennal sensilla and immunocytochemical localization of a chemosensory protein in Carausius morosus Brunner (Phasmida: Phasmatidae). Arthropod Structure & Development 30: 195-205.
Vogt R. G. and Riddiford L. M. (1981). Pheromone binding and inactivation by moth antennae. Nature 293: 161-163.
Vogt R. G., Rogers M. E., Franco M. D., and Sun M. (2002). A comparative study of odorant binding protein genes: differential expression of the PBP1-GOBP2 gene cluster in Manduca sexta (Lepidoptera) and the organization of OBP genes in Drosophila melanogaster (Diptera). J Exp Biol 205: 719-44.
Co-Authorship Statement

As the principal investigator, I designed and executed all of the experiments contained within this thesis that I wrote, with contributions from co-authors as listed below:

1. Dr. Qili Feng was a collaborator who provided four cDNA clones from a C. fumiferana larval cDNA library that he constructed. Staged insects used in some Northern blot analyses were collected while I visited Dr. Feng's lab.
2. Les Willis operated the stand-alone version of tblastn used to identify CSP sequences from genome trace files (section 2.4.3).
3. Use Hutchkins contributed equally to a joint effort to use RNAi to inhibit CAT activity in Sf-9 cells (section 4.2.2).
4. Dr. Erika Plettner, a committee member, provided laboratory space, equipment, materials, and supervision towards the protein purification and fluorescent binding studies. Dr. Plettner critically reviewed my manuscripts and thesis.
5. Dr. Murray Isman was my academic supervisor, and Dr. David Theilmann was my research supervisor. They assisted in the design and interpretation of experiments as well as manuscript and thesis preparation.