UBC Theses and Dissertations

The Acyl-CoA ligase-like (ACLL) gene family in Arabidopsis and poplar Souza, Clarice de Azevedo 2007-12-31

THE ACYL-COA LIGASE-LIKE (ACLL) GENE FAMILY IN ARABIDOPSIS AND POPLAR

by

CLARICE DE AZEVEDO SOUZA

B.Sc., Universidade de Sao Paulo, Sao Paulo - Brazil, 2001

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES (Botany)

THE UNIVERSITY OF BRITISH COLUMBIA

August, 2007

© Clarice de Azevedo Souza, 2007

ABSTRACT

Many genes of unknown function have been annotated in plant genome projects, and many of these may encode undiscovered enzymes. For example, completion of the Arabidopsis thaliana genome sequence revealed large families of phenylpropanoid-like enzymes of unknown function. Using an in silico similarity search based on the amino acid sequences encoded by known Arabidopsis 4-coumarate:CoA ligase (4CL) genes, I identified nine putative genes as members of the Arabidopsis acyl-CoA ligase-like (ACLL) gene superfamily, which encodes a plant-specific clade of enzymes closely related to true 4CLs. I also identified all ACLLs in the fully sequenced poplar and rice genomes. Phylogenetic analysis of amino acid sequences revealed five ACLL clades, each containing at least one ACLL member from each species, suggesting conserved biochemical functions for ACLL enzymes. In four of the five clades, most of the ACLL representatives have the PTS1 peroxisomal target sequence, indicating a likely function in that organelle. I established tissue expression profiles and the wound and herbivory responsiveness of Arabidopsis and poplar ACLL genes, which revealed similar expression patterns for potentially orthologous genes. Finally, I mined publicly available microarray databases for co-expressed Arabidopsis genes, and these data provide clues to potential ACLL biochemical functions.
The only non-peroxisomal clade is the one most closely related to true 4CLs and contains a single-copy gene in Arabidopsis (ACLL5) and poplar (ACLL13). These genes are preferentially expressed in flowers and anthers and, because of the apparent conservation in sequence and expression, were chosen for functional analysis. ACLL5 is transiently expressed in tapetum cells just prior to release of microspores from tetrads, suggesting a role in pollen wall and/or sporopollenin formation. In support of this, an acll5 transposon insertion mutant is male sterile and fails to produce pollen grains. These data suggest that ACLL5 and similar enzymes from other species produce CoA ester intermediates used in an unknown pathway required for pollen wall formation. In silico co-expression analysis in Arabidopsis has revealed other potential members of this pathway, also conserved across angiosperms. This work highlights the utility of the Arabidopsis model system in the discovery of genes in other plant species with genome sequence information.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ACKNOWLEDGEMENTS
CO-AUTHORSHIP STATEMENT

CHAPTER 1 - INTRODUCTION
  1.1 NATURAL PRODUCT DIVERSITY AND PLANT GENOMICS
    1.1.1 FULLY SEQUENCED PLANT GENOMES AND LESSONS LEARNED FROM THEM
  1.2 PHENYLPROPANOID METABOLISM AND ADENYLATE-FORMING ENZYMES
    1.2.1 PHENYLPROPANOID METABOLISM
    1.2.2 ADENYLATE-FORMING ENZYMES
    1.2.3 4CL GENE FAMILIES
    1.2.4 4CL SUBFUNCTIONALIZATION
    1.2.5 IDENTIFICATION OF 4CL-LIKE GENES (4CLLS) AND PHYLOGENETIC RELATIONSHIPS TO OTHER ADENYLATE-FORMING ENZYMES
    1.2.6 4CL PROTEIN STRUCTURE
  1.3 THE ACLLs AND THESIS OBJECTIVES

CHAPTER 2 - MATERIAL AND METHODS
  2.1 NUCLEIC ACID METHODS
    2.1.1 GENOMIC DNA AND TOTAL RNA ISOLATION
    2.1.2 DNA EXTRACTION FROM AGAROSE GELS
    2.1.3 PLASMID DNA PREPARATION AND SEQUENCING
    2.1.4 SEQUENCE ALIGNMENT AND EDITING
    2.1.5 REVERSE TRANSCRIPTION FOR cDNA SYNTHESIS
    2.1.6 GENERAL RECOMBINANT DNA METHODS
  2.2 PLANT GROWTH AND MAINTENANCE
    2.2.1 SEED HARVESTING AND SOWING
    2.2.2 PLANT GROWTH CONDITIONS
    2.2.3 MECHANICAL WOUND AND HERBIVORY
  2.3 GENERAL TRANSFORMATION PROCEDURES
    2.3.1 BACTERIA TRANSFORMATION
    2.3.2 ARABIDOPSIS TRANSFORMATION
    2.3.3 TOBACCO TRANSFORMATION (DONE BY K. TURNER, BC INSTITUTE OF TECHNOLOGY)
  2.4 GENERAL BIOINFORMATICS PROCEDURES
    2.4.1 SEQUENCE SELECTION AND PHYLOGENETIC TREE CONSTRUCTION
    2.4.2 IDENTIFICATION OF CIS-ACTING PROMOTER ELEMENTS
    2.4.3 SEARCH OF CO-REGULATED GENES IN PUBLIC ARABIDOPSIS MICROARRAY DATABASE
  2.6 GENE EXPRESSION ANALYSIS
    2.6.1 QUANTITATIVE REAL-TIME PCR
    2.6.2 SEMI-QUANTITATIVE RT-PCR
    2.6.3 GUS HISTOCHEMICAL ASSAY
  2.7 SUB-CELLULAR LOCALIZATION OF ARATHACLL4 AND POPTRACLL5
  2.8 IDENTIFICATION AND CHARACTERIZATION OF AN ACLL5 INSERTION MUTANT
    2.8.1 GENETIC METHODS
    2.8.2 PHENOTYPIC ANALYSIS OF THE ACLL5-1 MUTANT
    2.8.3 IN SITU HYBRIDIZATION (EXPERIMENT PERFORMED BY S. MCKIM, UBC)

CHAPTER 3 - GENOME-WIDE PHYLOGENETIC ANALYSIS AND COMPARATIVE GENOMICS OF THE PLANT-SPECIFIC ACYL:COENZYME A LIGASE-LIKE (ACLL) GENE FAMILY IN ARABIDOPSIS AND POPLAR
  3.1 INTRODUCTION
  3.2 RESULTS
    3.2.1 PHYLOGENETIC ANALYSIS OF 100 ACLLS FROM PLANTS AND MICROORGANISMS
    3.2.2 MOST ACLLS CONTAIN THE PTS1 (PEROXISOMAL TARGET SIGNAL 1)
    3.2.3 SPECIES-SPECIFIC ACLL GENE FAMILY EVOLUTION
    3.2.4 COMPARATIVE ANALYSIS OF ARABIDOPSIS AND POPLAR 4CL AND ACLL PROTEINS
    3.2.5 COMPARATIVE EXPRESSION ANALYSIS OF ARABIDOPSIS AND POPLAR GENES
    3.2.7 IN SILICO CO-EXPRESSION ANALYSIS OF ARABIDOPSIS ACLL GENES
    3.2.8 POPLAR GENES ACTIVATED BY ADDITIONAL STRESS TREATMENTS
    3.2.9 SUB-CELLULAR LOCALIZATION OF POPTRACLL5
  3.3 DISCUSSION
    3.3.1 4CL AND ACLL GENE EVOLUTION
    3.3.2 ACLL GENE FAMILY STRUCTURE AND EXPRESSION PATTERNS
    3.3.3 SUMMARY AND COMBINING DATA TO MAKE FUNCTIONAL INFERENCES

CHAPTER 4 - THE ARABIDOPSIS THALIANA FLOWER-SPECIFIC ACYL COENZYME A LIGASE GENE ACLL5 IS CONSERVED IN ANGIOSPERMS AND IS REQUIRED FOR MALE FERTILITY
  4.1 INTRODUCTION
  4.2 RESULTS
    4.2.1 IDENTIFICATION OF AN ACLL5 INSERTION MUTANT
    4.2.2 ACLL5 MUTATION IS CORRELATED WITH MALE STERILITY AND ABSENCE OF POLLEN GRAINS
    4.2.3 GENETIC CHARACTERIZATION OF THE ACLL5 MUTANT
    4.2.4 PHENOTYPIC ANALYSIS OF ANTHER DEVELOPMENT IN THE ACLL5-1 MUTANT
    4.2.5 IN SITU HYBRIDIZATION ANALYSIS OF ACLL5 EXPRESSION IN DEVELOPING ANTHERS
    4.2.6 CO-EXPRESSION ANALYSIS OF ACLL5 IN ARABIDOPSIS
  4.3 DISCUSSION

CHAPTER 5 - CONCLUSIONS AND FUTURE DIRECTIONS
  5.1 A TIMELINE OF DISCOVERIES
  5.2 FUTURE DIRECTIONS
    5.2.1 MUTANT ANALYSIS
    5.2.2 ACLL5 MUTANT COMPLEMENTATION
    5.2.3 MUTANT STUDIES IN POPLAR AND OTHER PLANT SPECIES
    5.2.4 BIOCHEMICAL CHARACTERIZATION POPLAR CLADE D ACLLs
    5.2.5 4CL/ACLL STRUCTURAL INFORMATION AND IDENTIFICATION OF SUBSTRATES
    5.2.6 CONTINUOUS MINING OF DATA
  5.3 FINAL REMARKS

REFERENCES
APPENDIX 1

LIST OF TABLES

TABLE 1.1: KINETIC PROPERTIES OF RECOMBINANT ARABIDOPSIS 4CLS
TABLE 2.1: PRIMERS USED FOR ACLL PROMOTER AMPLIFICATION AND CLONING
TABLE 2.2: PRIMERS USED FOR GENERATING GFP FUSIONS OF ARATHACLL4 AND POPTRACLL5
TABLE 2.3: PRIMERS USED FOR QUANTITATIVE AND SEMI-QUANTITATIVE RT-PCR
TABLE 2.4: PRIMERS USED FOR GENOTYPING ACLL5 TRANSPOSON INSERTION LINES
TABLE 3.1: ANNOTATION OF POPULUS TRICHOCARPA AND ORYZA SATIVA ACLL GENES
TABLE 3.2: AMINO ACID IDENTITY COMPARISON OF FULL-LENGTH AMINO ACID SEQUENCES OF 4CLS AND ACLLS IN THE DIFFERENT CLADES
TABLE 4.1: CONSENSUS CIS ELEMENT MATCHES IN ACLL5 AND CO-EXPRESSED PHENYLPROPANOID-LIKE GENE PROMOTER REGIONS

LIST OF FIGURES

FIGURE 1.1: SCHEMATIC VIEW OF THE RELATIONSHIP OF PHENYLPROPANOID METABOLISM TO PRIMARY METABOLISM
FIGURE 1.2: A SIMPLIFIED SCHEME OF THE PHENYLPROPANOID PATHWAY
FIGURE 1.3: THE TWO-STEP MECHANISM OF THE COA-LIGASE REACTION
FIGURE 2.1: PCAMBIA VECTOR 1305.1 USED FOR PROMOTER::GUS CLONING
FIGURE 3.1: PHYLOGENETIC TREE OF 100 ACLLS FROM VARIOUS ORGANISMS
FIGURE 3.2: PHYLOGENETIC RELATIONSHIP OF PLANT-SPECIFIC ACLLS FROM ARABIDOPSIS, POPLAR AND RICE
FIGURE 3.3: SCHEMATIC REPRESENTATION OF THE ARABIDOPSIS THALIANA GENOME, SHOWING THE LOCATION OF ACLL GENES WITH THE RESPECTIVE CLADES
FIGURE 3.4: TISSUE EXPRESSION PROFILE OF ACLLS IN ARABIDOPSIS AND POPLAR
FIGURE 3.5: EFFECT OF WOUND STRESS ON PEROXISOMAL ACLLS
FIGURE 3.6: PAJEK CO-EXPRESSION NETWORKS GENERATED FROM PRIME CORRELATED GENE SEARCH TOOL DATA (HTTP://PRIME.PSC.RIKEN.JP). (A) CO-EXPRESSION NETWORK OF ARATHACLL3. (B) CO-EXPRESSION NETWORK OF ARATHACLL4
FIGURE 3.7: REAL-TIME PCR DATA SHOWED EXPRESSION OF POPLAR ACLLS AFTER SIMULATED HERBIVORY, HERBIVORY BY THE FOREST TENT CATERPILLAR (MALACOSOMA DISSTRIA), AND EXPOSURE TO MEJA
FIGURE 3.8: CONFOCAL MICROSCOPY IMAGE SHOWING SUB-CELLULAR LOCALIZATION OF ARATHACLL4 (OPDA/OPC8-COA LIGASE) AND THE POPLAR HOMOLOGUE POPTRACLL5 IN GUARD CELLS OF TRANSGENIC TOBACCO LINES
FIGURE 4.1: (A) SCHEMATIC REPRESENTATION OF THE ACLL5 (AT1G62940) GENE, SHOWING THE LOCATION OF THE TRANSPOSON INSERTION. (B) ACLL5 EXPRESSION IN WILD-TYPE (WT) AND ACLL5-1 HOMOZYGOTE LINES
FIGURE 4.2: PHENOTYPIC CHARACTERIZATION ACLL5-1 HOMOZYGOUS PLANTS
FIGURE 4.3: SEM OF WILD TYPE (WT) AND HOMOZYGOUS ACLL5-1 ANTHERS
FIGURE 4.4: ANTHER CROSS-SECTIONS (1 µm) OF WILD TYPE AND HOMOZYGOUS ACLL5-1 MUTANT. DEVELOPMENTAL STAGES ARE ACCORDING TO SANDERS ET AL. 1999
FIGURE 4.5: IN SITU HYBRIDIZATION ANALYSIS IN DEVELOPING WILD-TYPE FLOWERS
FIGURE 4.6: (A) SELECTED GENES WITH HIGH CO-EXPRESSION COEFFICIENTS WITH ACLL5/AT1G62940 INVOLVED IN LIPID AND PHENYLPROPANOID-LIKE METABOLISM. (B) PUTATIVE DUPLICATED PHENYLPROPANOID PATHWAY
FIGURE 4.7: PHYLOGENETIC ANALYSES OF PROTEIN SEQUENCES OF ARABIDOPSIS, POPLAR, RICE, PHYSCOMITRELLA AND OTHER SPECIES GENES EXPRESSED DURING MICROSPORE DEVELOPMENT. (A) CHALCONE SYNTHASE (CHS) AND CHALCONE SYNTHASE-LIKES (CHSL). (B) DIHYDROFLAVONOL REDUCTASE (DFR) AND DIHYDROFLAVONOL REDUCTASE-LIKE FAMILIES. (C) 4-COUMARATE-COA LIGASE (4CL) AND ACYL-COA LIGASE 5 AND HOMOLOGUES (ACLL)

LIST OF ABBREVIATIONS

4CL      4-Coumarate:CoA Ligase
AAE      Acyl-Activating Enzyme
ACLL     Acyl-CoA Ligase Like
AOC      Allene Oxide Cyclase
AOS      Allene Oxide Synthase
C3H      Coumaroyl-Shikimate 3'-Hydroxylase
C4H      Cinnamate 4-Hydroxylase
CAD      Cinnamyl Alcohol:NADP+ Dehydrogenase
CCOMT    Caffeoyl-CoA O-Methyltransferase Enzyme
CCR      Cinnamyl-CoA Reductase
CHS      Chalcone Synthase
COMT     Caffeic Acid O-Methyltransferase
CPR      Cytochrome P450 Reductase
CT       Threshold Cycles
dex1     Defective in Exine Formation
DFR      Dihydroflavonol Reductase
dsRNA    Double Stranded RNA
dyt1     Dysfunctional Tapetum 1
F5H      Ferulate-5-Hydroxylase
HAL      Histidine-Ammonia Lyase
HCT      Hydroxycinnamyl CoA Shikimate/Quinate Hydroxycinnamyltransferase
HMGR2    3-Hydroxy-3-Methylglutaryl-CoA Reductase 2
IAA      Indole-3-Acetic Acid
IPP      Isopentenyl Diphosphate
JA       Jasmonic Acid
LOX      Lipoxygenase
LTP      Lipid Transfer Proteins
MeJA     Methyl Jasmonate
MEP      Methylerythritol Phosphate
ML       Maximum-Likelihood
MPSS     Massively Parallel Signature Sequencing
ms1      Male Sterile 1
ms2      Male Sterile 2
NASC     The European Arabidopsis Stock Centre
OPC8     3-oxo-2(2'[Z]-Pentenyl)-Cyclopentane-1-Octanoic Acid
OPDA     12-oxo-Phytodienoic Acid
OPR3     OPDA Reductase
OS       Overall Sequence
PAL      Phenylalanine-Ammonia Lyase
PCD      Programmed Cell Death
PTS1     Peroxisomal Target Signal 1
PTS2     Peroxisomal Target Signal 2
SBD      Substrate Binding Domains
SEM      Scanning Electron Microscopy
TT4      Transparent Testa 4
WT       Wild-Type

ACKNOWLEDGEMENTS

The past 6 years that I've been at UBC were probably the most enriching and challenging years of my life. It's with joy and a little sadness that I will complete my first life-long goal by finishing this document. While I struggle to think about how to start writing this small section after writing an entire thesis, I realize that maybe it's because I just have way too much to say and only one page to say it all... I will start with my thanks to Juergen Ehlting.
As a freshly titled B.Sc. in Biology, I was avidly looking for an internship abroad to study genetics of plant secondary metabolism. Juergen not only made that possible, but he captivated me with his passion for Science. Thanks Juergen, your dedication really changed my life. My supervisor Carl Douglas was a constant source of support. Thank you Carl for directing my project while giving me the freedom to learn and develop. I am extremely grateful for your enormous patience and understanding in various moments throughout my degree. I'll carry with me the knowledge and experience that you shared to the next steps of my scientific career. My deepest thanks to my committee members Joerg Bohlmann, Brian Ellis and Ljerka Kunst for their time and effort in giving valuable feedback on my work and allowing me to use their teams' expertise for advice and lab equipment. My special thanks to past and present Douglas Lab members: Dae Kyun Ro, Lee Johnson, Qing Wang, Barham Soltani, Tamara Allen, Shana Gutman, David Johnston, Eryang Li, June Kim, Tom Liu, SongSoo Kim and especially Bjoern Hamberger. From outside the Douglas Lab, I am thankful for the help of so many talented and friendly people, in particular: Barbara Ehlting, Sandra Goritschnig, Sarah McKim, Jamie Pighin, Hardy Hall, Nathalie Mathews, Britta Hamberger, the EM lab crew and the very efficient Botany office staff. Of course, I thank my beautiful host country Canada. I could not have picked a better place to live and I know I will miss Vancouver! To all the friends I made in Canada from 2001 to 2007: you were a crucial support network that kept me out of trouble, and helped me a great deal in and outside the lab. I would not have made it if it weren't for you. You know who you are and I thank you all very much. To my dear friends back home and my boyfriend Ryan: thanks for the great times together, the countless emails and phone calls - very necessary after long days (and long nights) in the lab!
Finally, I thank my family for their enormous encouragement: my brother Olavo and my sister Lulu for always setting the bar high... and my mom and dad, Beatriz and Leopoldo, for always pushing me to be the best I can be, and for the constant emotional and (very appreciated) financial support. I owe my success to you. Thanks to all above who allowed me to fill this page with words. I am one lucky person.

CO-AUTHORSHIP STATEMENT

The following people have contributed significantly to experimental data collection presented in this thesis:

• Dr. Brad Barbazuk from the Donald Danforth Plant Science Center, St. Louis, Missouri, provided the maize ACLL nucleotide sequences. The candidate used these data for the phylogenetic analysis depicted in Figure 3.1.

• Dr. Steven Ralph from the UBC Michael Smith Laboratories-Genome BC Treenomix project donated all the cDNA used for all gene expression studies in poplar presented in Chapter 3. The candidate performed all the experiments using this material, interpreted the data and prepared Figures 3.4, 3.5 and 3.7.

• Dr. Keith Turner from the BC Institute of Technology generated the transgenic tobacco plants presented in Chapter 3 using transgenic Agrobacterium strains provided by the candidate. The candidate also photographed and processed the images presented in Figure 3.8.

• Sarah McKim from the UBC Department of Botany did the in situ hybridization experiment presented in Chapter 4. The candidate analyzed the results of this experiment and prepared the images shown in Figure 4.5.

CHAPTER 1 - INTRODUCTION

1.1 Natural product diversity and plant genomics

Plants are a highly diverse group of sessile organisms that have evolved remarkable ecological adaptations to occupy a wide range of terrestrial and aquatic habitats. With over 250,000 species, the plant kingdom represents a variety of life forms and survival strategies that reflect the evolutionary pressures driving speciation (Raven et al., 1996).
Our presence on Earth relies on basic resources that plants provide, such as food, oxygen, fuel, lumber and fibers. Plants have the ability to synthesize an enormous variety of structurally diverse natural products that are important for their fitness in the natural habitat, and also important for human use. These specialized organic compounds do not appear to be directly related to growth and development and, for this reason, are traditionally referred to as products of secondary metabolism (Croteau et al., 2000). Major classes of natural products, or secondary metabolites, are terpenoids, alkaloids and phenylpropanoids.

Terpenoids are the most structurally diverse class of natural products and are derived from repetitive fusions of five-carbon isopentane units (also called isoprene units) (Croteau et al., 2000). Terpenoid production is usually located within specialized anatomical structures, such as glandular trichomes (Turner et al., 1999) and resin ducts (Martin et al., 2002). Terpenoids share a common biosynthetic origin from the fundamental precursor isopentenyl diphosphate (IPP), which is synthesized via the acetate/mevalonate pathway in the cytosol and ER and via the methylerythritol phosphate (MEP) pathway in the plastids. Examples of terpenoids include components of essential oils and vitamin precursors such as β-carotene. Stress-induced terpenoids can play important roles in plant defense (Croteau et al., 2000).

About 20% of flowering plants produce alkaloids, which are low molecular weight, nitrogen-containing molecules that are usually pharmacologically active (Facchini and St-Pierre, 2005). The human use of plant-derived alkaloids has been documented since ancient history. For example, in 399 B.C. the Greek philosopher Socrates was executed by consuming an extract of coniine-containing hemlock.
The study of alkaloids has been primarily driven by their importance in medicine, exemplified by the potent analgesics codeine and morphine, derived from opium poppy (Papaver somniferum). Today, it is known that alkaloids derive in most cases from amino acids, and over 12,000 alkaloids have been isolated (Facchini and St-Pierre, 2005). In plants, alkaloids are associated with chemical defense such as feeding deterrence against herbivores, and therefore play important roles in ecological interactions (Steppuhn et al., 2004).

About 40% of the organic carbon in the biosphere is found in the form of plant phenolic compounds, such as lignin (Croteau et al., 2000). Phenylpropanoid metabolism, from which most of the phenolic compounds derive, is discussed in detail in the later sections of this chapter.

Plant hormones, such as auxins, salicylic acid and jasmonates, constitute an array of plant metabolic products that add to the repertoire of plant chemicals. The biosynthesis of some of these compounds shares precursors with the biosynthesis of some secondary metabolites, as is the case for gibberellic acid, which derives from IPP and its conversion to geranyl diphosphate via the terpenoid pathway (Croteau et al., 2000). This is an example of the fine line dividing primary and secondary metabolism, and an indication that plants' specialized metabolism derives from ubiquitous biochemical pathways.

Over the last two decades, molecular tools have been applied to the study of plant natural products. As one example, these approaches were useful in the identification of the set of structural genes and regulatory elements necessary for phenylpropanoid metabolism, which is thought to be a crucial biochemical pathway for the colonization of the land environment (Douglas, 1996).
The early studies with parsley cell cultures, such as enzyme activity studies with PAL and 4CL (Hahlbrock et al., 1981), and the cloning and expression of genes encoding PAL (Hahlbrock and Scheel, 1989), C4H (Koopmann et al., 1999) and 4CL (Douglas et al., 1987) enzymes, provided the initial picture of the molecular toolbox defining the phenylpropanoid pathway. With the completion of genome sequences of plant species of different angiosperm lineages and ecological adaptations, starting with the Arabidopsis genome (The Arabidopsis Genome Initiative, 2000), and the more recent additions of the rice (Yuan et al., 2003) and poplar (Tuskan et al., 2006) genomes, as well as generation of significant genome sequence information from crop plants (Paterson, 2006) and basal land plant lineages like the moss Physcomitrella patens (Nishiyama et al., 2003), the era of plant genomics is bringing about new possibilities to study plant metabolism in the context of land plant evolution. For example, the picture of the diversity of genes encoding phenylpropanoid pathway enzymes has become clearer, and insights can be gained by correlating metabolic diversity with genome evolution and ecological adaptations (Hamberger et al., submitted; Tsai et al., 2006; Tuskan et al., 2006).

As introduced above, a characteristic of plants is their metabolic diversity, but the origins of this diversity are not well understood. A common mechanism for generating metabolic diversity is the recruitment of enzymes from pre-existing biochemical pathways in a given organism (Austin and Noel, 2003; Ritter and Schulz, 2004). One interesting example that may be key to the evolution of phenylpropanoid metabolism is the evolutionary history of the enzyme phenylalanine-ammonia lyase (PAL). The similarity of the phenylpropanoid enzyme PAL to histidine-ammonia lyase (HAL) in the His degradation pathway has been noted (Ritter and Schulz, 2004).
This study demonstrated, via sequence and structural similarities between PAL and HAL, that PAL, involved in specialized plant metabolism, was likely recruited from the central metabolic pathway of amino acid degradation. Chalcone synthase (CHS), a key enzyme of the flavonoid branch pathway, and related enzymes were also likely recruited from enzymes functioning in primary metabolism. Although their origin is not fully elucidated, evidence suggests that chalcone synthases evolved from ketoacyl synthase III enzymes, involved in fatty acid biosynthesis (Austin and Noel, 2003). Extensive gene duplication and subsequent genetic variation gave rise to most or all of the diversity in the CHS family seen today (Austin and Noel, 2003). As more genomes are sequenced, comparative genomics approaches will make it increasingly possible to follow the evolutionary history of gene families.

1.1.1 Fully sequenced plant genomes and lessons learned from them

Arabidopsis thaliana was the first plant with a fully sequenced genome, providing a foundation for functional characterization of plant genes as well as a platform for development of tools relevant to evolutionary biology, agriculture, bioinformatics, and comparative genomics (The Arabidopsis Genome Initiative, 2000). Much was learned with the completion of the Arabidopsis genome in terms of genome structure and evolution, when compared to other available genomes. At the time the Arabidopsis genome was completed, whole genome comparisons to other eukaryotes were only possible between Arabidopsis and the yeast Saccharomyces cerevisiae, the fruit fly Drosophila melanogaster and the worm Caenorhabditis elegans. Results of those analyses demonstrated conservation of protein families among all eukaryotes, and identified another ~150 protein families unique to plants, most of unknown function (The Arabidopsis Genome Initiative, 2000), highlighting our lack of knowledge about unique aspects of plant development and metabolism.
Large-scale analysis of the Arabidopsis genome revealed that it has undergone at least two whole genome duplications in its evolutionary history, in addition to numerous tandem duplications and further reshuffling of chromosome segments. In fact, it has been estimated that about 90% of loci in Arabidopsis are duplicated, 17% of which are arranged in tandem arrays (Moore and Purugganan, 2005; The Arabidopsis Genome Initiative, 2000). One study suggests that the most recent whole genome duplication occurred 24 to 40 Mya, during the early emergence of the crucifer (mustard) family (Blanc et al., 2003). The evolutionary fate of duplicated genes has long been debated, and it is generally accepted that duplicated genes are a source of raw material for evolutionary novelty. A study of regulatory genes in Arabidopsis showed evidence that many gene families have expanded and diversified over the course of evolution, as a result of gene duplication and divergence (Duarte et al., 2006). In many cases, retention of duplicated genes is accompanied by either changes in gene expression patterns (subfunctionalization) or by changes in protein function (neofunctionalization), suggesting that changes in gene number, sequence, and expression can give rise to phenotypic variation (Moore and Purugganan, 2005).

A first comparative genomics approach between plants became possible with the completion of the rice genome (International Rice Genome Sequencing Project, 2005; Itoh et al., 2007; Yuan et al., 2003). The rice genome contains about the same number of genes as Arabidopsis, but only about one third of the protein-coding sequences in rice have putative orthologues in Arabidopsis. Many species-specific gene families were discovered that could account for the phenotypic differences evident between these species (Itoh et al., 2007).
The completion of the first tree genome sequence (Populus trichocarpa, or poplar; Tuskan et al., 2006) allows for interesting comparisons between the complete gene sets of two plant species with very different life histories: the herbaceous annual Arabidopsis and the tree Populus. The poplar genome sequencing and genome annotation effort identified more than 45,000 putative protein-coding genes, with an average of 1.5 putative poplar homologues for every Arabidopsis gene. The poplar genome sequence revealed that there has been a recent whole genome duplication event, referred to as the salicoid duplication event, evidenced by the identification of blocks of genes with conserved synteny located on different chromosomes. In addition, comparative genomics approaches showed that both Arabidopsis and Populus lineages share an ancient genome duplication, the eurosid duplication event, and that tandem duplications appear to be relatively more common in Arabidopsis than in poplar (Tuskan et al., 2006).

Genes overrepresented in one or the other organism might be correlated with adaptations that could lead to speciation events (Itoh et al., 2007). The whole genome duplication in the salicoid lineage, including poplar, would have provided a wealth of new genes, and it is interesting that the emergence of the Populus genus in the fossil record coincides with the salicoid whole genome duplication (Tuskan et al., 2006). On the other hand, genes conserved between lineages may encode proteins with common, conserved functions. It has been suggested that gene expression differences account for a large share of the phenotypic variability seen between species (Nielsen, 2006). With regard to the evolution of metabolic pathways, even small changes in the expression of enzyme-encoding genes could alter developmentally and environmentally specified enzyme levels, and such changes could cause dramatic changes in flux through metabolic pathways (Nielsen, 2006).
Thus, since enzymes of conserved sequence may share the same biochemical function, but not necessarily the same biological functions due to expression differences, information about gene expression is important when assessing homologous pairs of genes for conserved function. However, with the limited gene expression information available today for plant species other than Arabidopsis, gene sequences alone still provide a good starting point for creating hypotheses about gene origin and function.

1.2 Phenylpropanoid metabolism and adenylate-forming enzymes

1.2.1 Phenylpropanoid metabolism

Plants can efficiently channel carbon from primary metabolism to phenylpropanoid metabolism, mostly via the amino acid phenylalanine (Figure 1.1) (Douglas, 1996). Phenylalanine-derived natural products are crucial compounds in plants, with a variety of functions ranging from UV protection, inter-species signaling and antimicrobial activity to structural composition of the cell wall (Dixon and Paiva, 1995; Hahlbrock and Scheel, 1989). Protection against UV irradiation conferred by flavonoids, as well as mechanical support and water impermeability for water transport provided by lignin, suggest that the evolution of phenylpropanoid metabolism was likely of fundamental importance in the ability of plants to colonize land (Douglas, 1996).

General phenylpropanoid metabolism in plants is composed of three main enzymatic steps (Figure 1.2). Phenylalanine ammonia lyase (PAL) is the first enzyme of the pathway and catalyzes the conversion of the amino acid phenylalanine into trans-cinnamic acid. Subsequently, the cytochrome P450-dependent enzyme cinnamate 4-hydroxylase (C4H), in conjunction with cytochrome P450 reductase (CPR), hydroxylates trans-cinnamic acid, yielding p-coumaric acid. Finally, 4-coumarate:CoA ligase (4CL) generates CoA esters of p-coumarate and its derivatives in a two-step reaction (Figure 1.3) (Hahlbrock and Scheel, 1989).
This two-step reaction, in which an adenylate substrate intermediate is formed in the presence of ATP and Mg2+, is a common mechanism shared with other adenylate-forming enzymes, such as firefly luciferases and non-ribosomal peptide synthetases (Becker-Andre et al., 1991), as discussed in detail below. Thioesters activated by 4CL are then used as precursors for the downstream branches of the phenylpropanoid pathway, for the production of a variety of plant natural products, including lignin and flavonoids (Figure 1.2) (Hahlbrock and Scheel, 1989).

[Figure 1.1 diagram; recoverable labels: pentose phosphate pathway (erythrose 4-phosphate), glycolysis (phosphoenolpyruvate), shikimate, chorismate, L-phenylalanine, L-tryptophan, L-tyrosine, phenylpropanoid pathway]

Figure 1.1: Schematic view of the relationship of phenylpropanoid metabolism to primary metabolism. The shikimate pathway leads to biosynthesis of aromatic amino acids including phenylalanine, starting with metabolic precursors. Phenylpropanoid metabolism branches from the shikimate pathway mostly via phenylalanine.

The downstream reactions following the general phenylpropanoid pathway are quite diverse and much is still to be learned about them. The known general steps for the monolignol biosynthesis branch pathway include the donation of the acyl group of the CoA ester formed by the 4CL reaction, by the enzyme hydroxycinnamyl CoA shikimate/quinate hydroxycinnamyltransferase (HCT), to a shikimic acid or quinic acid acceptor, yielding the corresponding shikimate or quinate esters. The phenolic rings of these esters are then adorned at position 3 with a hydroxyl group by action of the P450 enzyme coumaroyl-shikimate 3'-hydroxylase (C3H).
The 3' hydroxyl group is then methylated by caffeoyl-CoA O-methyltransferase (CCOMT), and a 5' hydroxyl group may then be introduced by the enzyme originally characterized as ferulate-5-hydroxylase (F5H) and methylated by caffeic acid O-methyltransferase (COMT), although these reactions likely occur in vivo at the level of the aldehydes (Humphreys and Chapple, 2002). The resulting methylated CoA esters are then reduced to the respective aldehydes by cinnamoyl-CoA reductase (CCR), and are finally further reduced to the corresponding monolignol alcohols by cinnamyl alcohol:NADP+ dehydrogenase (CAD) (Costa et al., 2003; Hamberger et al., submitted; Humphreys and Chapple, 2002). The first enzyme of the flavonoid branch pathway is chalcone synthase (CHS), which catalyzes the condensation of the product of the 4CL reaction, p-coumaroyl-CoA, with three molecules of malonyl-CoA to generate a C15 skeleton. This skeleton is the backbone for a variety of elaborations by downstream enzymes that generate flavonoid diversity (Noel et al., 2005).

Figure 1.2: A simplified scheme of the phenylpropanoid pathway, showing the two main branch pathways deriving from the product of the 4CL enzyme and the resulting final products.

1.2.2 Adenylate-forming enzymes

Adenylate-forming enzymes constitute a large class of enzymes that catalyze diverse reactions, all characterized by a two-step reaction mechanism requiring Mg2+ and involving pyrophosphorolysis of ATP and formation of an enzyme-bound AMP-substrate intermediate (adenylate).
This reaction mechanism is used, for example, by bacterial non-ribosomal peptide synthases (Conti et al., 1997) and by the firefly luciferase enzyme (Deluca, 1976). The final acyl acceptor varies considerably depending on the type of enzyme (Shockey et al., 2003). In the CoA ligase reaction, AMP is released after nucleophilic attack on the carbonyl carbon of the adenylate by the free electrons of the thiol group of the CoA acyl acceptor, forming the final CoA thioester (Figure 1.3) (Shockey et al., 2003). Adenylate-forming enzymes are involved in carboxylic acid activation by formation of CoA esters, reactions that play vital roles in all living organisms by providing precursors for the biosynthesis or breakdown pathways of many important metabolites (Shockey et al., 2003). Known functions of adenylate-forming enzymes in plants include long-chain fatty acid activation (Shockey et al., 2002), synthesis of acetyl-CoA important for lipid accumulation in developing seeds (Ke et al., 2000), biosynthesis of molecules such as jasmonic acid, indole-3-acetic acid (IAA) and salicylic acid (Staswick et al., 2002), and phenolic acid activation in phenylpropanoid metabolism.

Figure 1.3: The two-step mechanism of the CoA-ligase reaction. The first step uses ATP to generate an adenylate intermediate. The second step is characterized by the transfer of the acyl molecule to the CoA acceptor, forming a thioester bond, and release of AMP.

1.2.3 4CL gene families

The first 4CL gene cloned was derived from parsley (Douglas et al., 1987), and activities of recombinant parsley 4CL enzymes were demonstrated (Lozoya et al., 1988).
Since then, 4CL genes have been isolated from several plants, such as tobacco (Lee and Douglas, 1996), loblolly pine (Zhang and Chiang, 1997), poplar (Allina et al., 1998), aspen (Hu et al., 1998), Arabidopsis (Ehlting et al., 1999) and soybean (Lindermayr et al., 2002), and in each case the genes were shown to encode bona fide 4CL enzymes by expression of recombinant proteins. In all angiosperms examined, 4CL is encoded by multi-gene families. The 4CL gene family in Arabidopsis thaliana comprises four genes: 4CL1, 4CL2, 4CL3 (Ehlting et al., 1999) and 4CL4 (Hamberger and Hahlbrock, 2004). Phylogenetic analysis of all known plant 4CL genes, including the Arabidopsis 4CLs, showed that they fall into two classes, which likely arose early in angiosperm evolution (Cukovic et al., 2001; Ehlting et al., 1999). Class I 4CLs appear to be associated with the biosynthesis of lignin and other phenylpropanoids, while class II 4CLs are associated with flavonoid biosynthesis (Ehlting et al., 1999).

1.2.4 4CL subfunctionalization

It has been demonstrated that 4CL enzyme isoforms encoded by different gene family members have the capacity to convert different substrates, thus directing the flux of general phenylpropanoid metabolism into the major branch pathways of flavonoid or monolignol biosynthesis. In aspen (Populus tremuloides), 4CL1 and 4CL2 are differentially expressed and exhibit highly divergent substrate preferences, associated with their respective functions in lignin biosynthesis in developing xylem tissues and in biosynthesis of other phenolics, such as flavonoids, in epidermal cells (Hu et al., 1998). In soybean (Glycine max), the three structurally and functionally distinct cDNAs encoding 4CL enzymes were also shown to be divergent at the levels of catalytic specificity and expression (Lindermayr et al., 2002).
Interesting data came from the study of loblolly pine (Pinus taeda), in which a single 4CL protein that exhibits broad substrate specificity has been described (Harding et al., 2002; Zhang and Chiang, 1997). Those data, together with phylogenetic data, suggest that the subfunctionalization in substrate preference and gene expression that is apparent in angiosperm lineages may have originated after the divergence of angiosperms from gymnosperms (Hamberger et al., submitted).

The 4CL gene family has been particularly well studied in Arabidopsis, where not only have gene expression and substrate specificity been characterized (Ehlting et al., 1999; Hamberger and Hahlbrock, 2004; Soltani et al., 2006), but phenotypic effects of 4CL knock-out mutations have also been observed, as discussed below (Hamberger B. and Douglas C., unpublished). All Arabidopsis 4CL proteins expressed heterologously in E. coli have activity towards 4-coumarate and are therefore bona fide 4CLs, although substrate preference is largely complementary among the four 4CL enzymes (Table 1.1). 4CL1 has highest activity with p-coumarate and caffeate, overlapping with the highest activities of 4CL2 (caffeate) and 4CL3 (4-coumarate). 4CL4 is unique in its ability to convert ferulate and sinapate very efficiently, and it has been suggested that this enzyme could play a role in the biosynthesis of soluble sinapate-containing phenolics, as an alternative to or in addition to a role in lignin biosynthesis (Hamberger and Hahlbrock, 2004). Interestingly, a sinapate-utilizing 4CL isoform has also been identified in soybean (Lindermayr et al., 2002), but otherwise this activity does not appear to be widespread among 4CL enzymes (Allina et al., 1998; Hu et al., 1998).
Table 1.1: Kinetic properties of recombinant Arabidopsis 4CLs (Data from: Ehlting et al., 1999; Hamberger and Hahlbrock, 2004)

Enzyme  Substrate     Km (µM)   Vmax (% coumarate)   Vmax/Km
4CL1    Cinnamate     6320      103                  0.02
        4-Coumarate   38        100                  2.6
        Caffeate      11        27                   2.5
        Ferulate      199       53                   0.26
        Sinapate      n.c.      -                    -
4CL2    Cinnamate     6630      21                   0.003
        4-Coumarate   252       100                  0.39
        Caffeate      20        74                   3.7
        Ferulate      n.c.      -                    -
        Sinapate      n.c.      -                    -
4CL3    Cinnamate     2070      164                  0.08
        4-Coumarate   23        100                  4.4
        Caffeate      374       129                  0.35
        Ferulate      166       86                   0.52
        Sinapate      n.c.      -                    -
4CL4    Cinnamate
        4-Coumarate   432       100                  0.3
        Caffeate      186       187                  1.1
        Ferulate      26        153                  6.6
        Sinapate      20        105                  6.7

Biochemical data on substrate preference suggest that 4CL1 could potentially complement the activities of 4CL2 and 4CL3 in vitro (Table 1.1). In addition, the gene expression patterns of the Arabidopsis 4CL genes show differential regulation and expression subfunctionalization of family members. Expression of 4CL1 is higher in seedling roots and in bolting stems of mature plants, 4CL2 is most highly expressed in roots, and 4CL3 is most highly expressed in flowers (Ehlting et al., 1999). Examination of 4CL promoter-GUS fusion expression suggests that 4CL1 and 4CL2 promoter activity is confined primarily to the vasculature of aerial organs, with somewhat broader expression in roots (Soltani et al., 2006). In contrast, the 4CL3 promoter drives GUS expression in epidermal cells with no preferential expression in vascular tissues, and the 4CL4 promoter exhibits low overall activity but is wound-induced (Soltani et al., 2006). These differential expression patterns, combined with the biochemical data, suggest that subfunctionalization at the levels of enzymatic properties and gene expression has occurred in the evolution of the 4CL gene family in Arabidopsis, such that specific genes and enzymes are specialized for the production of the phenylpropanoids required in particular organs and tissues.
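The Vmax/Km column of Table 1.1 can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python, assuming the 4CL1 values as read from Table 1.1; the dictionary and function names are illustrative and do not come from the thesis.

```python
# Km (uM) and relative Vmax (% of the 4-coumarate value) for recombinant 4CL1,
# as read from Table 1.1. Sinapate is omitted because it is listed as n.c.
KINETICS_4CL1 = {
    "cinnamate": (6320.0, 103.0),
    "4-coumarate": (38.0, 100.0),
    "caffeate": (11.0, 27.0),
    "ferulate": (199.0, 53.0),
}


def efficiency(km_um, vmax_rel):
    """Relative catalytic efficiency, expressed as Vmax/Km."""
    return vmax_rel / km_um


def rank_substrates(kinetics):
    """Return substrate names ordered from highest to lowest Vmax/Km."""
    return sorted(kinetics, key=lambda s: efficiency(*kinetics[s]), reverse=True)


if __name__ == "__main__":
    for substrate in rank_substrates(KINETICS_4CL1):
        km, vmax = KINETICS_4CL1[substrate]
        print(f"{substrate:12s} Vmax/Km = {efficiency(km, vmax):.3g}")
```

Ranking by Vmax/Km rather than by Km or Vmax alone combines affinity and turnover in a single figure, which is why 4-coumarate and caffeate come out as similarly good 4CL1 substrates despite their different Km values.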
It is known that phenylpropanoids participate in defense mechanisms, and evidence for this has been gathered at the chemical, biochemical and genetic levels (Dixon and Paiva, 1995). For example, flavonoids were shown to accumulate in bean leaves upon UV radiation treatment (Beggs et al., 1985), studies on parsley suspension culture cells demonstrated increases in PAL and 4CL activity after elicitor treatment (Hahlbrock et al., 1981), and studies on parsley 4CL genes demonstrated gene up-regulation upon elicitor and UV treatment (Douglas et al., 1987). In Arabidopsis, 4CL1, 4CL2 and 4CL4 expression is wound inducible (Ehlting et al., 1999; Soltani et al., 2006), consistent with a role for these genes in phenylpropanoid biosynthesis for defense purposes, while 4CL3 shows no wound inducibility but is up-regulated after UV stress, consistent with a role for this gene in flavonoid biosynthesis (Ehlting et al., 1999). These data lend further support to the subfunctionalization of 4CL gene expression in Arabidopsis.

Reverse genetic approaches in Arabidopsis provide further evidence for specialized functions of 4CL genes in the production of lignin and flavonoids. Although single 4CL1 or 4CL2 insertion mutants have no visible phenotypes, and a single 4CL3 mutant has a subtle UV-induced flavonoid deficiency, double mutants of 4CL1/2 and 4CL1/3 have strong phenotypes: the 4cl1/2 double mutant is deficient in lignin biosynthesis and is a dwarf at maturity, while the 4cl2/3 double mutant is severely deficient in developmental soluble flavonoid and anthocyanin production (Hamberger B. and Douglas C., unpublished). These data support subfunctionalization of 4CL enzyme activity in Arabidopsis under normal laboratory conditions.
However, there is clearly a functional overlap, since loss of function of a given isoform is largely silent, presumably because a second isoform provides partially redundant function that is sufficient to fulfill the biochemical needs of the plant in growth chamber conditions.

1.2.5 Identification of 4CL-like genes (4CLLs) and phylogenetic relationships to other adenylate-forming enzymes

With the sequencing of the Arabidopsis genome and with sequence data from other plants, it became apparent that genes exist that encode enzymes related to, but phylogenetically distinct from, true 4CLs (Cukovic et al., 2001). With the completion of the Arabidopsis genome, a complete set of "4CL-like genes" was identified based on sequence similarity and phylogenetic relationships to known 4CL genes (Costa et al., 2003; Shockey et al., 2003; this study). In the annotation of 4CL-like genes reported by Ehlting et al. (2005), FASTA searches were carried out using the 4CL1, 4CL2, 4CL3 and 4CL4 amino acid sequences against all annotated Arabidopsis genes (provided by MATDB), and putative genes that displayed more than 30% identity at the amino acid level to at least one of the 4CL proteins over a stretch of more than 300 amino acids were selected. Initial phylogenetic analysis revealed the genes most closely related to true 4CLs, and these became the focus of further study. While many genes of this class have been termed "4CL" (Costa et al., 2003), "4CL-like" (Ehlting et al., 2005), and "AAE" for Acyl-Activating Enzyme (Shockey et al., 2003), I have subsequently used the term ACLL for Acyl-CoA Ligase Like. The annotation of these genes as ACLLs is based on sequence similarity only, and the biochemical and biological functions of the ACLL proteins were still unresolved at the onset of this thesis.
1.2.6 4CL protein structure

Since the substrate utilization profiles of recombinant Arabidopsis 4CL1 and 4CL2 proteins differ markedly (Table 1.1; Ehlting et al., 1999; Hamberger and Hahlbrock, 2004), it was possible to define 4CL substrate recognition domains based on the activity of chimeric proteins. These domains were localized between two highly conserved regions, LPFSSGTTGLPKG (box I) and GEICIRG (box II), and are approximately 100 amino acids long (Ehlting et al., 2001). Point mutations in the putative substrate binding domains (SBD) of different 4CLs can cause changes in substrate recognition, giving rise to the concept of a substrate binding pocket with a limited number of contact residues involved in substrate recognition (Stuible and Kombrink, 2001). Adenylate-forming enzymes are known to utilize very diverse substrates, so amino acid sequence information alone is not sufficient to provide definitive information on substrate usage by 4CL and related enzymes. A crystal structure for 4CL would allow more definitive identification of key functional amino acids in the 4CL SBD, and could help with predictions of the substrates of ACLL enzymes.

A model for identifying the amino acid residues responsible for substrate recognition in 4CL2 has been proposed. This "specificity code" is composed of 12 amino acid residues (Schneider et al., 2003) and is based on homology modeling of 4CL2 on the known structure of the bacterial phenylalanine-activating domain of gramicidin S synthetase (PheA). Together with previous mutation analysis of 4CL enzymes, this study introduced the concept that the specificity of 4CL isoforms towards their hydroxycinnamic acid substrates is due to size exclusion controlled by four amino acids in the putative substrate-binding pocket. In addition, increasing the hydrophobicity of specific residues in this region resulted in variants of 4CL2 with enhanced conversion of cinnamic acid (Stuible and Kombrink, 2001).
In the ACLLs, four of the 12 amino acids corresponding to those responsible for the "specificity code" in 4CL2 are conserved, whereas the other eight, including the ones causing steric hindrance of the substrate, are not conserved. Knowledge of the types of substrates that ACLLs convert may allow future use of this type of modeling approach to identify amino acids within the SBD of ACLLs that are responsible for substrate recognition, and might be used to predict potential substrates of other ACLLs.

1.3 The ACLLs and thesis objectives

4CL enzymes have been extensively studied for three decades because of their central role in the general phenylpropanoid pathway. Upon completion of the Arabidopsis genome and the subsequent annotation of "4CL-like" ACLL genes, ACLLs were therefore an obvious target for identifying new enzymatic functions and novel pathways that could be related to the general phenylpropanoid pathway or to other metabolic pathways, and they became targets for functional analysis by several groups.

The objectives of my thesis were to:
1. Provide a general overview of the Arabidopsis ACLL gene repertoire and gene family structure, using the known Arabidopsis 4CL genes as a guiding tool.
2. Use expression analysis, reverse genetics, and bioinformatic analyses to characterize their evolution and their biological and biochemical functions.
3. Provide a comparison of the ACLL gene family in the three sequenced plant genomes (Arabidopsis, rice, and poplar) and obtain poplar ACLL expression data to complement data obtained in Arabidopsis.

Several approaches were used to collect such information. Using bioinformatics tools, sequence data from various organisms were assembled to further reconstruct the evolution of the ACLL gene family, and ACLL homologues were discovered in other plant species, such as Physcomitrella patens, rice and poplar.
I assessed expression patterns to look for subfunctionalization, or for evidence of overlapping functions, among the ACLL genes in Arabidopsis and poplar. Reverse genetics approaches, such as analysis of DNA insertion mutants, were used to find clues to ACLL biological function. Finally, I identified co-expressed genes using public microarray data and identified common regulatory elements in the promoter regions of co-expressed genes, which allowed me to pinpoint known and putative novel biochemical pathways in which ACLL enzymes may participate. This work has allowed the biological and biochemical functions of some ACLL genes to be deduced, and has generated hypotheses regarding the functions of others, which will require further experimentation.

Identification of additional characteristics of the ACLLs would pave the way for biological and biochemical characterizations, possibly including identification of substrates for ACLL proteins, which could lead to the identification of new biochemical pathways that require adenylate-forming enzymes.

CHAPTER 2 - MATERIAL AND METHODS

2.1 Nucleic acid methods

2.1.1 Genomic DNA and total RNA isolation

Total Arabidopsis genomic DNA was extracted from young leaf tissue (Columbia ecotype), ground in a bead beater at 4°C, using the Nucleon PhytoPure kit (Amersham-Pharmacia) according to the manufacturer's instructions. For Arabidopsis RNA isolation, the specified tissues were frozen in liquid nitrogen and ground to a fine powder, and RNA was isolated using Trizol reagent (Gibco-BRL), following the manufacturer's instructions.

2.1.2 DNA extraction from agarose gels

DNA bands were cut from 1% agarose gels and the DNA extracted using the QiaQuick Gel Extraction Kit (Qiagen) according to the manufacturer's instructions.
2.1.3 Plasmid DNA preparation and sequencing

Plasmid DNA was prepared using Qiagen spin Miniprep and Midiprep kits, following the manufacturer's instructions (Qiagen). DNA sequencing was performed by the University of British Columbia Nucleic Acid and Protein Service unit, using BigDye 3.0 (Applied Biosystems) and a Prism Sequencer (Applied Biosystems).

2.1.4 Sequence alignment and editing

Multiple sequence alignment was done using the Genomatix DiAlign software. Sequence editing and restriction mapping were done using the SeqPup software (seqpup/java/seqpup-doc.html).

2.1.5 Reverse transcription for cDNA synthesis

RNA samples were isolated and their quality assessed by visual inspection of rRNA on a 1% agarose gel. RNA samples were then quantified spectrophotometrically, and 2 µg RNA per 20 µl reaction was used to generate first strand cDNA using Superscript II Reverse Transcriptase (Invitrogen) following the manufacturer's protocol.

2.1.6 General recombinant DNA methods

DNA that had been PCR amplified with a proof-reading enzyme (indicated below) was digested with the appropriate restriction enzymes (Invitrogen, Roche and New England BioLabs). Restriction enzyme digests were performed in a 40 µl volume, following the manufacturer's protocol for each enzyme. Digested DNA was purified using the method described in section 2.1.2. Ligation reactions were performed using T4 ligase (Roche) in a 10 µl volume, following the manufacturer's protocol. All clones used for transformation were sequenced before further use.

Agrobacterium-mediated Promoter::GUS constructs

Arabidopsis ACLL promoter regions were amplified from genomic DNA using Pwo enzyme (Roche) and cloned into the pCambia 1305.1 vector (Figure 2.1), containing the GUS (beta-glucuronidase) reporter gene (Jefferson et al., 1987).
The multiple cloning site of the vector (5'end) and Nco\ sites at ACLL  start codons (3'end) were used to generate  in-frame insertions of the PCR fragment to the GUS gene. Primers contained a 3' "tail" to introduce a compatible restriction site sequence and are given in Table 2.1.  35S prom LAC Z ALPHA MCS 35S prom ^ ™  Catalase intron ^ GUSPIus nos term RB  hptll (hygR) 35S term LB  pBR322 ori pBR322 bom site  pCAMBIA1305.1 11846 bp  pVS1-REP  Figure 2.1: pCambia vector 1305.1 used for promoter::GUS cloning.  24  Table 2.1: Primers used for ACLL promoter amplification and cloning. Restriction sites are underlined and mismatches are in grey. Gene ACLLl ACLL2 ACLL3 ACLL4 ACLL5 ACLL6 ACLL7 ACLL8 ACLL9 pCambia  Primer new3F 3R 12F prom 1500 12R prom 1500 new7F 7R new6F 6R new4F 4R newlF newlR new2F 2R 8F prom 1500 8R 5F new5R  Sequence (5' - 3') GCTTTACTCG GGGAACCACAGGG C G C C A T G TAAAGGACTTTGGTTGTATC GTTG T G G A T C G A T T G A A A G A A C CTCAGGATACGCCATGGTTCTAATATG GTTTCTCGAGGTTATTTCAGCAATGAGGAAGC CTCAGGATACGCCAT iTTTCTCTATACG GAAACTCGAGCTGTCCAGAAGCAAGAGAGTCTC G A A G C C A T G G A I 1 1 IGI 1 1C1ATGTAACTTGAC GTTTCTCGAGGATTTATCACCTGAATAGTATTCTTCCAGATTGG CI 1 1 1GAC rCTCCATGClTTAAACGAATTGAATTTGATTTATG GTTTCTQGAGATGGAATGAAAACACCCGGTCCGGTTC GGATTTCTCCAT TTTCCG ATCTCG GTTTC TC GAGCATTTG GC C G GC G ATAAC ATC A G A G GCCGCCAT GAGAGAAGCAGAGTTTAAG CCAAGAGACGGTCGAATGGC GAATTCGCCAT : 1 I U 1 1GGTTGGATTAG GTTTCCGCGGCCCAATGGTGAAGGATACAAGCC GTCA CATGGT AAGACGATTAGAGATAC  pCambiaF pCambiaR  GCGGATAACAATTTCACACAGGAAAC GGGTCCTAACCAAGAAAATGAAGG ArathACLIA and PoprtACLL5  RE Xho I Nco I Sal I Nco I Xho I Nco I Xho I Nco I Xho I Nco I Xho I NCO I Xho I NCO I Xho I NCO I Sac II Nco I  size (bp) 2000 1500 1900 1700 1950 1900 2000 1500 2100  GFP fusions  The coding sequences of both ArathACLIA and PoptrACLL5 were amplified from cDNA made from R N A extracted from organs where these genes were most highly expressed, according to the results depicted on Figure 3.4 and 3.7 (flowers 
and MeJA-treated leaves, respectively). The PCR amplification reactions (primers given in Table 2.2) were performed with Phusion high fidelity enzyme (Finnzymes) according to the manufacturer's protocol. The PCR products were cloned into a Gateway (Invitrogen) compatible entry vector using the TOPO-TA cloning kit (Invitrogen) and subsequently recombined into the destination vector using LR Clonase II enzyme mix according to the manufacturer's instructions (Invitrogen). PCR products were cloned in-frame N-terminal to the GFP gene driven by the CaMV 35S promoter.

Table 2.2: Primers used for generating GFP fusions of ArathACLL4 and PoptrACLL5

Gene         Primer             Sequence (5'-3')
ArathACLL4   CFP-ArathACLL4-F   ATGGCTTCAGTGAATTCTCGA
             CFP-ArathACLL4-R   TCAAAGCTTGGAGTTGGAAGT
PoptrACLL5   CFP-PoptrACLL5-F   ATGGCAGACAACAACAACCTCACA
             CFP-PoptrACLL5-R   TCAGAGCTTGGAGGTTGCGAG

2.2 Plant growth and maintenance

2.2.1 Seed harvesting and sowing

Mature Arabidopsis thaliana (Arabidopsis) plants were left to dry at room temperature. Fully dried siliques were harvested and seeds were separated from plant debris. Seeds were sterilized in 70% EtOH for 2 min followed by 100% EtOH for 2 min. Seeds were dried and sown on Petri plates containing 1/2 MS (Murashige and Skoog) salts (Sigma Aldrich), supplemented with 1% sucrose and 0.6% agar. Plates were placed at 4°C for 2 to 3 days for seed stratification and then transferred to a growth chamber at 20°C under continuous light until the cotyledons had developed.

2.2.2 Plant growth conditions

Arabidopsis seedlings were transferred from plates to pots containing moist soil (Sunshine mix 5, Sungrow Horticulture, Saba Beach, Alberta), and pots were covered with Saran wrap to prevent dehydration of the soil during establishment of the seedlings. After 3 days the plastic wrap was cut with a razor blade, and it was removed after another 3 days.
Plants were watered as needed and kept in the growth chamber at 20°C under long day conditions (18 h light) until maturity. Material from poplar plants was obtained from S. Ralph (Genome BC). Poplar growth conditions are described in Ralph et al. (2006).

2.2.3 Mechanical wound and herbivory

Arabidopsis plants, grown as described above, were wounded with pliers over the full surface of the leaf blade and harvested after 1 h, 4 h, and 24 h. Wounded and unwounded control plants were harvested at the same time and placed immediately in liquid nitrogen. Poplar stress experiments were done on leaves of Populus trichocarpa x P. deltoides clone H11, over a time course of 2 h, 6 h, and 24 h. Mechanical wounding, herbivory, regurgitant, and methyl jasmonate treatments are described in Ralph et al. (2006) and in Hamberger et al. (2007).

2.3 General transformation procedures

2.3.1 Bacteria transformation

Competent E. coli (DH5a) or Agrobacterium (GV3303) cells were kept at -80°C until ready to use. Cells were thawed on ice, and 10 µl of ligation reaction mixture or 100 ng of plasmid DNA was added to the bacteria. The bacteria were transformed by heat shock (42°C for 45 s for E. coli and 37°C for 2 min for Agrobacterium) and plated on LB agar Petri plates containing the appropriate selection antibiotic. Single colonies were picked and cultures grown in liquid LB media. Plasmids were isolated from cultures (section 2.1.3), and the presence of transgenes was tested by test digestion of the plasmid.

2.3.2 Arabidopsis transformation

Reproductive Arabidopsis plants containing many unopened flowers were used for transformation by the floral dip protocol (Clough and Bent, 1998). After dipping, pots were kept overnight in dim light, covered by a plastic bag to maintain high humidity. Two days later the pots were placed back into the growth chamber. Mature plants were harvested for seeds, and seeds were sown on appropriate selection media.
Transformant seedlings (T1) were tested for the presence of the transgene by PCR using at least one plasmid-specific primer. The next generation derived from self-crosses (T2) was grown, and seeds (T3) were collected from individual plants. A subset of T3 seeds from each T2 plant was placed on selective media for germination to identify homozygous lines. Another subset of seeds from the same individual line was grown in parallel on nonselective media, and the lines that were homozygous (based on antibiotic selection) were used for experimentation.

2.3.3 Tobacco transformation (done by K. Turner, BC Institute of Technology)

Leaf disks (1 cm2 pieces) from sterile tobacco plants grown in vitro were placed in a 1/10 dilution of transformed Agrobacterium cultures. Leaf disks were blotted dry on sterile paper towel, cultured on MS for 48 hours and then transferred to the regeneration medium. The regeneration medium was MS with 1% sucrose, 1.0 mg/l BAP and 0.1 mg/l NAA, supplemented with the appropriate antibiotic. Leaf disks were grown at room temperature with 16-hour days. Shoots emerging from the leaf disks were transferred into sterile universal jars containing MS medium without hormones to induce root formation. All shoots generated by tissue culture were derived from independent transformation events.

2.4 General bioinformatics procedures

2.4.1 Sequence selection and phylogenetic tree construction

The set of Arabidopsis genes characterized as encoding 4CL enzymes (Ehlting et al., 1999) was used in homology searches to identify potential 4CL-like/ACLL genes in the Arabidopsis genome, using the database maintained at the Arabidopsis Information Resource. Poplar homologues were identified by reciprocal BLAST searches of the poplar genome assembly (Joint Genome Institute, Populus trichocarpa v. 1.1) using 4CL and ArathACLL sequences as queries.
The poplar gene models (from automated ab initio gene-calling programs; Tuskan et al., 2006) assigned to a given locus were evaluated, annotated manually and revised as necessary (Table 3.1). All annotated candidates corresponded to loci anchored to poplar linkage groups or to sequence scaffolds, as described in Tuskan et al. (2006). Corresponding rice homologues (Table 3.1) were identified using BLAST searches of the rice genome annotation at The Institute for Genomic Research (TIGR). Physcomitrella sequences were selected in the same fashion from the JGI website.

Selected microorganism sequences were obtained by BLAST searches using 4CL and ACLL sequences as queries at the NCBI website. Protein sequences were aligned using the Genomatix Dialign program, and the multiple protein sequence alignments were manually optimized. To reconstruct phylogenetic trees, maximum likelihood analyses with 1000 bootstrap replicates were carried out using PhyML v2.4.4 (Guindon and Gascuel, 2003) with the JTT model of amino acid substitution.

2.4.2 Identification of cis-acting promoter elements

Promoters were defined as the approximately 2 kb region upstream of the start codon. We used the PLACE online tool (bin/BAR_Promomer.cgi) to identify common elements in the input list of co-expressed genes (Chapter 4; Table 4.1).

2.4.3 Search of co-regulated genes in public Arabidopsis microarray database

Genes co-expressed with ArathACLLs were identified using the Platform for Riken Metabolomics (PRIMe) Correlated Gene Search tool, using the union of sets method. The input data were the locus IDs of each ArathACLL. All available data matrices were analyzed, and those displaying co-expressed genes with the highest Pearson coefficients were selected. For each ArathACLL search, the top 100 genes and/or those with Pearson coefficient values over 0.6 are shown in Appendix 1.
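The ranking criterion used in the co-expression search (Pearson coefficient, with a 0.6 cut-off and a top-100 limit) can be illustrated with a short Python sketch. This is not the PRIMe tool itself, whose internals are not described here; the function names and the toy expression profiles are assumptions, and applying both the cut-off and the limit together is one possible reading of "and/or".

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    # A flat profile has no defined correlation; treat it as uncorrelated.
    return sxy / denom if denom else 0.0


def coexpressed(bait, profiles, min_r=0.6, top_n=100):
    """Return (gene, r) pairs with r > min_r, best first, at most top_n entries."""
    scored = [(gene, pearson(bait, values)) for gene, values in profiles.items()]
    kept = [(gene, r) for gene, r in scored if r > min_r]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_n]


if __name__ == "__main__":
    # Toy expression values across four hypothetical arrays.
    bait = [1.0, 2.0, 3.0, 4.0]
    profiles = {
        "geneA": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with the bait
        "geneB": [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated
        "geneC": [1.0, 3.0, 2.0, 4.0],  # partially correlated
    }
    for gene, r in coexpressed(bait, profiles):
        print(gene, round(r, 2))
```

Only positively correlated genes pass the cut-off in this sketch; anti-correlated genes such as the toy geneB would need a separate threshold on negative coefficients.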
The Pajek data output file was used for generating co-expression networks (V. Batagelj et al., 2003).

2.6 Gene expression analysis

2.6.1 Quantitative Real-Time PCR

For the real-time quantitative RT-PCR described in Chapter 3 (Figures 3.4 and 3.7), total Arabidopsis RNA extracted from different tissues (10 µg) was first digested with 15 U DNase in 1x buffer (Invitrogen) for 15 min at room temperature. The reaction was stopped with EDTA (2.5 mM final concentration) and heat-inactivated (65°C, 10 min). RNA was precipitated with 1 volume of isopropanol and a 1/10 volume of 3 M sodium acetate at -80°C for at least 30 min, and subsequently pelleted at 14,000 rpm in an Eppendorf 3415C microcentrifuge for 40 min at 4°C. The precipitate was washed with 70% ethanol, re-centrifuged, air dried and re-suspended in RNase-free water to an approximate concentration of 0.5 µg/µl. Concentrations were determined spectrophotometrically. 10 µg of total RNA was used for reverse transcription with 0.27 µM oligo(dT) primer, 0.15 mM dNTPs, 40 U RNaseOut, and 400 U SuperScript II (Invitrogen) in 10 mM DTT and 1x first strand buffer in a total volume of 40 µl. Prior to the addition of enzymes, the solution was heated to 65°C for 5 min and then cooled to 42°C for primer annealing. Following incubation at 42°C for 50 min, the reaction was inactivated by heating at 70°C for 15 min. Based on the A260 concentrations determined for the DNase-treated total RNA samples, cDNA samples were diluted to a concentration of 1 ng/µl. Poplar cDNA samples were obtained from S. Ralph (UBC Michael Smith Laboratories; Genome BC Treenomix project) at 1.67 ng/µl for use in real-time PCR.
For quantitative PCR reactions, 10 ng of cDNA was incubated with 10 µl QuantiTect SYBR Green PCR mastermix (Qiagen) and 30 nmole each of a forward and a reverse primer in a total volume of 20 µl. After an initial denaturation step at 95°C for 15 min, 40 cycles of (95°C for 15 sec, 55°C for 30 sec, and 68°C for 45 sec), each followed by a fluorescence reading, were performed. After a final incubation at 68°C for 5 min, a melting curve was generated ranging from 90°C to 60°C. Threshold cycles were adjusted manually, and the resulting threshold cycles (CT) were subtracted from the CT values obtained for a housekeeping control amplified in parallel on each plate, thus generating normalized CT values (ΔCT). The relative starting quantities of each gene were determined by setting as a base value the gene with the highest CT value, and relative quantities were calculated using the ΔΔCT method as described in Hietala et al. (2003). ΔCT values were calculated after normalization using the following control genes: Arabidopsis adenine phosphoribosyltransferase (APT1; At1g27450) for all Arabidopsis expression experiments, and poplar eukaryotic translation initiation factor 5A-1 (eIF-5A 1; c672, the closest homologue to At1g13950) for all poplar experiments. ΔΔCT was calculated using the following reference tissues: the highest expressing tissue for developmental expression analysis (Figure 3.4), unwounded leaf tissue (Figure 3.5) and unstressed leaf tissue (Figure 3.7). Only intron-spanning primers were used (Table 2.3). Selected reactions were sequenced for quality control.
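The normalization described above can be written out as a short calculation. The following Python sketch assumes the sign convention stated in the text (target CT subtracted from control CT) and assumes perfect doubling of product per cycle; the exact formula of Hietala et al. (2003) may differ, and the function names and CT values are illustrative only.

```python
def delta_ct(ct_target, ct_control):
    """Normalized CT: target CT subtracted from the housekeeping control CT."""
    return ct_control - ct_target


def relative_quantity(ct_target, ct_control, ct_target_ref, ct_control_ref,
                      efficiency=2.0):
    """Relative expression of a sample versus a reference tissue.

    efficiency is the assumed fold amplification per cycle (2.0 means perfect
    doubling); it is an assumption of this sketch, not a measured value.
    """
    dd_ct = (delta_ct(ct_target, ct_control)
             - delta_ct(ct_target_ref, ct_control_ref))
    return efficiency ** dd_ct


if __name__ == "__main__":
    # Invented CT values: the target crosses the threshold two cycles earlier
    # in the treated sample than in the reference, with an unchanged control.
    fold = relative_quantity(ct_target=24.0, ct_control=20.0,
                             ct_target_ref=26.0, ct_control_ref=20.0)
    print(f"fold change relative to reference: {fold:.1f}")
```

With this sign convention, a more highly expressed gene has a lower CT, a larger ΔCT and therefore a positive ΔΔCT, so the relative quantity exceeds 1.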
Table 2.3: Primers used for quantitative and semi-quantitative RT-PCR

Gene               Primer              Sequence (5'-3')
Arabidopsis
ACLL1              RT-CLL3F            GAAGTCCTACTGTGATGAAAGG
                   RT-CLL3R            AGGTTTCATGTCAGGGATGGG
ACLL2              RT-CLL12F           CAAATACAAAGGCTATCAGGTG
                   RT-CLL12R           AGTGTTTGCCGGATGCAGTC
ACLL3              RT-CLL7F            AGGGCCCTTCTATTTCTAAAGG
                   RT-CLL7R            CACGTGGCTAGATTCATATCG
ACLL4              RT-CLL6F (purif)    CAACGGGTATAGGAGCTTCAC
                   RT-CLL6R (purif)    ACTTCTTTGTCCGGAAACGGG
ACLL5              RT-CLL4F            CCTAATGTCCAAGTCCAAGAG
                   RT-CLL4R            CTTCCTCGTCCGGTAACGGC
ACLL6              RT-CLL1F            GGTGCATACCGGAGATCTTGG
                   RT-CLL1R            CAGGATGTGATACAAGAAGACC
ACLL7              RT-CLL2F            GGTCCCGGTGTCATGAAAGGATAC
                   RT-CLL2R            CAGTTGGAGATTTTGGTATAGAGTTGTCC
ACLL8              RT-CLL8F            GGGCCTTCTATCGCCAAAGG
                   RT-CLL8R            CGTGCTACGTAAGCCATCGG
ACLL9              CLL5200F (purif)    CCTGTATCTCCTCCGTTGATTG
                   CLL5200R (purif)    CTCTGTCAAGCCATAGCCCTG
APT1 (control)     APT1 F              GTTGCAGGTGTTGAAGCTAGAGGT
                   APT1 R              TGGCACCAATAGCCAACGCAATAG
Actin1 (control)   AtActin3F           GCGACAATGGAACTGGAAT
                   AtActin3R           GGATAGCATGTGGAAGTGCATACC
Poplar
ACLL1              RT-POP1F            GAATGCGCCAAGAATTTGCCG
                   RT-POP1R            AGGAGGGAGAGGCTTTGCAG
ACLL2              RT-POP16F           GATATGAGGTTCCACGGTCCC
                   RT-POP16R           ACTTGAGACTGATAGTAACTTCC
ACLL3              RT-POP33F           CAGGGAAGCATGCTAACACAGG
                   RT-POP33R           CAGTTTAGAAGTCAGGGAGCAC
ACLL4              RT-POP26F           CATCATCAACTATTGATTCAGAGG
                   RT-POP26R           GGAAATGGTATTACAGCAGCATC
ACLL5              RT-POP27F           ATCGATTCAGAGGGATGGTTAAG
                   RT-POP27R           CAGGAAACGGTATTACAGCAGC
ACLL10             RT-POP17F           CACACGTGGAAATAGTACAG
                   RT-POP17R           CTCATCTCCTACATAGCCTTTC
ACLL11             RT-POP28F           GCCAACTGTCATGAAGGTTATG
                   RT-POP28R           CTCTTCATCAGGATACGGAATC
ACLL12             RT-POP19F           CACAGGCTGAAATAATGCAGGG
                   RT-POP19R           CATCTCCTACATAACCTTC
ACLL13             pop24RT-F           CCAGTGTGTTATGCAAGGTTAC
                   pop24RT-R           TGCCTCTTCATCTGGCAACGG
c672 (control)     C672F               GACGGTATTTTAGCTATGGAATTG
                   C672R               CTGATAACACAAGTTCCCTGC

2.6.2 Semi-quantitative RT-PCR

For the semi-quantitative RT-PCR described in Chapter 4 (Figure 4.1), gene-specific and intron-spanning primers were used in PCR reactions to amplify corresponding cDNA sequences as follows: general PCR conditions were 95°C for 3 min, followed by 30 cycles of (94°C for 30 sec, 57°C
for 30 sec, 72°C for 1 min), followed by 72°C for 3 min, using Taq polymerase in a 25 µl total reaction. PCR products were separated on 1% ethidium bromide agarose gels, and photographed under a UV transilluminator using an AlphaImager 1220. Actin 1 (At2g37620) was used as control.

2.6.3 GUS histochemical assay

The GUS histochemical assay solution was prepared by mixing an aqueous solution of 100 mM NaPO4, pH 7.0, 0.5% X-GLUC (bromochloroindolyl-β-glucuronide) with an aqueous solution of 2 mM K3Fe(CN)6 and 2 mM K4Fe(CN)6 in 0.1% Triton X. Young Arabidopsis leaf blades were wounded with scissors, cut from the plant after 1 h and placed in a 1.5 ml plastic tube with cold GUS solution. The tubes were vacuum infiltrated for 15 minutes. The samples were incubated at 37°C for 2 h or until a blue color could be seen. The reaction was stopped by removal of the assay buffer and the addition of 95% ethanol. Samples were cleared by incubation in 95% ethanol overnight. Stained Arabidopsis leaves were visualized using a Leica dissecting microscope and Spot32 camera and software, at the UBC BioImaging Facility.

2.7 Sub-cellular localization of ArathACLL4 and PoptrACLL5

Agrobacterium strains carrying GFP::ArathACLL4 and GFP::PoptrACLL5 were used to transform tobacco leaf discs to generate transgenic tobacco plants, as described in section 2.3.3. Transgenic plantlets were screened for fluorescence indicating GFP expression under an epifluorescence microscope (Zeiss Axioplan 2), and plants with both high and low levels of GFP expression were selected for analysis by confocal microscopy (Zeiss Meta Confocal). Plants expressing GFP::ArathACLL4 were used as a positive control, and plantlets with no visible GFP fluorescence and wild type tobacco plants were used as negative controls. Chloroplast auto-fluorescence was excited with a 488-nm argon laser and was detected after passage through a long-pass 650-nm emission filter.
GFP fluorescence was excited with a 488-nm laser and was detected after passage through a band-pass 505-530-nm emission filter. Images were reconstructed using the ImageJ software suite.

2.8 Identification and characterization of an ACLL5 insertion mutant

2.8.1 Genetic methods

Seeds for an ArathACLL5 (At1g62940) transposon insertion line (stock code N123936; synonym SM_3.37225) were obtained from the European Arabidopsis Stock Centre (NASC) / Arabidopsis Biological Resource Center. Homozygous insertion lines were identified by PCR-based screening for both the presence of the transposon insertion and the absence of an intact endogenous gene. Primers are listed in Table 2.4. Primer combinations used were as follows: CLL4F and EcoRI reverse for detection of the endogenous gene, and CLL4F and dspm1 for detection of the transposon insertion. PCR analysis confirmed homozygosity for the insertion in all plants displaying the male-sterile phenotype. The PCR fragments generated by CLL4F and dspm1 were sequenced to determine the exact location of the insertion in the ACLL5 gene. Real-time PCR (section 2.6.1) and RT-PCR (section 2.6.2) were used to determine mRNA levels in the mutants. Genetic crosses of wild-type pollen to a homozygous acll5 mutant plant were performed to obtain F2 generation plants. The segregation pattern of the ACLL5 transposon insertion in the F2 generation was tested by chi-square statistical analysis of observed phenotypes and genotypes.
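As an illustration of the segregation test mentioned above, the following Python sketch computes a chi-square goodness-of-fit statistic against a Mendelian expectation. It is a minimal sketch under stated assumptions, not the procedure actually used: the expected ratios (3:1 for phenotypes and 1:2:1 for genotypes, as for a single recessive insertion), the function name and the plant counts are all hypothetical. The p-value uses the closed-form chi-square survival functions for one and two degrees of freedom only.

```python
import math


def chi_square_gof(observed, expected_ratio):
    """Chi-square goodness-of-fit test of observed counts against a ratio.

    observed:        list of counts per class, e.g. [fertile, sterile]
    expected_ratio:  list of the same length, e.g. [3, 1] or [1, 2, 1]
    Returns (chi2, p_value). Only 2 or 3 classes are supported here.
    """
    if len(observed) != len(expected_ratio):
        raise ValueError("observed and expected_ratio must have equal length")

    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    df = len(observed) - 1
    if df == 1:
        # Survival function of chi-square with 1 degree of freedom.
        p_value = math.erfc(math.sqrt(chi2 / 2.0))
    elif df == 2:
        # Survival function of chi-square with 2 degrees of freedom.
        p_value = math.exp(-chi2 / 2.0)
    else:
        raise ValueError("p-value only implemented for 2 or 3 classes")
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical F2 counts: 80 fertile and 20 male-sterile plants vs 3:1.
    chi2, p = chi_square_gof([80, 20], [3, 1])
    print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

A p-value above the chosen significance threshold would indicate that the observed counts do not deviate significantly from the assumed ratio.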
Table 2.4: Primers used for genotyping ACLL5 transposon insertion lines

Gene          Primer          Sequence (5'-3')
ArathACLL5    CLL4F           ATGGAGAGTCAAAAGCAAGAAGATAATG
              EcoRI reverse   CATTGTCGGTATCTCCGCATTTGTC
transposon    dspm1           CTTATTTCAGTAAGAGTGTGGGGTTTTGG

2.8.2 Phenotypic analysis of the acll5-1 mutant

For scanning electron microscopy observations, using a Hitachi S4700 SEM, inflorescences of wild type and homozygous mutant lines were fixed overnight in 2% glutaraldehyde, washed and post-fixed in 1% osmium tetroxide in 0.05 M PIPES buffer, and dehydrated using a series of graded ethanol solutions (30% to 100%). Dried samples were gold sputter coated (Nanotech SEMPrep II Sputter Coater). To obtain cross sections of developing anthers, inflorescences of wild type and mutant lines were fixed in FAA (4% paraformaldehyde, 15% acetic acid, and 50% ethanol) overnight and directly dehydrated without post-fixation. Samples were then transferred to a propylene oxide solution and slowly infiltrated with Spurr's epoxy resin (Canemco). For bright-field microscopy, 1 µm sections were cut with glass knives (Leica) on a microtome, mounted on glass slides, heat fixed to the slides and stained with toluidine blue. Sections were photographed using a light microscope. All procedures described were performed in the UBC BioImaging Facility.

2.8.3 In situ hybridization (Experiment performed by S. McKim, UBC)

Arabidopsis Col-0 inflorescences were embedded in Paraplast (Sigma), sectioned at 8 µm thickness and mounted onto pre-charged slides. For antisense AtCLL5 probe synthesis, a 1629 bp DNA template corresponding to the entire AtCLL5 cDNA was amplified by PCR from flower cDNA using the forward primer CLL4F (Table 2.4) and the reverse primer 5'-GATAATACGACTCACTATAGGCTACTTCTTGTTGATGCTGAGGATC-3', which incorporates a T7 polymerase binding site.
Digoxigenin (DIG)-labeled probes were transcribed off the template using T7 polymerase (Roche). Probes were shortened to 200 bp fragments by limited carbonate hydrolysis, quantified and hybridized to slides. Tissue fixation, embedding, probe design, hybridization and signal detection are described in Hooker et al. (2002).

CHAPTER 3 - GENOME-WIDE PHYLOGENETIC ANALYSIS AND COMPARATIVE GENOMICS OF THE PLANT-SPECIFIC ACYL:COENZYME A LIGASE-LIKE (ACLL) GENE FAMILY IN ARABIDOPSIS AND POPLAR

3.1 Introduction

4-Coumarate:CoA ligases (4CLs) play important roles in phenylpropanoid metabolism by generating CoA esters of hydroxycinnamic acids. These hydroxycinnamoyl-CoA esters are used as intermediates in the biosynthesis of a large array of phenolic secondary natural products, including monolignols and flavonoids (Hahlbrock and Scheel, 1989). The first 4CL gene cloned was derived from parsley (Douglas et al., 1987), and it has subsequently been shown that 4CL enzymes are encoded by multi-gene families in all vascular plants examined to date (Cukovic et al., 2001; Hamberger et al., in press). Analysis of the enzymatic properties of recombinant enzymes has revealed that 4CL isoenzymes differ in their activity towards different hydroxycinnamate substrates (Allina et al., 1998; Ehlting et al., 1999; Hamberger and Hahlbrock, 2004; Hu et al., 1998; Lee and Douglas, 1996; Lindermayr et al., 2002; Stuible and Kombrink, 2001). The analysis of the 4CL gene family in the fully sequenced Arabidopsis (Ehlting et al., 1999; Hamberger and Hahlbrock, 2004) and poplar (Tuskan et al., 2006) genomes showed that 4CL is encoded by four and five genes, respectively.
Differential 4CL gene expression patterns in Arabidopsis and poplar, coupled with 4CL isoenzyme substrate utilization preferences, suggest that 4CL genes and enzymes have undergone subfunctionalization and neofunctionalization for biosynthesis of monolignols and flavonoids (Ehlting et al., 1999; Hamberger and Hahlbrock, 2004; Hu et al., 1998).

4CL enzymes are members of the adenylate-forming enzyme superfamily, whose members share a common reaction involving formation of an adenylate intermediate, and which includes enzymes involved in fatty acid chain elongation (Shockey et al., 2003). Following the generation of sequence data from plant genomes, a number of genes encoding adenylate-forming enzymes of unknown specific function related to true 4CLs (4CL-like, ACLL genes) have been annotated, and these may encode as yet uncharacterized enzymes of plant primary and secondary metabolism. For example, initial Arabidopsis genome sequence data revealed the presence of Arabidopsis 4CL-like genes (Cukovic et al., 2001), and later, eight members of a larger set of Arabidopsis adenylate-forming enzymes annotated in the completed Arabidopsis genome were classified as 4CL-like genes because of their close phylogenetic relationship to true 4CLs (Shockey et al., 2003; this study). It has been proposed, however, that enzymes encoded by 4CL-like genes may not have activity towards the known 4CL substrates, but instead activate acyl molecules derived from fatty acid metabolism (Shockey et al., 2003).

Jasmonic acid (JA) is an important plant signaling molecule, generated from the membrane lipid linolenic acid via the octadecanoid pathway (Liechti and Farmer, 2002). JA and its volatile methyl ester, methyl jasmonate (MeJA), are known to be important plant growth regulators and stress signaling molecules mediating responses to various developmental and environmental cues, such as wounding and herbivory (Farmer et al., 2003; Li et al., 2005; Sasaki-Sekimoto et al., 2005).
The role of JA in regulation of gene expression has been well documented, and many of the enzymatic steps in its biosynthesis have been characterized (Schaller et al., 2004; Schilmiller et al., 2007). The later steps of the octadecanoid pathway occur in the peroxisome, in which the activated CoA ester of the plastid-derived precursor 12-oxo-phytodienoic acid (OPDA) is generated, followed by three rounds of acyl chain shortening by beta-oxidation (Li et al., 2005). Therefore, enzymes involved in this part of the pathway are predicted to be targeted to this organelle, likely via a C-terminal peroxisomal target signal (PTS1) or an N-terminal signal (PTS2), as deduced from sequence analysis of plant peroxisome proteins (Reumann, 2004) and previously reported for in vivo import in Trypanosoma brucei (Sommer et al., 1992). Proposed substrates for certain Arabidopsis ACLLs derive from the octadecanoid pathway, including OPDA and 3-oxo-2-(2'[Z]-pentenyl)-cyclopentane-1-octanoic acid (OPC8) (Koo et al., 2006; Schneider et al., 2005). However, the biological functions of most of the 4CL-like genes are still unknown.

There is an increasing amount of publicly available gene expression data in Arabidopsis, such as expression data generated by various microarray experiments. These data are searchable using bioinformatic tools such as Expression Angler (Toufighi et al., 2005) and PRIMe. Therefore, networks of co-expressed genes can be visualized by mining existing gene expression data. For enzyme-encoding genes, such co-expression analysis can provide clues regarding possible metabolic pathways to which enzymes of unknown function may belong, based on their correlated co-expression with genes encoding other enzymes (Ehlting et al., 2006).
In addition, the completion of the rice (Yuan et al., 2003) and poplar (Tuskan et al., 2006) genomes, together with the reference genome of Arabidopsis, opens the door to the application of comparative genomic approaches to understanding the evolution and potential functions of conserved genes of unknown specific function such as those in the ACLL gene family.

In an initial analysis, the complete set of Arabidopsis ACLL genes, formerly called 4CLL genes, was identified based on their similarity to genes encoding bona fide 4CL enzymes (Ehlting et al., 2005). In this chapter, I identified all ACLL genes in the fully sequenced poplar and rice genomes. In addition, I obtained full-length ACLL sequences from a maize genome database (sequences provided by Dr. Brad Barbazuk), retrieved nucleotide sequences from publicly available plant genome databases, and searched eukaryotic and prokaryotic genome databases for ACLL genes in diverse taxa. Phylogenetic reconstructions based on amino acid sequence alignments showed that ACLL genes belong to a land plant-specific clade of adenylate-forming enzymes more closely related to true 4CLs than to any other adenylate-forming enzyme. Furthermore, each fully sequenced plant genome has representatives in each of five well-defined ACLL clades, four of which contain proteins predicted to be localized in the peroxisome. This suggests that ACLL enzymes perform important, conserved roles in plant metabolism. I profiled the developmental and stress-induced expression of Arabidopsis and poplar homologues representing all five clades. Similarities in expression patterns across these taxa allowed me to identify putative orthologues, and suggested subfunctionalization of ACLL genes in these two lineages.
In addition, using Arabidopsis co-expression analysis, I was able to predict the function of poplar homologues in one ACLL clade related to JA biosynthesis, a hypothesis that was further tested by monitoring stress-induced gene expression and subcellular localization.

3.2 Results

3.2.1 Phylogenetic analysis of 100 ACLLs from plants and microorganisms

The adenylate-forming enzyme superfamily includes members from all organisms, including prokaryotes and eukaryotes (Conti et al., 1996), which are distinguished by the presence of conserved structural elements that define this superfamily (Conti et al., 1997). The phenylpropanoid enzyme 4CL is one member of this family that has been extensively studied due to its important role in the phenylpropanoid pathway. As a first step towards a genome-wide survey of ACLL genes most closely related to 4CLs, I identified 100 predicted 4CL-like (ACLL) proteins from genomic databases using an in silico similarity search based on the amino-acid sequences of Arabidopsis 4CL proteins (Ehlting et al., 1999; Hamberger and Hahlbrock, 2004), as described in Materials and Methods. In this analysis, I focused on the three complete genome sequences available for angiosperms (Arabidopsis, poplar, and rice), the genomes of maize, Physcomitrella patens, Chlamydomonas reinhardtii, and the genomes of selected other microorganisms (fungi and bacteria) for which complete or substantial genome sequence data were available.

In order to determine if a plant-specific clade of ACLL enzymes could be circumscribed, I carried out phylogenetic analysis of all 100 aligned ACLL translated nucleotide sequences using PhyML 4 (Guindon and Gascuel, 2003) and generated the maximum likelihood tree shown in Figure 3.1. This analysis revealed two general groups of ACLL proteins.
One large group contained representatives from all organisms analyzed, including bacteria, fungi, Chlamydomonas, Physcomitrella, and angiosperms, which are relatively distantly related to true 4CLs. Many representatives in this group are probably adenylate-forming enzymes with metabolic functions in primary metabolism or related functions common to all or many prokaryotic and eukaryotic cells. As an example of possible functions of such enzymes, one clade in this group contains the Arabidopsis ACN1 protein, an acetate:CoA ligase which functions as an entry point to the glyoxylate cycle during seed germination (Turner et al., 2005). I did not carry out any further analyses of plant genes or enzymes in this large group.

Figure 3.1: Phylogenetic tree of 100 ACLLs corresponding to translated nucleotide sequences from full-length cDNAs and ESTs from various organisms. Solid triangles represent the clades of plant-specific ACLLs most closely related to true 4CL enzymes.

The second group of ACLL proteins, containing previously annotated Arabidopsis 4CL-like (4CLL) proteins (Ehlting et al., 2005; Shockey et al., 2003), was striking in that it was land plant-specific. All angiosperm genomes, as well as the Physcomitrella genome, encoded proteins contained in this group, while no representatives from other eukaryote species, including Chlamydomonas, were found. Based on their deduced phylogenetic relationships to each other, the ACLLs in this group could be further divided into five clades (Figure 3.1; clades A to E), which are phylogenetically closely related to bona fide 4CLs.
In each of the clades there is at least one representative of each of the four angiosperm plant species analyzed (Arabidopsis, poplar, rice, and maize; data not shown), demonstrating that these proteins are evolutionarily conserved in the angiosperm lineage and that common ancestors in each clade were present before the divergence of monocots and eudicots.

Analyses of the Physcomitrella EST dataset revealed an ACLL protein that forms a monophyletic group with true 4CLs, suggesting that the 4CL clade likely originated early during the evolution of land plants, consistent with the postulated role of phenylpropanoids in the adaptation to the land environment (Douglas, 1996). Interestingly, a second Physcomitrella ACLL protein in this group appears basal to clades C, D and E, suggesting that the proteins from these clades originated from the same common ancestor, also early in land plant evolution. Completion of the Physcomitrella genome sequence should reveal whether additional Physcomitrella ACLLs exist.

3.2.2 Most ACLLs contain the PTS1 (Peroxisomal Target Signal 1)

Almost all proteins in clades B, C, D and E contain the PTS1 peroxisomal target sequence at their C-termini, which suggests that they are targeted to this organelle. To my knowledge, ArathACLL9 from clade E and ArathACLL4 from clade D are the only enzymes for which this localization has been experimentally demonstrated, using a GFP-tagging approach (Koo et al., 2006; Schneider et al., 2005). Interestingly, all fungal adenylate-forming enzymes identified also have peroxisomal target signals. None of the ACLLs in clade A, the ACLL clade most closely related to bona fide 4CLs, contained the PTS1 sequence, suggesting that loss of this sequence may have played a role in the acquisition of 4CL and clade A functions.
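A screen for a C-terminal PTS1 of the kind discussed above can be sketched as a simple pattern match on the last three residues of a protein. The sketch below is illustrative only and is not the prediction method used in this study: the pattern `[SAC][KRH][LM]` is an assumed, simplified consensus built around the canonical SKL tripeptide, the function name is hypothetical, and the example sequences are invented. Real PTS1 prediction also weighs upstream residues and non-canonical tripeptides.

```python
import re

# Assumed, simplified PTS1 consensus anchored at the C-terminus:
# [S/A/C] - [K/R/H] - [L/M]. The canonical example is SKL.
PTS1_PATTERN = re.compile(r"[SAC][KRH][LM]$")


def has_pts1(protein_seq):
    """Return True if the protein ends in a tripeptide matching the pattern."""
    # Normalize case and drop whitespace or a trailing stop symbol ('*').
    seq = protein_seq.strip().upper().rstrip("*")
    return bool(PTS1_PATTERN.search(seq))


if __name__ == "__main__":
    # Invented sequences, for illustration only.
    examples = {
        "ends_in_SKL": "MEKSGYGRDGIYRSLRPPLSKL",
        "internal_SKL_only": "MEKSKLGRDGIYRSLRPPLVEA",
    }
    for name, seq in examples.items():
        print(name, has_pts1(seq))
```

Anchoring the pattern with `$` is the key design choice here, since a PTS1-like tripeptide is only expected to act as a targeting signal at the extreme C-terminus.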
3.2.3 Species-specific ACLL gene family evolution

In order to gain insights into species-specific retention and expansion of the plant-specific ACLL genes for each clade, I next analyzed each clade in more detail, focusing on the complete ACLL gene families from the three angiosperm genomes for which whole genome sequence information is available: Arabidopsis, poplar, and rice. As shown in Figure 3.2, all three species contained ACLL proteins in each of clades A-E. My annotation of the complete set of ACLL genes in poplar, and their locations on poplar linkage groups, is given in Table 3.1.

Figure 3.2: Phylogenetic relationship of plant-specific ACLLs, including translated nucleotide sequences from Arabidopsis, poplar and rice. Bootstrap values are shown on the branches. Values below 70% were removed from the tree. Clades are circled and contain at least one representative of each plant species. Proteins in shaded boxes contain the PTS1 (Peroxisomal Target Signal 1).
Table 3.1: Annotation of Populus trichocarpa and Oryza sativa ACLL genes.

Poplar*
Gene name     Clade  Gene model                          Location      Coordinates
PoptrACLL1    B      eugene3.01230068                    scaffold 123  568900-572342
PoptrACLL2    B      estEXT_fgenesh1_pg_v1.C_LG_IV0024   LG IV         8408652-8412261
PoptrACLL3    C      fgenesh4_pg.C_LG_III000781          LG III        9834289-9841997
PoptrACLL4    D      eugene3.00020113                    LG II         738720-743055
PoptrACLL5    D      fgenesh4_pm.C_LG_V000686            LG VIII       17267514-17272723
PoptrACLL6    E      fgenesh4_pm.C_LG_X000932            LG X          19581039-19583937
PoptrACLL7    E      fgenesh4_pm.C_LG_VIII000094         LG VIII       1492142-1494543
PoptrACLL8    E      eugene3.00640074                    scaffold 64   517028-522599
PoptrACLL9    E      fgenesh4_pm.C_LG_X000174            LG X          6952628-6055525
PoptrACLL10   E      estEXT_fgenesh4_pm.C_LG_XV0272      LG XV         6923655-6933433
PoptrACLL11   E      eugene3.00120875                    LG XII        11162718-11168082
PoptrACLL12   E      grail3.0015024001                   LG XII        11177018-11180781
PoptrACLL13   A      eugene3.00010460                    LG I          3983954-3987069

Rice
Gene name:   OrysaACLL1, OrysaACLL2, OrysaACLL3, OrysaACLL4, OrysaACLL5, OrysaACLL6, OrysaACLL7, OrysaACLL8, OrysaACLL9, OrysaACLL10, OrysaACLL11, OrysaACLL12
Clade:       4CL, 4CL, 4CL, B, C, C, C, D, E, E, E, N/D**, A
Gene model:  Os08g34790, Os02g08100, Os06g44620, Os03g05780, Os10g42800, Os08g04770, Os03g04000, Os01g67530, Os01g67540, Os07g17970, Os07g44560, Os04g24530

*Gene model and locations from JGI Populus trichocarpa web browser v. 1.1
**not defined by phylogenetic results

While the number of ACLL genes within each genome was similar (13 in poplar, 12 in rice, and 9 in Arabidopsis), the number of genes in each clade varied between species, and certain clades were greatly enriched with genes from a particular species. For example, clade D is an Arabidopsis-rich clade, with five Arabidopsis representatives, two from poplar, and only one rice member. On the other hand, clade E is poplar-rich, with seven poplar ACLL genes, three from rice and one from Arabidopsis.
Clade A, unique in containing ACLL proteins lacking the PTS1 targeting signal, is the only clade that contained a single representative from each species. Thus, while the origin of the ACLL clades clearly predates the divergence of the monocot and dicot lineages, the variable numbers of genes in most clades reveal species-specific genome evolution. For example, four of the five Arabidopsis genes in clade D are found in tandem on chromosome 1 (Figure 3.3), which suggests that the expansion of clade D in Arabidopsis resulted from tandem gene duplication events and selection for retention of the copies in the Arabidopsis genome. Two of the poplar ACLL genes in clade E (ACLL11 and ACLL12) appear to have arisen by tandem duplication on linkage group XII (Table 3.1). However, other members of this and other clades that have poplar gene models anchored to linkage groups are physically unlinked and on different linkage groups. This suggests that tandem gene duplication was not the only factor responsible for the diversification of the poplar ACLL gene family, in keeping with the apparent greater role of tandem gene duplications in Arabidopsis genome evolution relative to poplar (Tuskan et al., 2006). Many of the poplar ACLL genes may rather have been retained after the salicoid whole genome duplication in the Populus lineage, in which chromosome doubling and subsequent rearrangement is thought to have increased the Populus chromosome number from n=10 to the current n=19 (Tuskan et al., 2006). For example, in the poplar-rich clade E, ACLL10, ACLL11 and ACLL12, which are located on duplicated homologous linkage groups XII and XV, and ACLL6 and ACLL7, which are located on duplicated homologous linkage groups XIII and X (Table 3.1; Tuskan et al., 2006), are likely to have arisen in this manner.
Figure 3.3: Schematic representation of the Arabidopsis thaliana genome, showing the location of ACLL genes with their respective clades. Clade D is the only clade containing more than one ACLL in Arabidopsis, with four genes that originated by tandem duplication on chromosome 1 and one gene on chromosome 5.

Also noteworthy amongst the poplar clade E ACLLs is the loss of C-terminal PTS1 peroxisomal targeting sequences in two members (ACLL6 and ACLL12), suggesting that functional diversification, or neofunctionalization, may have taken place at the level of enzyme localization in the poplar lineage after gene duplication. Taken together, these data show two scenarios in the evolution of the ACLL genes: conservation of ACLL gene number, and likely function, in all three angiosperm lineages for some clades (A, B, and C, with 1-2 members from each lineage), and family expansion with possible diversification of function taking place in a lineage-specific manner for other clades (D and E).

3.2.4 Comparative analysis of Arabidopsis and poplar 4CL and ACLL proteins

In order to assess the amino acid sequence diversity of ACLL proteins relative to that of well-characterized 4CL gene family members, the levels of amino-acid sequence conservation among poplar and Arabidopsis 4CL enzymes were compared to the levels of ACLL protein sequence conservation in each clade. Similar levels of interspecies identity may indicate retention of function among ACLL enzymes between lineages, as observed for the enzymes in the bona fide 4CL clade. Table 3.2 shows the identity values of Arabidopsis and poplar 4CLs and ACLLs in relation to each other for each clade. Identity was calculated for both the overall sequence (OS) and the putative substrate binding domain region (SBD), and values are shown as OS%/SBD%.
Identity values between Arabidopsis and poplar 4CLs, which use the same or similar substrates, are over 65%/70%, suggesting that ACLLs with similar or higher levels of sequence conservation may also have conserved functions. The data show that OS%/SBD% identity values were 75%/81% between ArathACLL5 and PoptrACLL13 of clade A, the highest level of conservation seen between poplar and Arabidopsis proteins in this study. In clade B, with one Arabidopsis and two poplar genes, values were 74%/75% and 74%/73%, also demonstrating high conservation of sequence. This result may indicate that ACLLs in clades A and B have conserved functions in both poplar and Arabidopsis.

In clade C, with one ACLL copy each in Arabidopsis and poplar, the sequence conservation level was 58%/65%, suggesting that genes in this clade have diverged and could carry out species-specific functions. In clade D, enriched with Arabidopsis sequences, identity values between poplar and Arabidopsis sequences were low, comparable to those of clade C, with the exception of ArathACLL4, which reached values of 72%/75% when compared to the two poplar proteins in this clade, PoptrACLL4 and PoptrACLL5. This result may suggest that ArathACLL4 function has been conserved in both species, while other Arabidopsis ACLLs in the same clade may have diverged in function. For clade E, which contains seven poplar genes and one Arabidopsis gene, the three poplar homologues most similar to ArathACLL9 were analyzed, and the results showed that amino-acid identity values were generally low. The highest values were between ArathACLL9 and PoptrACLL11 (62%/73%), dropping to 59%/66% for PoptrACLL10 and 55%/64% for PoptrACLL12. With the exception of PoptrACLL11, the results suggest that the poplar ACLLs in clade E may have diverged to fulfill other biochemical and/or biological roles in a species-specific manner.
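The OS%/SBD% values discussed above are pairwise percent identities over a whole alignment and over a sub-region of it. The Python sketch below shows one common way to compute such values from two pre-aligned sequences. It rests on assumptions that are not stated in the text: identity is taken as matches divided by the number of alignment columns in which neither sequence has a gap, the sub-region is given as alignment column indices, and the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, start=0, end=None):
    """Percent identity between two aligned sequences (gaps written as '-').

    start/end select a slice of alignment columns, e.g. a putative
    substrate binding domain; by default the whole alignment is used.
    Columns where either sequence has a gap are ignored (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    columns = zip(aligned_a[start:end], aligned_b[start:end])
    compared = [(a, b) for a, b in columns if a != "-" and b != "-"]
    if not compared:
        return 0.0

    matches = sum(1 for a, b in compared if a.upper() == b.upper())
    return 100.0 * matches / len(compared)


if __name__ == "__main__":
    # Invented toy alignment, for illustration only.
    seq_a = "ACDE-FG"
    seq_b = "ACDQKFG"
    overall = percent_identity(seq_a, seq_b)
    region = percent_identity(seq_a, seq_b, start=3, end=7)
    print(f"{overall:.0f}%/{region:.0f}%")
```

Other conventions (for example dividing by the length of the shorter sequence) give slightly different percentages, so the denominator should always be reported alongside the values.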
Table 3.2: Amino acid identity comparison of full-length amino acid sequences of 4CLs and ACLLs in the different clades. Results of pairwise similarity are shown as full-length sequence% / predicted substrate binding domain%.

4CLs
            Arath4CL2  Arath4CL3  Arath4CL4  Poptr4CL1  Poptr4CL2  Poptr4CL3  Poptr4CL4  Poptr4CL5
Arath4CL1   78%/87%    56%/67%    62%/72%    66%/77%    65%/76%    58%/70%    67%/73%    68%/74%
Arath4CL2              60%/66%    63%/71%    69%/75%    68%/75%    60%/69%    67%/74%    69%/74%
Arath4CL3                         55%/61%    62%/66%    62%/67%    69%/75%    59%/66%    69%/66%
Arath4CL4                                    59%/68%    60%/69%    52%/62%    60%/64%    62%/66%
Poptr4CL1                                               85%/85%    63%/70%    72%/76%    74%/79%
Poptr4CL2                                                          62%/67%    72%/77%    74%/79%
Poptr4CL3                                                                     60%/70%    60%/70%
Poptr4CL4                                                                                89%/90%

Clade A
             PoptrACLL13
ArathACLL5   75%/81%

Clade B
             PoptrACLL1  PoptrACLL2
ArathACLL6   74%/75%     74%/73%
PoptrACLL1               91%/88%

Clade C
             PoptrACLL3
ArathACLL7   58%/65%

Clade D
             ArathACLL2  ArathACLL3  ArathACLL4  ArathACLL8  PoptrACLL4  PoptrACLL5
ArathACLL1   58%/50%     59%/58%     65%/63%     53%/56%     63%/66%     61%/62%
ArathACLL2               83%/75%     64%/65%     70%/73%     60%/60%     61%/57%
ArathACLL3                           67%/69%     71%/72%     65%/63%     64%/61%
ArathACLL4                                       60%/67%     72%/75%     72%/74%
ArathACLL8                                                   61%/63%     60%/60%
PoptrACLL4                                                               88%/90%

Clade E
              PoptrACLL10  PoptrACLL11  PoptrACLL12
ArathACLL9    59%/66%      62%/73%      55%/64%
PoptrACLL10                87%/85%      72%/67%
PoptrACLL11                             72%/74%

3.2.5 Comparative expression analysis of Arabidopsis and poplar genes

In order to gain clues as to the possible functions of ACLL proteins, I examined the gene expression patterns of all Arabidopsis ACLL genes, as well as representative poplar genes in clades A-E, by quantitative real-time reverse transcription-PCR. Data on expression in different organs are shown in Figure 3.4.
These results revealed that Arabidopsis and poplar ACLL homologues tended to have similar developmental expression patterns in clades in which there are single Arabidopsis and poplar ACLL representatives. A striking example is clade A, in which ArathACLL5 expression was strongly flower-preferred, and PoptrACLL13, while also showing expression in phloem and bark, showed a similar pattern of flower-preferred expression. Interestingly, PoptrACLL13 expression is specific to male flowers, and a putative ArathACLL5 orthologue in tobacco shows an anther-preferred expression pattern (Varbanova et al., 2003). Together, these data suggest a role for ACLL enzymes in clade A in a biochemical pathway important in anther and/or pollen development. Another example is the predominant expression of both Arabidopsis and poplar representatives of clade C in leaves, with less predominant expression in stem/xylem/phloem and flowers. Clade B contains two poplar genes and a single Arabidopsis member. Genes in this clade were expressed in all organs, but both PoptrACLL1 and ArathACLL6 showed highest expression in mature leaves, and lower expression in other organs and tissues. Interestingly, the PoptrACLL1 expression profile differed from that of PoptrACLL2, which showed highest expression in flowers, bark, and young leaves, suggesting subfunctionalization of expression patterns, which is predicted as one possible outcome for genes retained after duplication events (Duarte et al., 2006).

More complex expression patterns were observed in clades where substantial expansion of gene family members in either Arabidopsis or poplar has occurred. In clade D, the duplicated poplar genes PoptrACLL4 and PoptrACLL5 appeared to have very similar expression patterns across a range of organs and tissues, with low expression only in xylem and male flowers (Figure 3.4).
However, the transcribed portions of the two poplar genes were nearly identical, making it impossible to design gene-specific PCR primers, and the products amplified using primers for each gene were contaminated with products from the other gene (data not shown). In contrast, for the five Arabidopsis members of this clade, I saw distinct and complementary expression patterns throughout almost all organs tested, which suggests subfunctionalization in expression of these duplicated genes in clade D. The only Arabidopsis gene from clade E, ArathACLL9, was most highly expressed in seedlings, followed by flowers. The expression patterns of the three poplar homologues most closely related to ArathACLL9 (out of the seven poplar genes present in this group) were largely complementary to each other, covering expression in leaves, roots and male flowers, and did not parallel that of ArathACLL9.

Tissue abbreviations. Arabidopsis (ArathACLL): 7d, 7-day seedling; YL, young leaves; ML, mature leaves; YS, young stems; MS, mature stems; R, roots; F, flowers. Poplar (PoptrACLL): LJ, young leaves; LM, mature leaves; P, petiole; B, bark; Ph, phloem; X, xylem; RJ, young roots; RM, mature roots; MF, male flowers; FF, female flowers.

Figure 3.4: Tissue expression profile of ACLLs in Arabidopsis and poplar. Expression was determined by real-time PCR relative to the tissue with the highest level of expression, set at 100%. Calibrator genes used were APT1 for Arabidopsis and c672 for poplar. Two technical replicates per tissue were tested.
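The normalization described in the caption of Figure 3.4 can be illustrated with a short sketch. It assumes the common 2^-dCt calculation with 100% amplification efficiency, which the text does not specify. The function name, tissue labels and Ct values are hypothetical and are not data from this study.

```python
from statistics import mean


def relative_expression(ct_target, ct_calibrator):
    """Express a gene in each tissue as a percentage of its highest-expressing tissue.

    Both arguments map a tissue label to a list of technical-replicate Ct values.
    Assumes 2^-dCt quantification (an assumption, not stated in the text).
    """
    raw = {}
    for tissue, cts in ct_target.items():
        # Normalize the target gene to the calibrator gene in the same tissue.
        delta_ct = mean(cts) - mean(ct_calibrator[tissue])
        raw[tissue] = 2.0 ** (-delta_ct)
    highest = max(raw.values())
    # Scale so that the tissue with the highest expression equals 100%.
    return {tissue: 100.0 * value / highest for tissue, value in raw.items()}


# Hypothetical Ct values, two technical replicates per tissue.
target = {"F": [20.0, 20.0], "ML": [22.0, 22.0], "R": [23.0, 23.0]}
calibrator = {"F": [18.0, 18.0], "ML": [18.0, 18.0], "R": [18.0, 18.0]}
profile = relative_expression(target, calibrator)
```

With these made-up values, the tissue with the lowest calibrator-normalized Ct (here "F") is assigned 100% and the remaining tissues are expressed as fractions of it.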
3.2.6 Identification of stress responsive ACLL genes

One important clue in the quest towards identifying biological roles for the ACLLs came from the presence of the peroxisomal target signal (PTS1) in most of the Arabidopsis, rice, and poplar ACLLs in clades B, C, D and E. Given that the ACLLs in this study are part of a plant-specific group (Figure 3.1), I considered possible plant-specific peroxisomal functions. For example, plant peroxisomes, in addition to ubiquitous functions in primary metabolism, perform special roles in synthesizing plant hormones such as auxin and jasmonates (Nyathi and Baker, 2006). Jasmonates, in particular, play important roles in response to stress (Farmer et al., 2003). Therefore, ACLL genes encoding proteins targeted to the peroxisome were subjected to further expression analyses in order to gain insights into possible environmental influence on their expression.

I generated transgenic Arabidopsis plants expressing chimeric constructs of selected Arabidopsis ACLL promoters fused to GUS reporter genes, and then analyzed GUS activity histochemically. Promoter regions were defined as the genomic sequences directly upstream of the ATG start codon, between 1.5 and 2 kb in length. At least five independently transformed lines were generated per gene, and at least five individuals of each transgenic Arabidopsis line were examined. Transgenic plants were subjected to mechanical wound treatments as described in Material and Methods, and representative results that were consistently observed are shown in Figure 3.5A. Out of all plants tested, GUS expression was stronger at the wound sites of transgenic plants containing promoter constructs from clade D genes (ACLL2, ACLL3 and ACLL4), indicating activation of these ACLL promoters upon wounding. Promoters of ACLL genes in clades B, C and E, and of ACLL1 and ACLL8 in clade D, were not wound activated in this assay.
Wound responsiveness of genes encoding peroxisomally targeted ACLLs was further tested by measuring gene expression using quantitative real-time reverse transcription PCR at various times after mechanical wounding in both Arabidopsis and poplar (Figure 3.5B). Arabidopsis 4CL2, known to be wound inducible (Ehlting et al., 1999), was used as a positive control for Arabidopsis treatments, and was up-regulated by over 5-fold at 4 h after wounding, as expected (data not shown).

In clade B, ArathACLL6 and PoptrACLL2 showed no wound responsiveness, whereas PoptrACLL1 expression was slightly up-regulated (1.6-fold) 6 h after wounding. These data suggest that ArathACLL6 and PoptrACLL2 do not function in wound-related biochemical pathways, but that PoptrACLL1 may have gained this function after duplication of the gene in the Populus lineage. In clade C, ArathACLL7 expression was down-regulated to less than half the level of the unwounded control, and a similar result was obtained for the single poplar homologue in this group, PoptrACLL3. Arabidopsis and poplar genes in clade D were particularly responsive to wounding, with two out of five Arabidopsis ACLL genes showing wound induction, up to 14-fold 1 h after wounding, and the poplar homologues PoptrACLL4/5 induced by up to 5-fold 2 h after wounding. ArathACLL4, which has been shown to be an OPDA:CoA ligase (Koo et al., 2006) and showed strong wound induction of the ArathACLL4::GUS reporter gene, showed no wound induction by real-time PCR. A possible reason for this may be the time points chosen for harvesting the tissue after wounding, which could have missed the "window" of transient up-regulated expression of this gene.
ArathACLL1, for which no developmental expression was detected in leaves by reverse transcription PCR (Figure 3.4), showed only weakly detectable expression on the basis of multiple promoter-GUS lines (Figure 3.5A), and no expression above background levels was detectable by reverse transcription PCR after wounding (data not shown).

Finally, in clade E, ArathACLL9 expression was down-regulated after 1 h and expression stayed below the original levels even at 24 h. Of the three closest poplar homologues, PoptrACLL11 showed a similar expression pattern, whereas PoptrACLL10 and PoptrACLL12 showed little or no change in expression in response to wounding. The enzyme encoded by ArathACLL9 has been shown to have activity in vitro with octadecanoid precursors in JA biosynthesis, suggesting that it may be involved in JA biosynthesis in the peroxisome (Schneider et al., 2005). However, the lack of wound-inducible expression suggests that ArathACLL9 and its closest poplar homologues are not involved in stress-induced octadecanoid metabolism.

Overall, the results from promoter-GUS expression and real-time PCR showed consistent wound-induced up-regulation for both the poplar gene members and certain Arabidopsis members of clade D, whereas genes in other clades showed little or no wound responsiveness. This suggests, as shown for ArathACLL4 (OPDA:CoA ligase; Koo et al., 2006), that members of clade D are likely to have functions in stress response pathways localized in the peroxisome.

Figure 3.5: Effect of wound stress on peroxisomal ACLLs.
(A) Histochemical GUS staining in transgenic Arabidopsis plants expressing the beta-glucuronidase (GUS) gene driven by the corresponding ACLL promoter. (B) Real-time PCR data of time course of wound response in Arabidopsis and poplar. Y axis represents fold change relative to unwounded control.

3.2.7 In silico co-expression analysis of Arabidopsis ACLL genes

With the increasing amount of global expression data becoming available for Arabidopsis, public databases have been successfully used to identify co-expressed genes that could be participating in the same biological process and/or biochemical pathway (Ehlting et al., 2006; Persson et al., 2005). Therefore, in an effort to gain insights into the possible functions of peroxisomal ACLLs, I performed an in silico co-expression analysis using public Arabidopsis microarray datasets. Using the PRIMe Correlated Gene Search tool, I queried all microarray experiments in all datasets available from the RIKEN Institute. The top 100 most highly co-regulated genes in the dataset, showing the highest Pearson co-expression coefficient values, are listed in Appendix 1. When possible, a graphic network of interactions was generated using the Pajek program (Batagelj et al., 2003) to facilitate interpretation of results (Figure 3.6).

Out of all eight peroxisomal ACLLs analyzed, only co-expression data for ArathACLL3 and ArathACLL4, both in clade D, provided extractable information that could be associated with a biological function in plants. ArathACLL3 demonstrated extremely high coefficient values, above 0.9, with all 100 co-expressed genes. Around 25% of the genes in this list were associated with lipid metabolism, and about 10% were correlated with seed germination. ArathACLL3 was directly connected to a gibberellin oxidase in the network of co-expressed genes (Figure 3.6).
These results suggest that ArathACLL3 could have a function in lipid metabolism related to seed development or germination. However, due to the extremely high values of co-expression between all genes on this list, it is quite difficult to distinguish a biochemical pathway that could require this CoA ligase.

For ArathACLL4, the 100 most highly co-expressed genes had co-expression coefficients ranging from 0.64 to 0.86. Among these genes, I identified a network of co-regulated genes that participate in the JA pathway, including those encoding lipoxygenases (LOX), allene oxide synthase (AOS), allene oxide cyclase (AOC) and OPDA reductase (OPR3). In addition, other stress-related genes were co-expressed with ArathACLL4, such as the transcription factor WRKY40, which has been shown to be up-regulated after infection with Pseudomonas syringae or treatment with salicylic acid (Dong et al., 2003), and RLPK3, shown to be activated by oxidative stress and pathogen attack (Czernic et al., 1999). This result suggested that ArathACLL4 participates in defense response. More specifically, ArathACLL4 could be involved in the JA biosynthetic pathway. This hypothesis was independently confirmed by Koo et al. (2006), who demonstrated in vitro biochemical activity of ArathACLL4 with OPDA and OPC8 as substrates, both components of the JA pathway, and localization of this enzyme in the peroxisome.

[Figure 3.6 panel labels: ACLL3-At1g20500; ACLL4-At1g20510]

Figure 3.6: Pajek co-expression networks generated from PRIMe Correlated Gene Search tool data. (A) Co-expression network of ArathACLL3. 0.99 At5g07200 gibberellin 20-oxidase was the most highly co-expressed gene with ACLL3. Tissue and Development dataset (237 data).
(B) Co-expression network of ArathACLL4. Co-expressed JA synthesis genes (•): 0.851 At3g25780 allene oxide cyclase (AOC), 0.847 At2g06050 12-oxophytodienoate reductase (OPR3), 0.831 At1g72520 lipoxygenase, 0.830 At1g17420 lipoxygenase are among the most highly co-expressed with ACLL4. Stress treatment dataset (298 data). Pearson coefficients are highlighted in bold.

Interestingly, no gene was co-expressed with ArathACLL7 in clade C. Also, while more than 45 genes were co-expressed with ArathACLL2, ArathACLL8 and ArathACLL9 individually, at high co-expression coefficients (starting above 0.6), no network of co-expressed genes could be generated for these genes. This result could indicate that co-expressed genes in these sets are more highly co-expressed among each other than with the corresponding ACLLs, in which case the co-expression would be only circumstantial, and without biological meaning. Genes co-expressed with ArathACLL1 and ArathACLL6 generated networks with scattered connections, which did not provide any clues regarding biological function. However, it is worth noting that ArathACLL1 was most highly co-expressed (Pearson coefficient = 0.8) with the gene encoding 3-hydroxy-3-methylglutaryl-CoA reductase 2 (HMGR2), which catalyzes the synthesis of mevalonate, a rate-limiting step in the mevalonic acid pathway of isoprenoid biosynthesis (Enjuto et al., 1995).

In conclusion, co-expression analyses applied to individual Arabidopsis ACLL genes provided limited functional information, with the exception of ArathACLL4, now known to encode an OPDA/OPC8-CoA ligase as predicted based on co-expression analysis. However, the data presented here on ACLL1 and ACLL3 may be useful for generating hypotheses once more is known about these ACLLs.
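The ranking step behind a correlated-gene search of this kind can be sketched as follows. This is a minimal illustration of ranking genes by Pearson coefficient against a query gene, not the PRIMe implementation. The function names, gene labels and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def top_coexpressed(query, profiles, n=100, min_r=0.0):
    """Return up to n (gene, r) pairs with r >= min_r, highest coefficient first."""
    scored = [(gene, pearson(query, values)) for gene, values in profiles.items()]
    kept = [(gene, r) for gene, r in scored if r >= min_r]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:n]


# Hypothetical expression values across four experiments.
query = [1.0, 2.0, 3.0, 4.0]
profiles = {
    "geneA": [2.0, 4.0, 6.0, 8.0],  # rises with the query
    "geneB": [4.0, 3.0, 2.0, 1.0],  # falls as the query rises
    "geneC": [1.0, 3.0, 2.0, 4.0],  # partially follows the query
}
hits = top_coexpressed(query, profiles, n=2, min_r=0.6)
```

A cut-off such as `min_r=0.6` mirrors the coefficient ranges quoted in the text, but the appropriate threshold depends on the dataset. A high coefficient with the query gene alone does not show that the hits form a network with it, which is the caveat raised above for the gene sets that produced no network.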
3.2.8 Poplar genes activated by additional stress treatments

In this study I demonstrated that the poplar ACLL genes in clade D, PoptrACLL4 and PoptrACLL5, encode highly similar proteins most closely related to ArathACLL4, which encodes an OPDA/OPC8-CoA ligase (Koo et al., 2006). Therefore it was possible to speculate that both poplar genes, which share 90% nucleic acid identity and are up-regulated following wounding, could likewise encode OPDA/OPC8-CoA ligases.

As one test of this hypothesis, and to further test the stress responsiveness of the poplar ACLL genes analyzed in Figure 3.5, I measured the expression of poplar ACLL genes by quantitative real-time reverse transcription PCR, using RNA isolated after treatment of poplar trees with a battery of additional stressors: herbivory by the forest tent caterpillar (Malacosoma disstria), simulated herbivory (SH; wound + insect regurgitant) and exposure to MeJA, a volatile derivative of JA. These data are summarized in Figure 3.7.

The results of this analysis showed differences in the responses of poplar ACLL genes to these stresses. In clade B, PoptrACLL2 showed no stress responsiveness, whereas PoptrACLL1 expression was strongly up-regulated by SH and herbivory by 6 h after treatment onset, consistent with wound activation of this gene (Figure 3.5B). In a separate microarray expression profiling experiment, expression of ArathACLL6, the Arabidopsis homologue in clade B, was not activated by diamondback moth herbivory (J. Ehlting and J. Bohlmann, personal communication). These data suggest that ArathACLL6 and PoptrACLL2 do not function in wound or herbivory related pathways, but that PoptrACLL1 may have gained this function after duplication of the gene in Populus.

Interestingly, PoptrACLL3 (in clade C) was strongly up-regulated by SH after 2 h, but not by other treatments tested.
For the three poplar homologues in clade E that were tested, these stress treatments either had no effect or resulted in down-regulation of gene expression. PoptrACLL11 expression was the most down-regulated, with levels dropping below half of original values after 1 h and staying below the original levels even at 24 h. No consistent change in expression could be observed for PoptrACLL10 and PoptrACLL12.

The enzyme encoded by the Arabidopsis homologue in clade E, ArathACLL9, has been shown to have activity with precursors in JA biosynthesis, suggesting that it may be involved in JA biosynthesis in the peroxisome (Schneider et al., 2005). However, the lack of wound (Figure 3.5) and herbivory (J. Ehlting and J. Bohlmann, personal communication) activation of ArathACLL9, and lack of wound, SH, herbivory, and MeJA activation of the most closely related poplar ACLL genes in clade E, does not support a role for these enzymes in the stress-induced synthesis of JA.

Finally, in clade D, which contains poplar and Arabidopsis genes responsive to wound stress, PoptrACLL4 and PoptrACLL5 were remarkably up-regulated by herbivory, SH, and MeJA, with the latter treatment leading to a 10-20 fold increase in expression. In the separate microarray expression profiling experiment mentioned above, expression of the Arabidopsis homologues ArathACLL4 and ArathACLL1 was activated by diamondback moth herbivory (J. Ehlting and J. Bohlmann, personal communication). These data are consistent with a role for certain Arabidopsis and poplar ACLL clade D enzymes in biochemical pathway(s) related to defense against wounding and/or herbivory, with PoptrACLL4 and PoptrACLL5 being likely orthologs of the ArathACLL4 OPDA/OPC8-CoA ligase gene, based on their phylogenetic relationship to ArathACLL4 and strong wound, herbivory, and MeJA induced expression.

Figure 3.7: Real-time PCR data showing expression of poplar ACLLs after simulated herbivory (SH or "spit" = mechanical wound + insect regurgitant), herbivory by the forest tent caterpillar (Malacosoma disstria), and exposure to MeJA. Results are given in two replicates for each treatment (same color bars). Y axis represents fold change in gene expression compared to expression at the zero time point.

3.2.9 Sub-cellular localization of PoptrACLL5

In order to determine whether the poplar homologues of ArathACLL4 in clade D actually encode peroxisomally localized enzymes, consistent with their postulated roles as OPDA:CoA ligases, I tested the subcellular localization of one of the two highly similar homologues, PoptrACLL5, using ArathACLL4, experimentally shown to be localized to peroxisomes (Koo et al., 2006), as a positive control. I generated N-terminally tagged GFP-PoptrACLL5 and GFP-ArathACLL4 protein fusions. Constructs were expressed under the control of the 35S promoter in transgenic tobacco plants generated by tissue culture (in collaboration with K. Turner, BC Institute of Technology). Eight transgenic plants derived from independent calli were obtained from Agrobacterium-infected tobacco leaf disks, and GFP signal localization was analyzed using confocal microscopy. Figure 3.8A shows the results of this analysis. The clearest GFP signal was found in guard cells of the leaf epidermis in the transgenic lines. In the negative controls of transgenic lines not expressing GFP and wild type untransformed tobacco plants, only chlorophyll-derived autofluorescence of chloroplasts was observed, indicating that the fluorescence attributed to GFP is not naturally occurring in tobacco cells.
However, in GFP:ArathACLL4 transgenic lines, punctate peroxisome-like GFP fluorescent sub-cellular structures were observed, similar to those described by Koo et al. (2006) and consistent with the deduced peroxisomal localization of ArathACLL4 (Figure 3.8A; Koo et al., 2006). GFP-PoptrACLL5 expressing tobacco lines (Figure 3.8A) showed GFP fluorescence patterns in the guard cells that were indistinguishable from those in the GFP:ArathACLL4 positive control. At higher magnification (Figure 3.8B), the round fluorescent structures in a GFP-PoptrACLL5 expressing tobacco line clearly resembled the peroxisomes described in other studies (Koo et al., 2006; Schneider et al., 2005), and were around 1 μm in diameter, consistent with the reported size of this organelle (Nyathi and Baker, 2006).

[Figure 3.8 panel labels: Control; GFP:ArathACLL4; GFP:PoptrACLL5]

Figure 3.8: Confocal microscopy image showing sub-cellular localization of ArathACLL4 (OPDA/OPC8-CoA ligase) and the poplar homologue PoptrACLL5 in guard cells of transgenic tobacco lines. Yellow signal derives from chlorophyll autofluorescence. (A) GFP signal (green) localized to similar structures consistent with a peroxisomal localization for both Arabidopsis and poplar clade D enzymes. (B) Detailed view of GFP:PoptrACLL5 showing close proximity of peroxisomes (green) with chloroplasts (red).

3.3 Discussion

3.3.1 4CL and ACLL gene evolution

Before this study, little was known about the group of adenylate-forming enzymes most closely related to plant 4CL enzymes (i.e., ACLL enzymes). Important initial contributions to the phylogeny of ACLLs (Cukovic et al., 2001; Shockey et al., 2003), demonstrating their close relationship to true 4CLs and other enzymes in fatty acid metabolism, support my phylogenetic results.
Also, recent discoveries regarding biochemical functions of two such enzymes (Koo et al., 2006; Schneider et al., 2005) lend further support to the stress induction of gene expression and the co-expression analyses presented here.

In this part of my thesis, using genome sequence information from land plants and various microorganisms, I show that ACLL genes, like true 4CL genes, are a land plant-specific gene superfamily (Figure 3.1). Furthermore, genome sequence information from Arabidopsis, poplar, and rice shows that the five clades of ACLL genes are conserved in monocot and dicot lineages, with at least one representative in each of the three fully sequenced angiosperm genomes (Figure 3.2), demonstrating the early origin of such genes during angiosperm evolution. The two diverging branches evident in Figure 3.2, separating 4CLs and clade A ACLLs from the peroxisomal ACLLs, suggest an early common ancestor of both 4CLs and clade A ACLLs, and a common ancestor for peroxisomal ACLLs. Therefore, the phylogenetic reconstruction allowed me to infer that there were at least two ancestral ACLL genes in land plants, which gave rise to the five ACLL clades present today.

Given the ancient origins of the enzymes encoded by genes in these clades, it is quite likely that ACLL enzymes from different clades perform similar but distinct functions in current species. In addition, given that the adenylate-forming enzymes most closely related to the ACLLs that are not unique to plants are targeted to the peroxisome, it is reasonable to speculate that the ancestral enzymes of both the cytosolic and peroxisomal 4CL and ACLL clades were recruited from peroxisomal enzymes. Since the major role of peroxisomes is in lipid metabolism, and CoA ligases are widely used in β-oxidation of these molecules, it is tempting to speculate that ACLLs and 4CLs were recruited from fatty acid metabolism early in land plant evolution to perform their current functions.
It may be relatively easy for genes to lose the peroxisomal targeting signal in their encoded proteins. Evidence for this is the fact that two poplar ACLL proteins (PoptrACLL6 and PoptrACLL12), with close homologues in the peroxisomal clade E (Figure 3.2), have lost their C-terminal peroxisomal targeting sequences and are presumably localized in the cytosol. Additional genome sequence information from basal land plant species will help to more accurately infer the evolutionary history of ACLL genes.

In this context, it is interesting that two Physcomitrella ACLL sequences were identified, which could represent genes ancestral to the current suite of 4CL and ACLL genes in angiosperms. Completion of the Physcomitrella genome sequence should reveal whether additional Physcomitrella ACLLs exist. It is possible that analysis of loss-of-function mutations in Physcomitrella genes will provide hints regarding the biochemical functions of ACLLs in the other plant species, in addition to indicating the possible biochemical function of this apparently ancestral protein. Such work could shed light both on the origin of the ACLL family of proteins and on how enzymes are recruited to novel biochemical pathways.

3.3.2 ACLL gene family structure and expression patterns

Despite the larger poplar genome, in which many genes are duplicated relative to Arabidopsis as a result of the salicoid whole genome duplication event (Tuskan et al., 2006), my results demonstrate that there is no relationship between species and the number of genes in a given clade. Overall, my results indicate that certain ACLL gene families, as defined by clades A-E, have undergone differential expansion in individual species over the course of lineage-specific genome evolution.
Presumably, duplicated genes were retained due to selective pressures for elaboration of biochemical pathways requiring ACLL activity, which may vary according to the diverse life histories of Arabidopsis, poplar, and rice. One example of species-specific gene family diversification is the 4CL gene family itself in Arabidopsis and poplar. While both poplar and Arabidopsis have only one class I 4CL, involved in flavonoid and soluble phenolic biosynthesis, class II genes evolved in a lineage-specific manner. The three Arabidopsis class II 4CLs seem to have originated by a combination of segmental duplication involving the chromosomal region where Arath4CL4 resides, followed by tandem duplications giving rise to Arath4CL1 and Arath4CL2 (Hamberger and Hahlbrock, 2004). In poplar, the four class II 4CL loci are unlinked and scattered over four different chromosomes, indicating a different mechanism of gene duplication (Hamberger et al., submitted). In Arabidopsis, the three class II genes have different expression patterns and even encode enzymes with specialized properties, as mentioned in Chapter 1. As with the 4CLs, specific CoA ligation requirements could be driving ACLL gene family diversification in a lineage-specific manner.

Interestingly, the conservation of sequence identity between Arabidopsis and poplar ACLL genes within individual clades was not uniform, suggesting more rapid evolution of genes in certain clades. My analysis revealed that for clades A, B and D, Arabidopsis and poplar share ACLL homologues that have been strongly conserved since divergence of the poplar and Arabidopsis lineages. This indicates that these enzymes may perform key roles in plant metabolism, conserved in these two dicot lineages. For example, in clade A poplar and Arabidopsis ACLL amino acid sequence identity (PoptrACLL13 and ArathACLL5) is strongly conserved.
Amino acid sequence conservation, especially in the putative ACLL substrate binding domains inferred from adenylate-forming enzyme protein structure (Stuible and Kombrink, 2001), is a good indication of possible conservation of substrate utilization. Thus, the highly conserved homologous genes belonging to clades such as clade A have a high likelihood of being orthologues, i.e., having the same biological/biochemical function in different organisms. On the other hand, the relatively divergent poplar and Arabidopsis sequences and the expansion of, for example, clade E sequences in poplar suggest species-specific retention of duplicated genes and their neofunctionalization. This could reflect evolution of poplar-specific metabolic pathways involving these ACLL enzymes. In clade C, the single copy poplar and Arabidopsis proteins appear to have less constraint on sequence divergence, perhaps due to the nature of the substrates used, or partial redundancy of clade C enzyme function with other ACLLs, allowing potential acquisition of new species-specific functions.

Another indicator of functional conservation across species is the conservation of gene expression patterns. Genes derived from a common ancestor that perform the same function in related organisms might retain similar expression patterns, especially if gene duplication, which can lead to subfunctionalization of gene expression, has not occurred. A striking example of conservation of gene expression patterns was observed for clade A ACLL genes. While the expression pattern of the rice representative is unknown, both the Arabidopsis and poplar ACLL genes in this clade have flower-preferred expression patterns, according to our gene expression data. This is supported by Arabidopsis microarray data from Douglas and Ehlting (2005), and microarray expression data mined from public databases (data not shown), which indicate that ArathACLL5 has a strongly anther-preferred expression pattern.
Similarly, PoptrACLL13 is preferentially expressed in male flowers, with no detectable expression in female flowers. The relatively high level of sequence conservation of these genes and their shared expression patterns suggest functional conservation of clade A enzymes in Arabidopsis and poplar, and that they play metabolic roles associated with anther development.

Clade B is interesting since there are two ACLL copies in poplar (PoptrACLL1/2), encoding enzymes with the same identity values when compared to the Arabidopsis protein homologue (ArathACLL6). My results showed that one of the poplar homologues (PoptrACLL2) had an expression pattern most similar to ArathACLL6 (expression throughout all organs and tissue types, but most highly expressed in mature leaves; Figure 3.4). Combined with their high level of amino acid sequence conservation, this conservation of developmental expression suggests that these poplar and Arabidopsis genes could be functional orthologues. However, the two poplar homologues in this clade appear to have undergone subfunctionalization and neofunctionalization, suggesting an expanded function of poplar enzymes in this clade. Expression analysis showed that PoptrACLL1 and PoptrACLL2 have highly complementary patterns of expression in all organ and tissue types analyzed (i.e., in organs and tissues where PoptrACLL1 is low, PoptrACLL2 is high, and vice versa; Figure 3.4). This appears to be a classical example of subfunctionalization, where duplicated genes acquire specialized expression patterns, which when combined are equal to the expression pattern before duplication (Duarte et al., 2006). Analysis of stress-induced expression of clade B genes also suggests neofunctionalization of one member of the duplicated poplar gene pair. While neither ArathACLL6 nor PoptrACLL2 is stress inducible (Figures 3.5 and 3.7; J. Ehlting and J. Bohlmann, personal communication), PoptrACLL1 is clearly induced by wounding, simulated herbivory and herbivory (Figures 3.5 and 3.7). This suggests that, whatever its biochemical function, the enzyme encoded by the duplicated PoptrACLL1 gene has been recruited to serve in a defense-related metabolic pathway, in addition to its developmental function. This new function is apparently novel in the poplar lineage.

In clade C there is a single copy gene for Arabidopsis and poplar. Interestingly, although identity values for these two genes are the lowest among the ACLL clades (58%/65%), expression patterns seem to be conserved, with highest expression in leaves, stem and flowers. One hypothesis would be that, despite their similar expression patterns, there has been functional divergence of the poplar and Arabidopsis (and, possibly, rice) ACLL enzymes in this clade, possibly in the type of substrate accepted or preferred. Alternatively, enzymes in this clade may still perform conserved functions despite the low identity. Further evidence for functional divergence of the poplar gene comes from its strong induction by simulated herbivory, although expression of the Arabidopsis gene has not been tested under similar conditions.

Clade D is interesting in that four of the five Arabidopsis ACLL genes, with the exception of ArathACLL8, originated via tandem duplications on chromosome 1 (Figure 3.3). One member of this group, ArathACLL4, has been shown to encode an OPDA:CoA ligase, required for JA biosynthesis in the peroxisome (Koo et al., 2006). This function is consistent with the wound inducible expression of the ArathACLL4::GUS fusion gene (Figure 3.5A), and its expression pattern assessed in public microarray databases (data not shown). However, it is striking that these duplicated genes share largely non-overlapping developmental expression patterns (Figure 3.4), as well as variable responses to wounding, herbivory, and MeJA treatment (Figure 3.5).
Thus, diversification of function among the tandemly duplicated Arabidopsis genes in this clade could have occurred, and may be related to their differential expression patterns. Since not all Arabidopsis ACLL genes in clade D were inducible by wounding, it may be that they perform functions other than in JA biosynthesis, or participate in developmentally regulated JA biosynthesis. This is supported by the fact that identity levels among Arabidopsis ACLLs in this clade are above 60%. As mentioned previously, my findings from co-expression data are not consistent with a role for ACLL3 in JA biosynthesis, even though this gene is strongly wound-induced (Figure 3.5). However, many of the genes highly co-expressed with ACLL3 are related to seed development, a developmental process also known to be regulated by JA. Given the known biochemical function of ArathACLL4 as an OPDA/OPC8-CoA ligase, ACLLs in this clade are good candidates for participating in the octadecanoid pathway. Additional experiments, such as biochemical characterization of heterologously expressed enzymes, will be important to address this question.

My data indicate that the apparent subfunctionalization and possible neofunctionalization of duplicated clade D Arabidopsis ACLL genes has not occurred in poplar and rice, in which clade D genes have not expanded in number. Poplar contains two highly similar genes, PoptrACLL4 and PoptrACLL5, located on different linkage groups, and rice contains a single gene (Figure 3.2).
The close phylogenetic relationship between the ArathACLL4 OPDA:CoA ligase gene and PoptrACLL4 and PoptrACLL5 (Figure 3.2), their high amino acid similarity (72%/75% amino acid identity), and the wound-, herbivory- and MeJA-inducible expression of the poplar genes (Figures 3.5 and 3.7) strongly suggest that the poplar ACLL enzymes encoded by PoptrACLL4 and PoptrACLL5 are peroxisomally localized OPDA:CoA ligases involved in the acyl chain shortening step required for JA biosynthesis, performing the same function as ArathACLL4. Given the lack of diversification of poplar and rice genes in this clade relative to gene family expansion in Arabidopsis, it is possible that diverse octadecanoid signaling pathways are more prevalent in Arabidopsis than in these two species.

Clade E represents a contrasting case to clade D, in that the numbers of poplar and rice genes have expanded to 7 and 5, respectively, relative to a single Arabidopsis gene, and two of the poplar genes have undergone apparent neofunctionalization by loss of peroxisomal targeting sequences (Figure 3.2). The amino acid identity values between enzymes in this clade are quite low (Table 3.2), and there was no discernible similarity in developmental expression patterns between the Arabidopsis ACLL (ArathACLL9) and its three most closely related poplar homologues (Figure 3.4). Analysis of ArathACLL9 enzyme activity showed in vitro activity with octadecanoid precursors in the JA biosynthetic pathway, leading to the suggestion that it may be involved in JA biosynthesis (Schneider et al., 2005). However, its lack of wound- or herbivory-inducible expression (Figures 3.5 and 3.7), coupled with the lack of co-expressed genes associated with JA biosynthesis (this study; Koo et al., 2006), suggests that it could be involved in some other aspect of fatty acid metabolism (Koo et al., 2006).
Taken together, the data suggest that clade E contains genes encoding enzymes that are not strongly conserved between species, and may perform specialized functions specific to certain lineages.

3.3.3 Summary and combining data to make functional inferences

The first clue available for inferring any kind of function for ACLLs came from the conservation of sequence motifs that define the adenylate-forming enzyme superfamily, in addition to the sequence similarity with true 4CLs (Cukovic et al., 2001; Shockey et al., 2003), suggesting that ACLLs are CoA ligases. My phylogenetic analyses showed that ACLLs are a plant-specific group, and thus are unlikely to perform ubiquitous functions in general metabolism. Additional clues to function came from the comparative genomic analyses presented here, focusing on poplar and Arabidopsis but also including ACLLs from other plant species (discussed in Chapter 4 for clade A). The presence of the peroxisomal target sequence PTS1, indicating a putative localization in that organelle and therefore a function in plant-specific processes of that organelle, provided another significant functional clue. Combining promoter activity assays with analysis of changes in gene expression in response to stress treatments revealed a set of ACLLs, mostly in clade D, which have apparent functions in biochemical responses to wounding and insect herbivory. Using a gene expression data mining approach, I identified the JA biosynthetic pathway as one in which the ArathACLL4 enzyme likely participates as an OPDA:CoA ligase. This result was independently confirmed by conclusive biochemical and genetic data from Koo et al. (2006).

Other stress-responsive ACLL genes in clade D did not show co-expression with genes encoding enzymes of the JA biosynthetic pathway in the public microarray data.
However, co-expression analysis may be less robust in identifying genes in common pathways or processes when the genes have broad developmental expression patterns and are not highly inducible. Therefore, it could be that there is functional redundancy among the Arabidopsis ACLL genes in this group, as has been suggested by Koo et al. (2006). They found that JA accumulation is only partially compromised in an ArathACLL4 loss-of-function mutant, suggesting that other related genes encode enzymes with the same function. The role of other clade D ACLLs, if they are OPDA:CoA ligases, could be primarily in constitutive, developmental JA biosynthesis, but this activity could provide sufficient flux through the pathway upon stress activation of JA biosynthesis that stress-activated JA accumulation still occurs. It has been shown that OPDA is constitutively present in untreated wild-type leaves (Stenzel et al., 2003), so it is possible that esterified OPDA is stored in plants for rapid response to wounding and herbivore attack. It is important to note that ArathACLL4 is constitutively expressed in flowers, and that other ACLLs in clade D have highly complementary constitutive expression levels in all tissues analyzed. JA is also a signal molecule for various developmental processes, including root growth (Staswick et al., 1992), flower and seed development (Feys et al., 1994; Li et al., 2004) and apical meristem development (Cenzano et al., 2003). So, if one or more ACLLs in clade D also encode enzymes in the JA biosynthetic pathway in the tissues where they are expressed, it is possible that they play a role in developmental metabolism regulated by JA.
With this in mind, I checked public expression data for clade D ACLL expression after methyl jasmonate (MeJA) treatment, using the eFP browser. In the available experiments, done with 7-day-old seedlings treated with 10 µM MeJA and harvested between 30 minutes and 3 hours after exposure, ACLL1 was not inducible by MeJA or any other hormone tested. However, ACLL2, ACLL8 and, as expected, ACLL4 showed up-regulation, consistent with a role for additional clade D ACLLs in JA-related pathways. ACLL3 does not map to a probe set on the Affymetrix ATH1 GeneChip dataset used by the eFP browser, so no information could be obtained for this gene using this tool.

The comparative genomics and expression profiling approaches allowed me to predict with some confidence that the two similar poplar clade D genes, PoptrACLL4 and PoptrACLL5, encode enzymes that carry out the same function as ArathACLL4, that of an OPDA:CoA ligase. The poplar homologues in clade D were highly up-regulated by a variety of stress treatments, in particular after induction with MeJA, a signaling molecule derived from JA. It has been shown that JA regulates its own synthesis via a positive feedback loop, which is dependent on JA synthesis and JA signaling (Bonaventure et al., 2007; Jensen et al., 2002). Therefore, up-regulation upon contact with MeJA is a predicted response for genes involved in the JA pathway, in addition to genes involved in downstream events. Other evidence supporting the biochemical functions of PoptrACLL4 and PoptrACLL5 as OPDA:CoA ligases includes their high amino acid similarity to ArathACLL4 (Table 3.2) and the experimentally demonstrated sub-cellular localization of PoptrACLL5 in the peroxisome (Figure 3.8). Biochemical characterization of recombinant poplar enzymes and/or the generation and phenotypic characterization of poplar plants with RNAi-mediated loss of PoptrACLL4/PoptrACLL5 function would be necessary to test this hypothesis.
Unlike Arabidopsis, in which there appear to be partially redundant genes encoding OPDA:CoA ligase in clade D in addition to ArathACLL4, no extensive redundancy appears to be present in poplar. Thus, simultaneous knockdown of PoptrACLL4 and PoptrACLL5 expression would be predicted to severely impact JA biosynthesis and accumulation, allowing definitive tests of the role of this signaling molecule in defense and development.

CHAPTER 4 - THE ARABIDOPSIS THALIANA FLOWER-SPECIFIC ACYL-COENZYME A LIGASE GENE ACLL5 IS CONSERVED IN ANGIOSPERMS AND IS REQUIRED FOR MALE FERTILITY

4.1 Introduction

Anther development, culminating in the formation of mature male gametophytes (microspores, or pollen grains), is a complex process that is central to angiosperm life history (Ma, 2005). Anther development and microsporogenesis have been subjects of intense study and are well documented and characterized in many plants, including models such as tobacco and Arabidopsis thaliana (Goldberg et al., 1993; Ma, 2005; Sanders et al., 1999; Scott et al., 2004).

Stages of anther development and microsporogenesis are precisely timed and tightly controlled, and are characterized by specific events ranging from initial cell differentiation from the floral meristem to pollen formation, maturation and release during anther dehiscence (Goldberg et al., 1993; Ma, 2005; Sanders et al., 1999; Scott et al., 2004). In tobacco and Arabidopsis, anther development has been divided into stages based on anatomical, morphological, cellular and molecular events (Goldberg et al., 1993; Ma, 2005; Sanders et al., 1999). Molecular genetic studies, particularly in Arabidopsis, have shed light on many events in anther development and microsporogenesis (Ma, 2005). However, many biochemical and cellular processes specific to anther development, and their regulation, are still unknown.
One event of fundamental importance during pollen maturation is the deposition of the pollen wall, necessary for pollen protection, dispersal and pollen-stigma recognition. The pollen wall consists essentially of two layers: the intine and the exine. The intine is mostly synthesized by the haploid microspore itself. However, the tapetum, a maternal cell layer that surrounds the inner side of the anther locules, is responsible for the production and secretion of the exine, generally known as a mixture of proteins, lipids and aromatic molecules that comprises the outermost layer of the pollen wall (Ma, 2005; Scott et al., 2004). After synthesis and deposition of the pollen wall, the tapetum cells are degraded via programmed cell death (PCD), and pollen grains continue to develop and mature.

Although the exact composition of the exine and other components of the pollen wall, as well as the regulation of its synthesis and deposition, are not completely understood, it is known that functional tapetum cells are essential for the development of viable pollen grains, presumably due to their crucial role in biosynthesis and secretion of the exine (Vizcay-Barrena and Wilson, 2006; Zhang et al., 2006). The precise chemical composition of the exine has long been debated. The major component of the exine is termed sporopollenin, a complex biopolymer characterized by its extreme stability and resistance to degradation. As a result, there are a limited number of techniques available for exine chemical analysis (Blokker et al., 2006), but the major components of sporopollenin are long-chain fatty acids and poorly characterized phenolic molecules coupled by ether linkages (Scott et al., 2004). Genetic approaches that identify enzymes and biochemical pathways required for exine production promise to aid in the elucidation of its composition and functions (Ma, 2005).
Several male sterile mutants that have been isolated and characterized in Arabidopsis (Ma, 2005; Sanders et al., 1999; Taylor et al., 1998) shed some light on the cell biology and biochemistry of pollen maturation. Obvious examples include mutants defective in meiosis that result in abnormal pollen grains, such as pollenless3 (Sanders et al., 1999). Male sterile mutants have also identified genes required for normal tapetum development, demonstrating the intimate relationship between tapetum function and male fertility. The mutant dysfunctional tapetum1 (dyt1) is an example of a post-meiotic male sterile mutant. The DYT1 gene has been cloned, and exhibits strong tapetum-preferred expression. DYT1 encodes a bHLH transcription factor believed to be required for the proper expression of tapetum genes (Zhang et al., 2006), since loss of DYT1 function results in reduced expression of tapetum-preferential genes and abnormal pollen development. The male sterile1 (ms1) mutant, which is defective in tapetum programmed cell death and does not produce pollen grains in an otherwise normal-appearing mature anther, provides a clear example of the requirement for a functional tapetum for pollen grain development (Vizcay-Barrena and Wilson, 2006). The male sterile2 (ms2) mutant, defective in sporopollenin deposition, develops microspores that collapse shortly after release from tetrads, showing no signs of pollen wall formation. The MS2 protein accumulates specifically in the tapetum and is suggested to be a long-chain fatty acid reductase, possibly involved in biosynthesis of a long-chain aliphatic sporopollenin polymer (Aarts et al., 1997). In the defective in exine formation (dex1) mutant, as in ms2, abnormal microspores develop after their release from tetrads. Although sporopollenin is produced, it does not anchor to the microspores, which eventually collapse. DEX1 is a novel protein of unknown function but may function at the plasmalemma.
Accumulation of the DEX1 protein is not restricted to floral buds; it is found in many other organs of the plant (Paxson-Sowders et al., 2001). Other genes required for exine production and male fertility defined by mutations include YRE/WAX2/FLP1, encoding an enzyme that may be involved in wax biosynthesis (Ariizumi et al., 2003; Chen et al., 2003; Kurata et al., 2003), and NEF1, which encodes a possible transporter protein (Ariizumi et al., 2004).

While these forward genetic analyses have identified certain biochemical and regulatory events involved in anther and pollen development in Arabidopsis, they provide a far from complete picture of these events. In particular, the biochemical and cellular events involved in tapetum function and exine formation, as well as the biochemistry and functions of the exine and sporopollenin, remain poorly defined, despite their importance for microspores and male fertility.

Novel classes of Arabidopsis genes encoding enzymes related to, yet functionally distinct from, phenylpropanoid genes have been defined (phenylpropanoid-like genes) (Costa et al., 2003; Ehlting et al., 2005; Raes et al., 2003), many of which are conserved in the fully sequenced genomes of poplar and rice (Hamberger et al., submitted; Tuskan et al., 2006). An example of such a gene superfamily is the group of genes encoding enzymes related to the phenylpropanoid enzyme 4CL (Costa et al., 2003; Cukovic et al., 2001; Ehlting et al., 2005; Shockey et al., 2003). I have characterized this family of genes, which I now designate as Acyl-CoA ligase-like (ACLL) genes, in the genomes of Arabidopsis, poplar, rice, and other plants (de Azevedo Souza et al., in preparation; see Chapter 3). These studies show that ACLL genes, together with 4CL genes, are a land-plant-specific clade of adenylate-forming enzymes. The ACLL clade that contains the Arabidopsis ACLL5 gene (At1g62940) is more closely related to true 4CL genes than other ACLLs are.
Additional analysis using sequence information from the poplar and rice genomes revealed that ACLL5 is a highly conserved single-copy gene in Arabidopsis, poplar and rice, suggesting that these genes originated from a common ancestor present before the divergence of the monocot and eudicot lineages (de Azevedo Souza et al., in preparation; see Chapter 3). Furthermore, ACLL5 and its poplar orthologue are expressed in a strongly flower-preferred manner (de Azevedo Souza et al., in preparation; see Chapter 3; Douglas and Ehlting, 2005), suggesting a function in flowers.

Here I describe acll5, a novel male-sterile mutant of the ACLL5 gene. Characterization of this mutant suggests that ACLL5 has a tapetum-specific function, and encodes an enzyme that may be involved in the production of one or more aromatic constituents of the exine required for post-meiotic pollen development and male fertility.

4.2 Results

4.2.1 Identification of an ACLL5 insertion mutant

The flower-preferred expression pattern of ACLL5, including evidence that it is expressed in the male organ, as well as the conservation of single-copy ACLL5 homologues in the fully sequenced rice and poplar genomes (Chapter 3), suggest that this gene may play an important role in male reproductive organ development in angiosperms. In order to determine the biological function of ACLL5, I identified a potential acll5 transposon insertion loss-of-function mutant. Seeds for the line were obtained from NASC (stock code N123936; standard name SM_3.37225). A segregating population derived from the original insertion line was genotyped by PCR as described in Materials and Methods. Within this population, acll5 homozygous plants were identified. The ACLL5 insertion segregated as a single Mendelian locus, which I designated acll5-1, to my knowledge the first described ACLL5 mutant allele.
The location of the transposon insertion, in the second exon of ACLL5, was verified by sequencing the PCR amplification product generated from line SM_3.37225 genomic DNA using a gene-specific primer and a primer specific to the transposon tag (Figure 4.1A).

I tested ACLL5 expression in acll5 mutant plants using semi-quantitative and quantitative RT-PCR, with template cDNA derived from both wild-type and mutant flowers. Figure 4.1B shows that no ACLL5 mRNA could be detected in the mutant by semi-quantitative RT-PCR, and quantitative RT-PCR also failed to detect any mRNA (data not shown), indicating that acll5-1 is a null allele of ACLL5.

[Figure 4.1 panel labels. (A) Site of insertion, within sequence positions 480-528: GCTTCCGGTGCTAGAGGAATCATCACTGATGCTACTAACTATGAAAAG. (B) Flower cDNA: wt, acll5-1.]

Figure 4.1: (A) Schematic representation of the ACLL5 (At1g62940) gene, showing the location of the transposon insertion in line SM_3.37225 (acll5-1). (B) ACLL5 expression in wild-type (wt) and acll5-1 homozygote lines. RT-PCR analysis (30 cycles) was carried out on duplicate samples from cDNA prepared from wt or acll5-1 flowers, using ACLL5- and actin-specific primers.

4.2.2 acll5 mutation is correlated with male sterility and absence of pollen grains

Initial phenotypic analysis of the homozygous acll5 plants suggested that the plants were self-sterile, since siliques failed to mature and produce seeds. Careful examination of the mutant flowers revealed anthers that had a darker appearance than wild-type anthers and were devoid of pollen grains (Figure 4.2). At a time when wild-type flowers had self-pollinated and produced siliques with developing seeds, development of acll5 mutant flowers culminated in undeveloped siliques and absence of seed production (Figure 4.2). There were no other obvious morphological differences between acll5 mutant and wild-type plants when grown to maturity (Figure 4.2).
[Figure 4.2 panel labels: WT and acll5-1 in each of panels A, B and C.]

Figure 4.2: Phenotypic characterization of acll5-1 homozygous plants. (A) Flowers from wild-type (WT) and acll5-1 plants. Arrows indicate mature anthers, which are dehiscing and releasing mature pollen grains onto the stigmatic surface in WT, but which appear dark and without obvious pollen in acll5-1. (B) Mature siliques from WT and acll5-1 plants. No seed formation could be observed in the mutant. (C) Mature WT and acll5-1 plants.

To verify whether the failure of self-pollination in acll5 mutant plants was due to a defect in anther development and/or pollen production, I used SEM to view these processes at higher resolution in several wild-type and mutant flowers. Figure 4.3 shows that, although acll5 mutant anthers underwent dehiscence, no pollen grains could be observed in them, whereas abundant pollen was observed in dehiscing wild-type anthers examined in parallel.

Figure 4.3: SEM of wild-type (WT) and homozygous acll5-1 anthers. (A) Anther dehiscence was observed to occur normally in the mutant; however, no pollen grains could be found. (B) There was no carpel elongation for silique development in acll5-1 plants. Scale bar = 100 µm.

Furthermore, wild-type carpels were clearly elongated whereas carpel elongation was defective in the mutant. These data imply that loss of ACLL5 function in the acll5 mutant may lead to a defect in pollen production and development, and consequent male sterility and absence of self-pollination. Unfertilized ovules were present in the mutant siliques, suggesting that the mutant was female fertile (Figure 4.2B). To test this hypothesis I used acll5 mutant plants as pollen recipients in crosses using pollen from wild-type plants. Siliques developed normally from such crosses, which produced F1 seeds that were able to germinate normally, with no apparent loss in fecundity relative to self-pollinated wild-type plants.
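The F1 plants obtained from these crosses give rise to the F2 populations whose segregation is tested in Section 4.2.3 against Mendelian ratios using chi-square goodness-of-fit tests. The following is a minimal sketch of such a test in Python, not the procedure actually used in this work. The function names are hypothetical, the counts in `hypothetical_f2` are invented for illustration and are not thesis data, and only the two chi-square statistics (0.437 and 2.08) are taken from the text. The p-values use the closed-form upper tails of the chi-square distribution for 1 and 2 degrees of freedom.

```python
import math


def chi_square_statistic(observed, ratio):
    """Goodness-of-fit statistic for observed class counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p_value(statistic, df):
    """Upper-tail p-value; this sketch only handles 1 or 2 degrees of freedom."""
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        return math.exp(-statistic / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# Statistics as reported in Section 4.2.3.
p_phenotype = chi_square_p_value(0.437, df=1)  # 3:1 fertile : male sterile
p_genotype = chi_square_p_value(2.08, df=2)    # 1:2:1 genotype classes

# Hypothetical counts (fertile, male sterile), for illustration only.
hypothetical_f2 = [140, 44]
hypothetical_stat = chi_square_statistic(hypothetical_f2, [3, 1])

print(f"phenotype p = {p_phenotype:.3f}, genotype p = {p_genotype:.3f}")
print(f"hypothetical 3:1 statistic = {hypothetical_stat:.3f}")
```

Under these assumptions, both reported statistics give p-values above the thresholds quoted in the text (p > 0.4 and p > 0.1), so neither segregation ratio departs significantly from Mendelian expectation.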
4.2.3 Genetic characterization of the acll5 mutant

In addition to showing female fertility of the acll5 mutant, the crosses mentioned above were used to investigate the genetic basis for the observed male sterility phenotype. F1 heterozygote plants were allowed to self-pollinate and the resulting F2 population was then analyzed for co-segregation of the male sterile phenotype with acll5-1. The results showed that the mutant phenotype was inherited in a Mendelian fashion in all 184 plants analyzed, with one quarter of the F2 progeny displaying male sterility (χ² = 0.437; p > 0.4; n = 184). This demonstrates that the mutant phenotype is caused by a single locus. Next, in order to establish whether the single-locus male sterile phenotype is caused by the acll5-1 mutant allele, I determined the genotype of 183 F2 plants for which there were phenotypic data. ACLL5 alleles segregated in a Mendelian ratio of 1:2:1 (χ² = 2.08; p > 0.1), and inheritance of the male sterile phenotype co-segregated with acll5-1 (51/183 acll5-1 homozygotes male sterile, 0/183 acll5-1 heterozygotes and ACLL5 homozygotes male sterile).

4.2.4 Phenotypic analysis of anther development in the acll5-1 mutant

Altogether, my data suggested that the transposon insertion in ACLL5 generated a loss-of-function allele, resulting in male sterility. Therefore, in order to investigate further at which point anther development and pollen production were impaired in the mutant, I analyzed in detail the development of mutant and wild-type anthers. Anther sections taken from flowers at different developmental stages were stained with toluidine blue, visualized by bright-field microscopy, and the anthers categorized into developmental stages (Sanders et al., 1999). Representative sections from wild-type and acll5 mutant flowers are shown in Figure 4.4.

Early stages of anther development in acll5 appeared normal.
At stage 5, in both mutant and wild-type anthers, four defined locules were established and visible pollen mother cells had appeared. Subsequently, the pollen mother cells undergo meiosis and tetrads are formed, connected by a callose wall (Sanders et al., 1999), which is degraded by stage 8, releasing microspores. Figure 4.4 shows that, in both wild-type and mutant anthers, individual microspores could be seen at stage 8, indicating that the callose wall had degenerated, releasing the microspores from the tetrads in a normal manner in the mutant.

At stage 9, as described (Sanders et al., 1999), the wild-type microspores became vacuolated, and the exine wall started to become visible, as evidenced by toluidine blue staining of the walls of developing pollen grains. However, normal development seemed to be arrested in the acll5 mutant at this stage. Although the vacuoles were evident in some mutant microspores, they appeared malformed and distorted in shape, and exine walls were not clearly evident as in wild type (Figure 4.4). In the place of clearly developing microspores seen in the locules of wild-type anthers, stage 9 mutant locules were filled with these misshapen structures.

At stage 10, wild-type microspores continued to enlarge and develop, and the tapetum layer showed initial signs of degeneration. In contrast, in the mutant anthers at an age equivalent to stage 10, I observed degradation of both microspores and the tapetum wall (Figure 4.4). Thus, in the mutant few if any microspores were observed at this stage, and the tapetum layer degenerated earlier than expected in normal development. At stage 11, the tapetum cell layer was greatly degraded in wild-type anthers, and clear, darkly staining pollen grains were seen. In contrast, although fully mature anthers appeared normal in the mutant, the locules were empty, devoid of any pollen grains (Figure 4.4).
These data allowed me to pinpoint the developmental stage at which the acll5 male sterile phenotype became apparent.

Figure 4.4: Anther cross-sections (1 µm) of wild-type and homozygous acll5-1 mutant. Developmental stages are according to Sanders et al. (1999). Anther development appears normal in the mutant up until stage 8. In stage 9 initial degradation of microspores is apparent in the mutant. The tapetum cell layer has a normal appearance when compared to wild type. Microspore degradation is complete in acll5-1 and results in a normal-looking anther devoid of any pollen grains.

4.2.5 In situ hybridization analysis of ACLL5 expression in developing anthers

These results demonstrated that the mutation does not impair the early stages of microspore production in the developing anthers. Instead, microspores are present but fail to mature into fully developed pollen grains and degenerate well before maturation, together with premature degradation of the tapetum cell layer. The mutant phenotype is first visible at stage 9 of anther development. To gain insights into the spatio-temporal pattern of ACLL5 expression in the anthers, and to determine if ACLL5 expression could be correlated to the timing and location of the acll5 phenotype observed, in situ hybridization experiments were performed using an ACLL5-derived probe hybridized to developing wild-type flowers. The experimental procedure was carried out by our collaborator Sarah McKim, University of British Columbia, while I interpreted the data.

Figure 4.5 shows that ACLL5 was strongly and specifically expressed in the tapetum cell layer of developing anthers. Strong expression was first evident at stage 7, right before the separation of the microspores from the tetrads. Gene expression was dramatically reduced in stage 8 anthers, observed in the same flower (Figure 4.5A), and it decreased to background levels in later developmental stages, observed in different flowers (Figure 4.5B).
These results demonstrate that ACLL5 has a transient and tapetum-preferred expression pattern and is most highly expressed in the stages immediately preceding the appearance of the visible phenotype. One interesting observation is the presence of anthers of slightly different developmental stages in the same flower. The development of the anther from the outer short stamen occurs immediately after that of the anther from the inner long stamen. This timing is likely an outcome of initiation from the flower primordia occurring at consecutive time points for the two types of stamen (Kunst et al., 1989), and explains the presence of stage 7 and stage 8 anthers in the same flower.

Figure 4.5: In situ hybridization analysis in developing wild-type flowers showing ACLL5 expression specific to the tapetum cell layer. (A) Cross-section through an immature flower with anthers at slightly different stages of development, hybridized to an anti-sense ACLL5 probe. Tapetum-specific ACLL5 expression was highest in stage 7 anthers (*) just before tetrad separation, and was reduced in stage 8 anthers (**) when individual microspores can be observed. (B) Cross-section through an older flower, hybridized with the same probe. Little ACLL5 expression was observed in stage 10 anthers (***). a, anther; g, gynoecium; p, petal. (C) Longitudinal section of inflorescence showing developing flowers, and highest signal in stage 7 anthers.

4.2.6 Co-expression analysis of ACLL5 in Arabidopsis

Co-expression analysis using global expression data available in public databases has been successfully used to identify genes participating in the same biological process and/or biochemical pathway (Ehlting et al., 2006; Persson et al., 2005).
If ACLL5 is transcriptionally regulated together with other genes encoding enzymes in the same hypothetical biochemical pathway in tapetum cells prior to microspore release, one should expect to identify this group of genes by their co-expression with ACLL5 in the various datasets. Therefore, to gain insights into biochemical pathways in which ACLL5 could participate, I performed an in silico co-expression analysis using public Arabidopsis microarray datasets, as described in Materials and Methods.

Using the Correlated Gene Search tool, I queried 237 microarray experiments in the Tissue and Development dataset, using a cutoff Pearson co-expression coefficient of 0.80. This analysis identified 56 co-expressed genes, most of unknown function (complete list in Appendix I). Not surprisingly, expression of all genes in this group was very specific to floral tissues, as could be seen from their individual expression patterns available at the eFP browser. Among those genes most highly co-expressed with ACLL5, I identified genes annotated as related to lipid metabolism and others with similarity to genes involved in monolignol and flavonoid metabolism, which could be expected to encode enzymes in a pathway involving ACLL5 and a CoA ester intermediate in the biosynthesis of fatty acid or phenolic constituents of the exine (Figure 4.6). Interestingly, MS2, involved in fatty acid metabolism and with possible fatty acyl-coenzyme A reductase activity (Aarts et al., 1997), is the only gene of known biological function, among those of fatty-acid or phenylpropanoid-like annotation, that was co-regulated with ACLL5 in our search.
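The co-expression search described above ranks genes by the Pearson correlation between their expression profiles and that of the query gene across all experiments, and retains those at or above the cutoff (0.80 here). The following is a minimal sketch of that idea in Python, not the Correlated Gene Search tool itself. The function names, gene labels and expression values are all hypothetical and chosen only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles.

    Assumes neither profile is constant (a constant profile has zero variance).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def coexpressed_genes(query, profiles, cutoff=0.80):
    """Return (gene, r) pairs with r >= cutoff, sorted from highest to lowest r."""
    hits = [(gene, pearson(query, values)) for gene, values in profiles.items()]
    hits = [(gene, r) for gene, r in hits if r >= cutoff]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical expression values across five samples (illustration only).
query_profile = [1.0, 2.0, 8.0, 3.0, 1.0]
profiles = {
    "gene_a": [2.0, 4.0, 16.0, 6.0, 2.0],  # same shape as the query
    "gene_b": [8.0, 7.0, 1.0, 6.0, 8.0],   # opposite trend
    "gene_c": [1.0, 1.0, 1.0, 2.0, 9.0],   # peaks in a different sample
}

hits = coexpressed_genes(query_profile, profiles)
for gene, r in hits:
    print(f"{gene}\t{r:.3f}")
```

In this toy dataset only `gene_a` passes the cutoff. In a real analysis each profile would span all experiments in the dataset (237 in the search reported here), and a high coefficient indicates shared regulation rather than shared function, which is why the annotations of the co-expressed genes are examined next.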
[Figure 4.6, panel A: co-expression coefficient, locus and annotation of selected genes, grouped in the original figure under "Lipid metabolism" and "Phenylpropanoid-like metabolism".]

0.986  At4g34850  Chalcone synthase family
0.966  At3g11980  Male sterility protein MS2
0.958  At5g07230  Lipid transfer protein
0.944  At5g62080  Lipid transfer protein
0.942  At3g07450  Lipid transfer protein
0.939  At3g52130  Lipid transfer protein
0.937  At1g02050  Chalcone synthase family
0.910  At4g35420  Dihydroflavonol 4-reductase
0.905  At4g28395  Lipid transfer protein
0.905  At5g52160  Lipid transfer protein
0.883  At3g52160  beta-ketoacyl-CoA synthase family
0.874  At1g67990  Caffeoyl-CoA 3-O-methyltransferase like
0.857  At5g49070  beta-ketoacyl-CoA synthase family
0.834  At1g03390  N-hydroxycinnamoyl transferase family
0.827  At2g19070  N-hydroxycinnamoyl transferase family
0.817  At5g41890  GDSL-motif lipase/hydrolase family

[Figure 4.6, panel B: pathway diagram with enzyme labels 4CL/ACLL5, HCT/HCT-like, P450, CHS/CHS-like, DFR/DFR-like, CCoAOMT/CCoAOMT-like and CCR/CCR-like, leading to flavonoids and monolignols.]

Figure 4.6: (A) Selected genes with high co-expression coefficients with ACLL5/At1g62940 involved in lipid and phenylpropanoid-like metabolism (Correlated Gene Search: Tissue and development v.1, 237 data, threshold 0.80). (B) Putative duplicated phenylpropanoid pathway.

A number of genes annotated as encoding phenylpropanoid-like enzymes were co-expressed with ACLL5. As shown in Figure 4.6A, these included genes encoding two chalcone synthase (CHS)-like enzymes and a dihydroflavonol reductase (DFR)-like enzyme, all related to enzymes involved in flavonoid biosynthesis (Figure 4.6B). CHS uses 4-coumaroyl-CoA as one of its substrates. This list also included two genes encoding hydroxycinnamoyl-CoA:shikimate/quinate hydroxycinnamoyltransferase (HCT)-like enzymes, a gene encoding a caffeoyl-CoA O-methyltransferase (CCOMT)-like enzyme and a gene encoding a cinnamoyl-CoA reductase (CCR)-like enzyme.
Interestingly, these genes encode sets of enzymes that mimic two known pathways: the pathway to monolignols, in which enzymes act in sequence to convert a hydroxycinnamoyl-CoA to the corresponding alcohol (Figure 4.6B), and the pathway to flavonoids, in which CHS catalyzes condensation of 4-coumaroyl-CoA with malonyl-CoA at the entry point of flavonoid metabolism. None of these genes has yet been biochemically or biologically characterized, but their co-expression with ACLL5 suggests the existence of one or more anther-expressed phenylpropanoid-like pathways, analogous to the well-characterized monolignol and flavonoid biosynthetic pathways, both of which could involve ACLL5 (Figure 4.6).

Phenylpropanoid-like genes such as the CCR-like and CCOMT-like genes co-expressed with ACLL5 have been described in several reports (Costa et al., 2003; Ehlting et al., 2005; Raes et al., 2003), and occur in clades distinct from those known to be involved in phenylpropanoid and monolignol biosynthesis. However, less phylogenetic information is available for CHS-like and DFR-like genes. To obtain further information about the CHS-like and DFR-like genes co-expressed with ACLL5, I carried out phylogenetic analyses of homologues identified by in silico searches of sequence information in Arabidopsis and other plant species. In addition to Arabidopsis, poplar, rice, and Physcomitrella CHS, CHS-like, DFR, and DFR-like genes, I included CHS-like and DFR-like genes from tobacco (Varbanova et al., 2003), pine (Walden et al., 1999), rice (Yau et al., 2005), and Silene latifolia (Ageez et al., 2005). These genes were identified from the literature on the basis of high expression during uninucleate microspore development (Varbanova et al., 2003; Walden et al., 1999; Yau et al., 2005) and during bursts of tapetum cell activity (Ageez et al., 2005).
These analyses, shown in Figures 4.7A and 4.7B, indicated that the CHS-like and DFR-like genes co-expressed with ACLL5 are in clades distinct from those containing bona fide Arabidopsis CHS and DFR genes. These clades also contain CHS-like and DFR-like homologues from other species that are expressed in tapetum cells and/or in concert with microsporogenesis. In addition, further phylogenetic analysis showed that a tobacco 4CL-like gene expressed in the tapetum during microsporogenesis (Atanassov et al., 1998) is in the same clade as ACLL5 and its poplar and rice homologues (Figure 4.7C). Also, expression of the single rice homologue in this clade, based on massively parallel signature sequencing (MPSS) data, is strongly preferred in immature panicles (inflorescences). These data support a role for the Arabidopsis ACLL5 co-expressed CHS-like and DFR-like genes in anther and microspore development, possibly functioning together with ACLL5 in one or more exine-specific biochemical pathways conserved in angiosperms.

Figure 4.7: Phylogenetic analyses of protein sequences from Arabidopsis, poplar, rice, and Physcomitrella, and from genes of other species expressed during microspore development (star). Maximum-likelihood (ML) trees were built using 100 or 500 bootstrap replicates in PhyML 2.4.4. Bootstrap values above 70 are shown on branches. ACLL5 and co-expressed genes are highlighted (oval). (A) Chalcone synthase (CHS) and chalcone synthase-like (CHSL) families. (B) Dihydroflavonol reductase (DFR) and dihydroflavonol reductase-like families. (C) 4-coumarate:CoA ligase (4CL) and acyl-CoA ligase 5 and homologues (ACLL).

4.3 Discussion

Altogether, our data support the hypothesis that ACLL5, a gene transiently and preferentially expressed in the tapetum, is required for normal development and maturation of pollen grains, and is involved in pollen wall formation.
Furthermore, our phylogenetic analyses provide evidence for ACLL5 homologues in poplar, rice, and tobacco that may have a similar function in those species. In addition, given that ACLL5 is an enzyme closely related to bona fide 4CLs, and that in silico co-expression results revealed a number of genes co-expressed with ACLL5 that encode enzymes related to phenylpropanoid and lipid metabolism, it is reasonable to speculate that ACLL5 has a phenylpropanoid-like or fatty acid substrate and provides precursors for biosynthesis of an essential sporopollenin polymer.

The exact composition of the pollen wall is not yet well defined, and may vary greatly between species. However, it is generally accepted that tapetum cell function is conserved among plant species and is required for exine production and deposition (Scott et al., 2004). Sporopollenin, a heterogeneous polymer found in the pollen exine layer, is composed of long-chain fatty acids and phenolic compounds. The phenolic monomers are coupled by ether linkages, which are characteristic of polyphenolic polymers such as lignin (Scott et al., 2004). It has been shown that enzymes of the phenylpropanoid pathway, such as PAL and CHS, are required for male fertility in some plants (Matsuda et al., 1996; van der Meer et al., 1992), reinforcing an essential role for phenylpropanoids in exine composition. However, the Arabidopsis mutant tt4 (transparent testa4), a loss-of-function lesion in the single-copy Arabidopsis CHS gene, which encodes the first enzyme committed to flavonoid biosynthesis, exhibits normal pollen development (Ylstra et al., 1996). This suggests that flavonoids themselves are not required for pollen viability and sporopollenin biosynthesis, and that other phenolic compounds may play such roles (Boavida et al., 2005).
The tapetum contribution to exine synthesis and deposition starts while the microspores are still attached in tetrads and continues through the vacuolated stages until the first pollen mitosis is almost completed (Boavida et al., 2005). The spatio-temporal pattern of ACLL5 gene expression, revealed by in situ hybridization, is consistent with the timing of this tapetum function. ACLL5 expression in the tapetum during stages 7 and 8 of anther development, which are characterized by tetrad formation and microspore release, supports the hypothesis that ACLL5 is necessary for production of phenylpropanoid or fatty acid related molecules in the early steps of exine biosynthesis. The spatio-temporal pattern of ACLL5 expression is also consistent with the timing of the defect in pollen development during anther maturation in the acll5 mutant. Loss of ACLL5 function in tapetum cells of stage 7 anthers, when ACLL5 is most highly expressed in these cells (Figure 4.5), is consistent with a defect in biosynthesis of one or more critical secreted sporopollenin components. This would lead to defective microspores which, when released from tetrads in stage 8 anthers, fail to develop normal exine and abort at stage 9, as observed in the acll5 mutant (Figure 4.4).

The ACLL5 substrate, and the nature of the biochemical pathway that uses the CoA esters presumed to be the product of the enzyme, are unknown. Genes closely related to ACLL5 are found in poplar, rice, and tobacco (Chapter 3; Figure 4.7C), suggesting that the metabolic pathway in which it participates is conserved among angiosperms. In silico analysis of Arabidopsis genes strongly co-expressed with ACLL5, together with tapetum-expressed genes in other species known from the literature, provides clues to possible enzymes that could function with ACLL5 in one or more common pathways. Prominent among the Arabidopsis co-expressed genes are those encoding phenylpropanoid-like enzymes (Figure 4.6).
These include two CHS-like genes and a DFR-like gene. Interestingly, CHS-like and DFR-like genes with anther- and/or tapetum-preferred expression patterns have been described in several other species. Phylogenetic analysis of CHS and CHS-like genes in Arabidopsis, poplar, rice, and other species (Figure 4.7A) shows that the Arabidopsis co-expressed CHS-like genes At4g34850 and At1g02050 occur in a clade distinct from true CHS. This clade contains representatives from poplar and rice, as well as tapetum-expressed genes from pine, Silene, and tobacco, suggesting that ACLL5 could function upstream of a CHS-like polyketide synthase in the biosynthesis of a structural polyketide component of sporopollenin, distinct from flavonoids. Similarly, the co-expressed DFR-like gene At4g35420 occurs in a clade distinct from true DFR genes (Figure 4.7B), which also contains poplar and rice representatives. Interestingly, one of the rice DFR-like genes in this clade is expressed in the tapetum of developing anthers (Yau et al., 2005). The co-expressed DFR-like gene At4g35420 could encode a reductase involved in modification of a CHS-like derived polyketide constituent of sporopollenin, or a reductase that acts directly on the CoA ester product of the ACLL5-catalyzed reaction, analogous to CCR in monolignol biosynthesis (Lauvergeat et al., 2001).

Also of interest among the phenylpropanoid-like co-expressed genes are a CCR-like gene and a CCOMT-like gene (CCRL6 and CCOMT2; Ehlting et al., 2005), as well as two genes encoding N-hydroxycinnamoyl transferase family proteins (HCT-likes). The CCRL6 gene occurs in a clade of genes encoding reductases distinct from DFR, from the ACLL5 co-expressed DFR-like gene At4g35420, and from bona fide CCR genes, and this clade also contains rice and poplar members (Figure 4.7B).
As illustrated in Figure 4.6, one interpretation is that ACLL5 and this set of co-expressed phenylpropanoid-like genes encode enzymes in a pathway required for biosynthesis of a phenolic sporopollenin constituent, analogous to the well-characterized sequence of enzymatic steps leading to monolignol biosynthesis.

Thus, co-expression analysis has revealed potentially novel phenylpropanoid-like biochemical pathways in which ACLL5 could play a key role by providing CoA ester substrates, and one or more of these pathways could be involved in biosynthesis of crucial sporopollenin constituents. This hypothesis could be further tested by determining the spatio-temporal expression patterns of the co-expressed genes during anther development, to establish how well they coincide with ACLL5 expression, and by obtaining loss-of-function alleles of these genes to determine whether, like ACLL5, they are required for male fertility and pollen development.

Phenylpropanoid genes have been shown to be transcriptionally regulated by MYB transcription factors, which control many aspects of phenylpropanoid metabolism in plants (Douglas, 1996; Hauffe et al., 1991; Rogers and Campbell, 2004). The gene encoding the transcription factor MYB99 is strongly co-expressed with ACLL5 (Appendix 1), and could thus play a role in regulating the pathways of secondary metabolism related to ACLL5 function. To look for evidence of co-regulation of ACLL5 and co-expressed phenylpropanoid-like genes, I performed an in silico search of the PLACE 25.0.1 database for consensus matches of regulatory elements in the promoter regions of these genes. This search identified a plant MYB binding site (MYBPLANT) that has been reported to activate genes of phenylpropanoid metabolism in Antirrhinum majus (snapdragon) flowers (Sablowski et al., 1994).
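A promoter scan of the kind just described can be sketched as a pattern search on both DNA strands. This is only an illustration and not the PLACE search itself: the function names and the promoter sequence are invented, and the MYBPLANT pattern is the one printed in Table 4.1 of this chapter, which may be shorter than the full database entry.

```python
import re

# Consensus for MYBPLANT as printed in Table 4.1 (assumed here; the full
# PLACE entry may be longer).
MYBPLANT = "ACC[AT]A[AC]"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(promoter, pattern):
    """Return sorted (start, strand, matched_text) tuples for both strands.

    `start` is the 0-based position on the forward strand. A lookahead is
    used so that overlapping matches are all reported.
    """
    seq = promoter.upper()
    scanner = re.compile("(?=(" + pattern + "))")
    hits = [(m.start(), "+", m.group(1)) for m in scanner.finditer(seq)]
    for m in scanner.finditer(reverse_complement(seq)):
        # Map the reverse-strand coordinate back onto the forward strand.
        start = len(seq) - m.start() - len(m.group(1))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Hypothetical 24 bp promoter fragment, not a real ACLL5 sequence.
toy_promoter = "ttgaACCTACggcatTGTAGGTcc"
hits = find_motif(toy_promoter, MYBPLANT)
```

Short degenerate patterns such as this one are expected to occur by chance in any promoter of a few kilobases, so a match is a lead for follow-up rather than evidence of binding.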
Additional consensus cis elements related to flower development, pollen development, and MYB binding were also present in the promoter regions of ACLL5 and co-expressed phenylpropanoid-like genes (Table 4.1).

Table 4.1: Consensus cis element matches in the promoter regions of ACLL5 and co-expressed phenylpropanoid-like genes.

Element             Sequence                 Description
23BPUASNSCYCB1      ACAAA                    MYB binding core required for M-phase expression. Related to cell cycle.
AGAMOUSATCONSENSUS  [AGT]CC[AT][AT][AT]      Indispensable for AGAMOUS function in flower development
AGATCONSENSUS       [AT]CC[AT][AT][AT]       Indispensable for AGAMOUS function in flower development
AGL1ATCONSENSUS     [AGT]CC[AT][AT][AT]      Sequence for AtAGL1, MADS-box domain gene expressed in transition to flowering
AGL3ATCONSENSUS     [AT]C[CT]A[AT][AT]       Sequence for AtAGL3, MADS-box domain gene expressed in above-ground vegetative organs
CIACADIANLELHC      A[ACGT][ACGT][ACGT][AT]  Region necessary for circadian expression in tomato
MYBPLANT            ACC[AT]A[AC]             Plant MYB binding site, sequence related to box P in promoters of phenylpropanoid genes
PALBOXPPC           [AC][AC]C[AC]A[AC]       Box P. One of three cis-acting element boxes of phenylalanine ammonia lyase (PAL)

Given the high lipid content of sporopollenin in the exine, and the fact that genes encoding fatty acid-related enzymes are also co-expressed with ACLL5, an alternative to an ACLL5 function in a phenylpropanoid-like metabolic pathway is that this enzyme plays a crucial role in fatty acid metabolism and uses a fatty acid derived substrate. For example, ACLL5 could participate in the same pathway as MS2, which is strongly co-expressed with ACLL5 and apparently encodes a long-chain fatty acid reductase required for sporopollenin deposition. It could also play a role in another lipid-related pathway, such as those that yield the lipid-rich pollen coat formed after sporopollenin deposition.
The pollen coat fills the gaps between the exine structures and confers important functions such as protection from dehydration and pollen-stigma recognition (Boavida et al., 2005). In contrast to the exine, the pollen coat is easily extractable and has been extensively analyzed, revealing that its major components are non-polar esters of medium-chain, long-chain, and very-long-chain fatty acids, as well as lipases and other proteins attached to the pollen surface. However, deposition of the pollen coat must be timed to occur after deposition of the exine. The temporal pattern of ACLL5 expression seen in our in situ experiments (Figure 4.5), concomitant with callose wall dissolution at stage 7 of anther development, correlates less well with the timing of pollen coat deposition than with the timing of exine deposition. Therefore, ACLL5 is more likely to participate in the biosynthesis of either phenolic or fatty acid constituents of sporopollenin in the exine, prior to deposition of the pollen coat.

Pollen development and pollen wall biosynthesis are very complex processes, involving more genes than any other single developmental process in plants (Scott et al., 2004), many of which are expressed in similar spatio-temporal patterns during anther development. In this light, it should be kept in mind that genes showing high co-expression coefficients with ACLL5 in microarray experiments are not necessarily co-regulated or part of the same biochemical pathway. Still, it is interesting to observe the various classes of co-expressed genes that might, collectively, orchestrate processes in pollen development and pollen wall formation. One class of proteins with several representatives encoded by genes co-expressed with ACLL5 is the lipid transfer proteins (LTPs). It has been suggested that LTPs can bind fatty acids and acyl-CoA esters, facilitating the secretion and deposition of lipophilic molecules onto cell walls (Arondel et al., 2000).
Analogously, LTPs could participate in a similar process, transporting molecules from the tapetum cells to the pollen wall. The same could be speculated for the ABC transporter co-expressed with ACLL5, which belongs to the WBC subfamily of ABC transporters (Sanchez-Fernandez et al., 2001). Members of the WBC subfamily have been shown to transport lipids to cell walls (Pighin et al., 2004) and could function in the transport of related molecules to the pollen wall. Other genes co-expressed with ACLL5 include those encoding glycosyl hydrolases. This class of enzymes would be necessary for callose degradation and microspore release from the tetrads during stage 7 of anther development. In addition, two uncharacterized genes encoding cytochrome P450 enzymes were co-expressed with ACLL5. These could be involved in hydroxylation of phenolic constituents of sporopollenin, or in modification of other sporopollenin constituents.

Another possible role for ACLL5 that would be consistent with the phenotype observed in the acll5 mutant is participation in a vital tapetum-specific metabolic pathway unrelated to exine formation. Loss of ACLL5 enzyme function could result in improper tapetum function and degeneration of the tapetum cells, which would lead indirectly to improper pollen wall formation and abortion of the pollen grains. Degradation of the tapetum is a normally occurring and tightly regulated physiological process. In normal development, the tapetum starts to degenerate only after stage 10 (Sanders et al., 1999). Although the tapetum cells in the acll5 mutant appeared to degenerate earlier than in wild type, I could not determine whether this early degeneration results from accelerated programmed cell death (PCD) or from a non-specific process. Additional detailed ultrastructural analysis of the tapetum cells in the mutant, to look for early signs of PCD, would help to address this question.
However, I did not observe any obvious aberrations in tapetum appearance before the degradation of the microspores, suggesting that tapetum function itself is not strongly affected in the acll5 mutant. In addition, our in situ hybridization results showed that ACLL5 expression in the tapetum is very transient, being restricted largely to stages 7 and 8 of anther development and returning to background levels well before initiation of tapetum degeneration. Therefore, the early degeneration of the tapetum in the acll5 mutant is more likely a consequence, and not a cause, of microspore malformation and subsequent degradation. The ACLL5 expression pattern is more consistent with a role for ACLL5 in the production of sporopollenin compounds in the exine of the pollen wall than with a role in the functioning of the tapetum itself.

What remains unclear is the exact cause of microspore degradation in the acll5 mutant. From an evolutionary perspective, it would be an extreme disadvantage for the plant to release its genetic material in a defective "package". Healthy pollen grains that can survive the obstacles between anther release, stigma recognition, and germination are crucial for the survival of the species. It is likely that there are "check points" that verify the fidelity of the pollen developmental process, with a mechanism to eliminate defective microspores, and that this mechanism is engaged in the acll5 mutant. An alternative hypothesis is that, without the physical strength provided by a normal pollen wall in the acll5 mutant, the pollen grains simply collapse due to physical pressures.

As described in the introduction, detailed chemical analysis of the pollen wall and elucidation of the chemical structure of sporopollenin are daunting tasks, given the biochemical nature of the exine.
Therefore, combined genetic and bioinformatics approaches, such as those taken in this study, are important for generating hypotheses regarding the structure and biological function of the pollen wall, and the nature of the biochemical networks required for its development and deposition. This study opens the door to further testing of the hypothesis that ACLL5 is involved in a pathway required for the biosynthesis of uncharacterized phenolic or lipid-based constituents of the pollen wall, for example by chemical analysis of sporopollenin in the acll5 mutant and by investigation of potential male-sterile phenotypes of mutants in co-expressed genes.

Finally, this work demonstrates the usefulness of comparative genomics in understanding the role of a particular gene in a given biological system. Figure 4.7C and previous results (Chapter 3) show that single-copy ACLL5 homologues are present in poplar, rice, and maize, and that a closely related tapetum-specific gene is present in tobacco, supporting a conserved biological function for this enzyme in angiosperms. Furthermore, by taking advantage of the dioecious nature of poplar species, I showed that the poplar ACLL5 homologue is preferentially expressed in male flowers (Chapter 3), consistent with a role for this enzyme in anther development. The Arabidopsis ACLL5 gene and its poplar homologue share 80% identity at the amino acid level, which is comparable to the level of identity between the 4CL representatives of both species. The availability of both poplar and Arabidopsis ACLL5 homologues allows future testing of the hypothesis that they are orthologous genes with the same biochemical function, for example by complementation of the acll5 phenotype with the poplar ACLL5 homologue. Experiments are underway to test this hypothesis.
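For readers unfamiliar with how an identity figure such as the 80% quoted above is obtained, the following minimal sketch computes percent identity from two already-aligned sequences. The convention used (identical residues divided by the number of columns in which neither sequence has a gap) is an assumption, since the text does not state which denominator was used, and the two peptide strings are invented rather than real ACLL5 sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are skipped under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned peptide fragments, for illustration only.
toy_a = "MKT-LLVAGS"
toy_b = "MRTALLV-GS"
identity = percent_identity(toy_a, toy_b)
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values for the same alignment, so identity figures are comparable only when computed the same way.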
CHAPTER 5 - CONCLUSIONS AND FUTURE DIRECTIONS

5.1 A timeline of discoveries

The study of the ACLL gene family presented in this thesis was a discovery-based project in which the starting null hypothesis was that ACLL enzymes have 4CL activity, given their close relationship to bona fide 4CL enzymes. This hypothesis was rejected in the early phases of the project by preliminary in vitro biochemistry studies done in the Douglas lab with ACLL5 expressed heterologously in E. coli (unpublished data), and by additional biochemical experiments done by other labs interested in ACLL genes (E. Kombrink, personal communication). Therefore, without knowing which kinds of substrates ACLLs could be active with, information regarding ACLL gene function had to be generated from little available data. Given initial evidence, based on EST support, that all ACLL genes are expressed, a number of experiments and analyses were carried out to obtain clues to the functions of ACLL genes in Arabidopsis.

In parallel with my thesis studies, more information about Arabidopsis genes became publicly available. First, after performing ACLL amino acid alignments for the phylogenetic analyses discussed in Chapter 3, I realized that most ACLL proteins contain a conserved C-terminal peroxisomal target signal (PTS1), which was subsequently confirmed by the publication of a database of putative Arabidopsis peroxisomal proteins (Reumann et al., 2004). My attention then focused on possible ACLL functions in the peroxisome. Analysis of the expression of ACLL promoter-GUS fusion constructs in transgenic Arabidopsis revealed that ACLL expression was not limited to any particular tissue or cell type (data not shown), which made it difficult to generate specific hypotheses about putative functions based on developmental expression patterns.
However, analysis of gene fusion expression revealed a response to wounding for some clade D ACLL genes.

With access to information from the initial stages of the poplar genome sequence assembly in early 2005, I was able to identify potential poplar ACLL homologues, and a comparative genomics approach between Arabidopsis and poplar became possible (Chapter 3). I focused on comparing poplar and Arabidopsis ACLL developmental expression patterns and their responses to stress treatments, which confirmed that some clade D Arabidopsis and poplar homologues have increased expression upon wound and herbivory treatments. In 2005 it was reported that the protein encoded by the clade E Arabidopsis ACLL9 gene converts the octadecanoid pathway intermediate OPDA to the corresponding CoA ester in vitro (Schneider et al., 2005), suggesting a role for this gene in the JA pathway. However, a putative role of ArathACLL9 in defense-related JA biosynthesis was not corroborated by our gene expression data, which showed neither increased ACLL9 expression after mechanical wounding nor stress-activated expression of its closest poplar homologues. At around the same time, I used newly available tools for identifying co-expressed genes based on data from public microarray experiments, such as Expression Angler (Toufighi et al., 2005) and Prime Correlated Gene Search. This led me to suggest a function for ArathACLL4 in the JA pathway, which was independently confirmed in experiments performed by Koo et al. (2006).

The comparative genomics approach revealed highly conserved ACLL genes with similar expression patterns, a striking example being ArathACLL5 and PoptrACLL13 in clade A. Expression of each gene was flower-preferred, suggesting a common and conserved function in that organ in both organisms. In 2006 I was able to isolate a homozygous line of an acll5 loss-of-function transposon insertion mutant from a segregating population.
This mutant had a male-sterile phenotype, consistent with the anther-localized gene expression revealed by mining of microarray expression data. Further analysis of the mutant phenotype, collaborative in situ hybridization experiments, and co-expression analysis allowed me to generate specific hypotheses regarding the roles of ACLL5 and homologues from other species in a tapetum-localized biochemical pathway required for pollen exine biosynthesis (Chapter 4).

5.2 Future Directions

5.2.1 Mutant analysis

As seen for ArathACLL5, functionally analyzed in Chapter 4, mutants affecting proper gene expression can be powerful tools for assessing gene function. Arabidopsis knock-out lines are available for a large fraction of Arabidopsis genes and can be identified by simple in silico searches (Salk Institute, bin/tdnaexpress).

I obtained knock-out lines for all nine ACLLs and have been able to isolate homozygous mutant lines for four of them, including ACLL5. Lines for ACLL1, ACLL4, and ACLL9 were isolated, but no visible phenotype could be observed under laboratory growing conditions (data not shown). I did not elaborate on these lines in this thesis because the same knock-out mutants for ACLL4 and ACLL9, showing no phenotype, had been described elsewhere (Koo et al., 2006; Schneider et al., 2005). Single knock-out mutants are not always informative, since genes may have redundant functions, particularly genes that are part of gene families. ACLL1 and ACLL4 are in the same clade (D) and are arranged in tandem, which makes the generation of double mutants by cross-pollination and genetic recombination nearly impossible. As discussed in Chapter 3, based on the known function of ACLL4, it is possible that other genes in clade D have similar functions, encoding OPDA/OPC8-CoA ligases. If this hypothesis is correct, a mutant with all clade D genes knocked out should be unable to make JA, and should show a phenotype related to JA deficiency.
Such a mutant would be valuable for the study of plant defense and developmental mechanisms that depend on JA signaling.

An alternative to insertion knock-outs for reverse genetic analysis relies on a natural mechanism of gene silencing in response to virus attack (Waterhouse and Helliwell, 2003). Most plant viruses have single-stranded RNA genomes, which are released into the host plant cell upon infection. Double-stranded RNA (dsRNA) is formed by replication of viral RNA. The presence of dsRNA triggers a plant defense response, resulting in cleavage of specific RNA by an enzyme termed Dicer. Dicer-generated ~22 nt RNA segments then associate with an endonuclease, forming a complex that cuts any RNA that hybridizes to it. In nature, therefore, this process culminates in the degradation of homologous viral RNA molecules. Double-stranded RNA can be artificially generated in plant cells by expressing a construct that leads to transcription of an mRNA that complements itself to form a hairpin structure. This hairpin is recognized as dsRNA and the plant response described above is activated, leading to degradation of the corresponding endogenous mRNA. This RNAi gene silencing approach can generate plants with mRNA levels ranging from near wild type to undetectable (Waterhouse and Helliwell, 2003). One strategy for silencing expression of all genes in clade D could take advantage of their sequence similarity by creating an RNAi silencing construct targeted to a region of sequence conserved in all clade D genes. This could potentially silence all five Arabidopsis ACLL genes in this clade, thus eliminating the entire suite of putatively similar enzymatic functions. Alternatively, genes in clade D could be heterologously expressed to produce ACLL enzymes for in vitro testing of OPDA/OPC8 activity.

As seen in Chapter 3, I identified a network of genes that are co-expressed with ArathACLL4 (Figure 3.6).
In addition to co-expressed genes that encode enzymes that may be in the same biochemical pathway, this kind of analysis can also identify potential regulatory genes. Central within the ArathACLL4 network is a transcription factor encoded by At3g44260. This CCR-NOT transcription complex protein could potentially be involved in regulation of the octadecanoid pathway biosynthetic genes. A knock-out mutant for this gene could be used to test for changes in expression of ACLL4, the remaining four members of clade D, and the co-expressed octadecanoid pathway genes. If this hypothesis were confirmed and At3g44260 is a central regulator of the octadecanoid pathway, then a single mutant would be sufficient to generate plants deficient in JA biosynthesis via the octadecanoid pathway. This mutant could also potentially shed light on the functions of other clade D ACLL genes or, alternatively, be useful for understanding alternative pathways of JA synthesis or alternative defense and regulatory mechanisms in Arabidopsis.

Similarly, in Chapter 4, I discussed the possibility that the transcription factor gene MYB99 is a regulator of ACLL5 and other genes co-expressed in the tapetum that encode enzymes involved in exine biosynthesis. It would be interesting to test MYB99 loss-of-function mutants for loss of expression of potential target genes such as ACLL5 and co-expressed genes, and also to test the myb99 mutant phenotype with regard to male sterility and impairment of pollen development. This approach would help test the hypothesis that the ACLL5 co-expressed genes are in fact co-regulated, and provide support for roles in a common biochemical pathway required for exine biosynthesis during pollen development.
In addition, phenotypic analysis of null mutants of co-expressed genes, and especially the effect of such mutations on pollen development and male fertility, would confirm their importance in this biological process and would be consistent with functions in the same biochemical pathway as ACLL5. Experiments are underway in the Douglas lab to address these questions.

5.2.2 acll5 mutant complementation

One well-accepted method for proving gene function is genetic complementation of a loss-of-function mutation. As described in Chapter 4, the loss-of-function mutant acll5-1 has a male-sterile phenotype. Although co-segregation analysis shows that the mutation is tightly linked to the phenotype, complementation of the acll5 phenotype with a wild-type ACLL5 gene would provide indisputable proof that the mutation in ACLL5 is responsible for the male-sterile mutant phenotype.

To accomplish this, I have built a construct composed of the ACLL5 genomic region driven by the 2 kb ACLL5 native promoter, cloned in the pGreen 0029 T-DNA vector (Hellens et al., 2000). In preliminary work, plants heterozygous for the acll5 mutation were transformed with an Agrobacterium strain carrying this construct. However, no transformants were obtained, and this experiment will have to be repeated using a different binary vector such as pCambia (http://www.cambia.org). If the male-sterile phenotype is indeed due to disruption of the ACLL5 gene, T1 transformant plants derived from the heterozygous background are predicted to have wild-type (male-fertile) phenotypes, and this trait would be inherited by T2 progeny of such lines.

Based on the results of the comparative genomics approach, it would be interesting to test the ability of ACLL5 homologues from other plant species to complement the Arabidopsis acll5-1 mutation. For example, PoptrACLL13, which is preferentially expressed in male flowers (Chapters 3 and 4), and the single rice homologue in clade A could be tested.
This heterologous complementation approach is a valuable tool for confirming gene function in organisms for which less information and fewer resources are available than for Arabidopsis. For example, it can take several years for a poplar tree to flower, but testing complementation of the Arabidopsis mutant would take only months to reveal whether PoptrACLL13 has a function similar or identical to that of ArathACLL5 in pollen development.

5.2.3 Mutant studies in poplar and other plant species

In poplar, reverse genetic approaches employing RNAi-induced gene silencing are possible. Although generating transgenic plants requires the labor-intensive and time-consuming process of plant regeneration in tissue culture following co-cultivation of leaf discs with Agrobacterium, it is possible to obtain transgenic poplar plants with reduced levels of gene expression via this technology (Meyer et al., 2004). Given that ArathACLL4 encodes an OPDA:CoA ligase that functions in JA biosynthesis (Koo et al., 2006), and that PoptrACLL4 and PoptrACLL5 may have the same function based on expression and phylogenetic analyses (Chapter 3), an RNAi strategy could be used to generate transgenic poplar plants with reduced or null levels of PoptrACLL4/5 expression. Since, in contrast to Arabidopsis, these two highly similar genes appear to be the only genes encoding OPDA:CoA ligase in poplar, such transgenic plants would be predicted to have reduced or undetectable levels of JA. Such plants would be valuable for studying the role of JA in plant defense against herbivory and other stresses, and for testing in vivo the postulated roles in defense signaling of alternative signaling molecules, such as upstream intermediates in the octadecanoid pathway that are generated in the peroxisome (Stintzi et al., 2001).
A similar approach using the Arabidopsis opr3 mutant, defective in the isoform of OPDA reductase required for JA biosynthesis, has been used successfully to determine the role of OPDA in wound-induced signal transduction. Genes previously known to be JA-dependent were up-regulated in the opr3 mutant (Stintzi et al., 2001).

Using a similar strategy, RNAi silencing of PoptrACLL13 in transgenic poplar would be predicted to lead to defects in pollen development and male fertility. While it normally takes five or more years for poplar to flower, it has recently been shown that over-expression of either of the two poplar FT genes leads to rapidly accelerated flowering, sometimes observed even in tissue culture (Bohlenius et al., 2006; Hsu et al., 2006), making it feasible to express RNAi constructs in early-flowering poplar plants. If this approach were successful in generating male-sterile poplar trees, it could be used as a tool to generate pollen-less trees for biotechnology purposes (C. de Azevedo Souza and C.J. Douglas, US Provisional Patent "A method for generating male sterile plants"). This technology could be particularly useful in wind-pollinated trees such as poplar, which release large amounts of pollen. Male-sterile transgenic trees that fail to form pollen would be desirable to prevent gene dispersal from transgenic or exotic poplar trees to wild relatives via cross-pollination.

5.2.4 Biochemical characterization of poplar clade D ACLLs

In Chapter 3 I suggested a function in JA biosynthesis and plant defense for the clade D poplar ACLL homologues, which are the closest relatives of ArathACLL4, which encodes an OPDA/OPC8-CoA ligase required for JA biosynthesis (Koo et al., 2006).
The hypothesis that the poplar homologues PoptrACLL4 and PoptrACLL5 also encode OPDA/OPC8-CoA ligases is supported by their strongly up-regulated expression after wounding, herbivory, and MeJA treatments, and by the experimentally demonstrated peroxisomal localization of PoptrACLL5. However, characterization of the PoptrACLL5 enzyme by expression of the recombinant protein in E. coli was unsuccessful (data not shown).

The question of the enzymatic properties of PoptrACLL4/5, and particularly their ability to use OPDA/OPC8 as a substrate, remains of considerable interest in light of the known activity of the enzyme encoded by the ArathACLL4 homologue. Using a eukaryotic host such as yeast instead of E. coli for heterologous expression might be a useful alternative to address this question.

5.2.5 4CL/ACLL structural information and identification of substrates

With the exception of ArathACLL4 and ArathACLL9, no published experimental information is available regarding ACLL substrates. There is currently no protein structural information available for 4CL, let alone ACLL enzymes, that could aid in predicting substrates and the structural features relevant to substrate specificity. Two known crystal structures of adenylate-forming enzymes are those of the firefly Photinus pyralis luciferase (EC (Conti et al., 1996) and the bacterium Brevibacillus brevis gramicidin S-synthetase 1 (PheA; CAA33603) (Conti et al., 1997). While these enzymes share limited sequence identity with 4CL (less than 20%), information from these structures has allowed prediction of the nature of the 4CL substrate binding pocket, and has been used to predict amino acid residues that determine 4CL substrate binding (Stuible and Kombrink, 2001).
Based on information obtained from the crystal structure of the phenylalanine-activating domain of PheA, and amino acid sequence comparisons between PheA and 4CL, 10 amino acid residues were identified that could form the 4CL substrate binding pocket (Stuible and Kombrink, 2001). The authors took advantage of the fact that a single member of the Arabidopsis 4CL family (4CL2) is unable to accept ferulate as a substrate. This allowed them to pinpoint amino acid residues that are absent, or not conserved, in the putative 4CL2 substrate binding pocket and are therefore candidates for causing the lack of 4CL2 activity towards ferulate.

In later studies, the same group used homology modeling to predict the 4CL2 tertiary structure by alignment to the known PheA structure (Schneider et al., 2003). Although the structure of luciferase, which is more closely related to 4CL than PheA is, is also known, PheA was chosen for this study because of the similarity of substrate structures. This allowed a more accurate prediction of the orientation of the 4CL substrates in the putative binding pocket. Using the 3D model, 12 amino acids in the substrate binding pocket were identified that were predicted to be close enough to the substrate to form electron interactions. Site-directed mutagenesis of the targeted amino acids was then used successfully to design ferulic acid-, sinapic acid-, and cinnamic acid-activating At4CL2 variants (Schneider et al., 2003). Prior knowledge of 4CL substrates was therefore indispensable for testing hypotheses about the amino acids responsible for substrate specificity in 4CL2.

In the future, one might be able to use this same approach to make predictions regarding the substrate specificities of ACLL enzymes. A crystal structure of ArathACLL4 would be particularly advantageous given that its substrate is known. These data would show which amino acids are important for substrate recognition.
Therefore, homology modeling of additional members of clade D ACLLs, including poplar representatives, could provide further insights into similarities among putative substrate binding pockets and indicate whether OPDA/OPC8 is a suitable substrate for these enzymes. Further knowledge of crystal structures of closely related enzymes such as 4CLs will also provide insights into how similar the binding pockets of ACLLs are to those of 4CLs. It would be interesting, for example, to compare a 4CL structure with that of ArathACLL5 in clade A, which could have a phenolic substrate based on co-expression analysis, as discussed in Chapter 4.

A large-scale screening approach for identification of ACLL substrates in vitro has been used successfully for ArathACLL (Schneider et al., 2005). The method exploits the ability of the adenylate-forming enzyme luciferase to generate light in the presence of its substrate luciferin and ATP, a reaction that involves ATP hydrolysis and formation of an AMP-bound substrate intermediate. Since ACLL enzymes, as adenylate-forming enzymes, also require ATP, luciferase activity is used as a visual assay for ATP depletion. Thus, loss of luciferase activity upon co-incubation with a recombinant ACLL enzyme and a potential substrate indicates potential ACLL activity against the substrate, provided the activity is high enough to deplete the ATP. In theory, all ACLLs can be screened using this method. Although this approach can be a powerful tool for identifying substrates, one limitation is the need for a library of potential substrates large enough to include compounds for which an ACLL has high enough activity to deplete ATP in the assay. Since ACLLs may have very specific substrate preferences, this could limit the chances of finding the correct substrate using this assay.
Another variable that could confound the assay is the possible need to adjust enzyme assay conditions. However, as more is learned about ACLLs, the multitude of possible substrates will be narrowed to fewer, more likely candidates, making this a potentially powerful approach for screening candidate substrates.

5.2.6 Continuous mining of data

With the ongoing efforts to functionally characterize all genes in Arabidopsis, an enormous amount of information is continually being generated about single genes, biochemical pathways, and biological processes. I found that co-expression studies are an especially useful tool for generating hypotheses regarding biological and biochemical roles for genes of unknown function. The reverse is also true: given a set of genes with known functions in a common process or pathway, this approach is useful for identifying genes encoding enzymes or other proteins that function in uncharacterized parts of the pathway or process. The more expression information is available, the more robust the data pointing to these relationships should become. Therefore, in the near future, data on the networks of genes co-expressed with ACLLs in clades that are still poorly characterized might become easier to interpret, and may allow us to make educated guesses regarding ACLL functions. For example, in Chapter 3 I showed that ArathACLL3 is part of a large set of co-expressed genes, but no obvious functional relationships to ArathACLL3 could be derived from the data. However, as more expression profiling experiments under different environmental, developmental, and genotype-specific conditions are performed, this network of relationships may become clearer, revealing potential biochemical partners of ArathACLL3.

Libraries of insertional mutants, as discussed in section 5.2.1 of this chapter, are also constantly being enlarged, and more DNA insertion lines in selected ACLLs might become available.
For example, if an additional mutant allele of ArathACLL5 becomes available, it would be desirable to verify whether this mutant has the same male-sterile phenotype as acll5-1. This would be an additional indication that the acll5-1 mutation is really the cause of the observed phenotype. This type of evidence would eliminate the need for mutant rescue by genetic complementation as suggested in section 5.2.2.

Given the above, constant mining of publicly available data is important to support functional analysis of genes of unknown function such as ACLL genes. The large amount of knowledge and the large number of tools available are possibly the greatest advantages of working with a model organism such as Arabidopsis.

More information that provides direct or indirect insights into a biological question of interest continually becomes available in the published literature. For example, shortly before this chapter was completed, the biological and biochemical function of Arabidopsis CYP703A2 was published (Morant et al., 2007). CYP703A2 is a single-copy, plant-specific P450 enzyme that was shown to be specifically involved in pollen development. CYP703A2 has the same expression pattern as ArathACLL5 (expressed during pollen formation) and belongs to our list of genes co-expressed with ArathACLL5 (Chapter 4). Mutants lacking expression of CYP703A2 have reduced male fertility and impaired pollen wall development, with the absence of exine. Biochemical characterization of CYP703A2 heterologously expressed in yeast identified lauric acid and in-chain hydroxy lauric acid as the substrate and product, respectively. It is known that sporopollenin, the major component of the pollen exine, is composed of fatty acids and phenolic units, so it should be expected that genes involved in the synthesis of these compounds are co-expressed in the same tissue, during the same developmental stage.
An interesting aspect of the CYP703A2 mutant phenotype is the absence of detectable phenylpropanoids in the sporopollenin, which could indicate that these components can only be attached to the pollen wall if polymerized with fatty acids. The substrate of ArathACLL5 remains a mystery. My data strongly indicate a role in sporopollenin production, and I favor a function in the synthesis of the phenolic components because of the close relationship of ArathACLL5 to true 4CLs and its co-expression with genes encoding phenylpropanoid-like enzymes. However, I cannot rule out a possible role in the synthesis of the fatty acid components of sporopollenin. Therefore, I suggest that in-chain hydroxy lauric acid is a candidate ArathACLL5 substrate that should be tested.

5.3 Final remarks

This work demonstrates the usefulness of comparative genomics in understanding the roles of particular genes in given biological systems. I have used information available from the model plant Arabidopsis thaliana as a tool for gene discovery and to generate functional hypotheses regarding homologous genes in Populus (poplar). This shows the importance of having model organisms with large repositories of information and tools available, such as public global expression data from microarrays, together with less developed model systems such as Populus (Jansson and Douglas, 2007).

Information obtained from targeted expression studies in poplar, such as herbivory- and MeJA-induced up-regulation of gene expression and, in particular, the male flower expression facilitated by the dioecious nature of poplar species, demonstrated that comparative genomics is a two-way road. Whereas it may be more feasible to explore a single organism collectively, knowledge of more than one "model" species will allow us to progress to a holistic view of gene functions in all plant species.

The data presented in the results chapters (Chapters 3 and 4) open new routes to the study of ACLL function in plants: they provide insights into the comparative genomics of gene family evolution, identify a crucial function for ArathACLL5 in pollen development, a key process in the perpetuation of life, and suggest new genes in this process for further study. I hope this newly paved road becomes well traveled.

REFERENCES

Aarts, M.G.M., Hodge, R., Kalantidis, K., Florack, D., Wilson, Z.A., Mulligan, B.J., Stiekema, W.J., Scott, R. and Pereira, A. (1997) The Arabidopsis MALE STERILITY 2 protein shares similarity with reductases in elongation/condensation complexes. Plant Journal, 12, 615-623.

Ageez, A., Kazama, Y., Sugiyama, R. and Kawano, S. (2005) Male-fertility genes expressed in male flower buds of Silene latifolia include homologs of anther-specific genes. Genes Genet Syst, 80, 403-413.

Allina, S.M., Pri-Hadash, A., Theilmann, D.A., Ellis, B.E. and Douglas, C.J. (1998) 4-Coumarate:coenzyme A ligase in hybrid poplar. Properties of native enzymes, cDNA cloning, and analysis of recombinant enzymes. Plant Physiol, 116, 743-754.

Ariizumi, T., Hatakeyama, K., Hinata, K., Inatsugi, R., Nishida, I., Sato, S., Kato, T., Tabata, S. and Toriyama, K. (2004) Disruption of the novel plant protein NEF1 affects lipid accumulation in the plastids of the tapetum and exine formation of pollen, resulting in male sterility in Arabidopsis thaliana. Plant J, 39, 170-181.

Ariizumi, T., Hatakeyama, K., Hinata, K., Sato, S., Kato, T., Tabata, S. and Toriyama, K. (2003) A novel male-sterile mutant of Arabidopsis thaliana, faceless pollen-1, produces pollen with a smooth surface and an acetolysis-sensitive exine. Plant Mol Biol, 53, 107-116.

Arondel, V.V., Vergnolle, C., Cantrel, C. and Kader, J. (2000) Lipid transfer proteins are encoded by a small multigene family in Arabidopsis thaliana. Plant Science, 157, 1-12.

Atanassov, I., Russinova, E., Antonov, L. and Atanassov, A. (1998) Expression of an anther-specific chalcone synthase-like gene is correlated with uninucleate microspore development in Nicotiana sylvestris. Plant Mol Biol, 38, 1169-1178.

Austin, M.B. and Noel, J.P. (2003) The chalcone synthase superfamily of type III polyketide synthases. Nat Prod Rep, 20, 79-110.

Becker-Andre, M., Schulze-Lefert, P. and Hahlbrock, K. (1991) Structural comparison, modes of expression, and putative cis-acting elements of the two 4-coumarate:CoA ligase genes in potato. J Biol Chem, 266, 8551-8559.

Beggs, C.J., Stolzer-Jehle, A. and Wellmann, E. (1985) Isoflavonoid formation as an indicator of UV stress in bean (Phaseolus vulgaris L.) leaves: the significance of photorepair in assessing potential damage by increased solar UV-B radiation. Plant Physiol, 79, 630-634.

Blanc, G., Hokamp, K. and Wolfe, K.H. (2003) A recent polyploidy superimposed on older large-scale duplications in the Arabidopsis genome. Genome Res, 13, 137-144.

Blokker, P., Boelen, P., Broekman, R. and Rozema, J. (2006) The occurrence of p-coumaric acid and ferulic acid in fossil plant materials and their use as UV-proxy. Plant Ecology, 182, 197-207.

Boavida, L.C., Becker, J.D. and Feijo, J.A. (2005) The making of gametes in higher plants. Int J Dev Biol, 49, 595-614.

Bohlenius, H., Huang, T., Charbonnel-Campaa, L., Brunner, A.M., Jansson, S., Strauss, S.H. and Nilsson, O. (2006) CO/FT regulatory module controls timing of flowering and seasonal growth cessation in trees. Science, 312, 1040-1043.

Bonaventure, G., Gfeller, A., Proebsting, W.M., Hortensteiner, S., Chetelat, A., Martinoia, E. and Farmer, E.E. (2007) A gain-of-function allele of TPC1 activates oxylipin biogenesis after leaf wounding in Arabidopsis. Plant J, 49, 889-898.

Cenzano, A., Vigliocco, A., Kraus, T. and Abdala, G. (2003) Exogenously applied jasmonic acid induces changes in apical meristem morphology of potato stolons. Ann Bot (Lond), 91, 915-919.

Chen, X., Goodwin, S.M., Boroff, V.L., Liu, X. and Jenks, M.A. (2003) Cloning and characterization of the WAX2 gene of Arabidopsis involved in cuticle membrane and wax production. Plant Cell, 15, 1170-1185.

Clough, S.J. and Bent, A.F. (1998) Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant J, 16, 735-743.

Conti, E., Franks, N.P. and Brick, P. (1996) Crystal structure of firefly luciferase throws light on a superfamily of adenylate-forming enzymes. Structure, 4, 287-298.

Conti, E., Stachelhaus, T., Marahiel, M.A. and Brick, P. (1997) Structural basis for the activation of phenylalanine in the non-ribosomal biosynthesis of gramicidin S. Embo Journal, 16, 4174-4183.

Costa, M.A., Collins, R.E., Anterola, A.M., Cochrane, F.C., Davin, L.B. and Lewis, N.G. (2003) An in silico assessment of gene function and organization of the phenylpropanoid pathway metabolic networks in Arabidopsis thaliana and limitations thereof. Phytochemistry, 64, 1097-1112.

Croteau, R., Kutchan, T.M. and Lewis, N. (2000) Natural Products (Secondary metabolites) in Buchanan, B.B., Gruissem, W. and Jones, R.L. (Eds.) Biochemistry and Molecular Biology of Plants. American Society of Plant Physiologists, Rockville. 1250-1318.

Cukovic, D., Ehlting, J., VanZiffle, J.A. and Douglas, C.J. (2001) Structure and evolution of 4-coumarate:coenzyme A ligase (4CL) gene families. Biol Chem, 382, 645-654.

Czernic, P., Visser, B., Sun, W., Savoure, A., Deslandes, L., Marco, Y., Van Montagu, M. and Verbruggen, N. (1999) Characterization of an Arabidopsis thaliana receptor-like protein kinase gene activated by oxidative stress and pathogen attack. The Plant Journal, 18, 321-327.

de Azevedo Souza, C. and Douglas, C.J. A method for generating male sterile plants. Provisional Patent USPTO Priority Date Nov 3-2006 (UBC UILO reference 07-082).

Deluca, M. (1976) Firefly luciferase. Adv Enzymol Relat Areas Mol Biol, 44, 37-68.

Dixon, R.A. and Paiva, N.L. (1995) Stress-induced phenylpropanoid metabolism. Plant Cell, 7, 1085-1097.

Dong, J., Chen, C. and Chen, Z. (2003) Expression profiles of the Arabidopsis WRKY gene superfamily during plant defense response. Plant Mol Biol, 51, 21-37.

Douglas, C., Hoffmann, H., Schulz, W. and Hahlbrock, K. (1987) Structure and elicitor or u.v.-light-stimulated expression of two 4-coumarate:CoA ligase genes in parsley. Embo J, 6, 1189-1195.

Douglas, C.J. (1996) Phenylpropanoid metabolism and lignin biosynthesis: from weeds to trees. Trends in Plant Science, 1, 171-178.

Douglas, C.J. and Ehlting, J. (2005) Arabidopsis thaliana full genome longmer microarrays: a powerful gene discovery tool for agriculture and forestry. Transgenic Res, 14, 551-561.

Duarte, J.M., Cui, L., Wall, P.K., Zhang, Q., Zhang, X., Leebens-Mack, J., Ma, H., Altman, N. and dePamphilis, C.W. (2006) Expression pattern shifts following duplication indicative of subfunctionalization and neofunctionalization in regulatory genes of Arabidopsis. Mol Biol Evol, 23, 469-478.

Ehlting, J., Buttner, D., Wang, Q., Douglas, C.J., Somssich, I.E. and Kombrink, E. (1999) Three 4-coumarate:coenzyme A ligases in Arabidopsis thaliana represent two evolutionarily divergent classes in angiosperms. Plant J, 19, 9-20.

Ehlting, J., Mattheus, N., Aeschliman, D.S., Li, E., Hamberger, B., Cullis, I.F., Zhuang, J., Kaneda, M., Mansfield, S.D., Samuels, L., Ritland, K., Ellis, B.E., Bohlmann, J. and Douglas, C.J. (2005) Global transcript profiling of primary stems from Arabidopsis thaliana identifies candidate genes for missing links in lignin biosynthesis and transcriptional regulators of fiber differentiation. Plant J, 42, 618-640.

Ehlting, J., Provart, N.J. and Werck-Reichhart, D. (2006) Functional annotation of the Arabidopsis P450 superfamily based on large-scale co-expression analysis. Biochem Soc Trans, 34, 1192-1198.

Ehlting, J., Shin, J.J. and Douglas, C.J. (2001) Identification of 4-coumarate:coenzyme A ligase (4CL) substrate recognition domains. Plant J, 27, 455-465.

Enjuto, M., Lumbreras, V., Marin, C. and Boronat, A. (1995) Expression of the Arabidopsis HMG2 gene, encoding 3-hydroxy-3-methylglutaryl coenzyme A reductase, is restricted to meristematic and floral tissues. Plant Cell, 7, 517-527.

Facchini, P.J. and St-Pierre, B. (2005) Synthesis and trafficking of alkaloid biosynthetic enzymes. Current Opinion in Plant Biology, 8, 657-666.

Farmer, E.E., Almeras, E. and Krishnamurthy, V. (2003) Jasmonates and related oxylipins in plant responses to pathogenesis and herbivory. Curr Opin Plant Biol, 6, 372-378.

Feys, B., Benedetti, C.E., Penfold, C.N. and Turner, J.G. (1994) Arabidopsis mutants selected for resistance to the phytotoxin coronatine are male sterile, insensitive to methyl jasmonate, and resistant to a bacterial pathogen. Plant Cell, 6, 751-759.

Goldberg, R.B., Beals, T.P. and Sanders, P.M. (1993) Anther development: basic principles and practical applications. Plant Cell, 5, 1217-1229.

Guindon, S. and Gascuel, O. (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol, 52, 696-704.

Hahlbrock, K., Lamb, C.J., Purwin, C., Ebel, J., Fautz, E. and Schafer, E. (1981) Rapid response of suspension-cultured parsley cells to the elicitor from Phytophthora megasperma var. sojae: induction of the enzymes of general phenylpropanoid metabolism. Plant Physiol, 67, 768-773.

Hahlbrock, K. and Scheel, D. (1989) Physiology and molecular-biology of phenylpropanoid metabolism. Annual Review of Plant Physiology and Plant Molecular Biology, 40, 347-369.

Hamberger, B. and Hahlbrock, K. (2004) The 4-coumarate:CoA ligase gene family in Arabidopsis thaliana comprises one rare, sinapate-activating and three commonly occurring isoenzymes. Proc Natl Acad Sci USA, 101, 2209-2214.

Hamberger, B., Ellis, M., Friedmann, M., de Azevedo Souza, C., Barbazuk, B. and Douglas, C.J. (2007) Genome-wide analyses of phenylpropanoid-related genes in Populus trichocarpa, Arabidopsis thaliana, and Oryza sativa: the Populus lignin toolbox and conservation and diversification of angiosperm gene families. Can. J. Bot, in press.

Harding, S.A., Leshkevich, J., Chiang, V.L. and Tsai, C.J. (2002) Differential substrate inhibition couples kinetically distinct 4-coumarate:coenzyme A ligases with spatially distinct metabolic roles in quaking aspen. Plant Physiol, 128, 428-438.

Hauffe, K.D., Paszkowski, U., Schulze-Lefert, P., Hahlbrock, K., Dangl, J.L. and Douglas, C.J. (1991) A parsley 4CL-1 promoter fragment specifies complex expression patterns in transgenic tobacco. Plant Cell, 3, 435-443.

Hellens, R.P., Edwards, E.A., Leyland, N.R., Bean, S. and Mullineaux, P.M. (2000) pGreen: a versatile and flexible binary Ti vector for Agrobacterium-mediated plant transformation. Plant Mol Biol, 42, 819-832.

Hietala, A.M., Eikenes, M., Kvaalen, H., Solheim, H. and Fossdal, C.G. (2003) Multiplex real-time PCR for monitoring Heterobasidion annosum colonization in Norway spruce clones that differ in disease resistance. Appl Environ Microbiol, 69, 4413-4420.

Hooker, T.S., Millar, A.A. and Kunst, L. (2002) Significance of the expression of the CER6 condensing enzyme for cuticular wax production in Arabidopsis. Plant Physiol, 129, 1568-1580.

Hsu, C.-Y., Liu, Y., Luthe, D.S. and Yuceer, C. (2006) Poplar FT2 shortens the juvenile phase and promotes seasonal flowering. Plant Cell, 18, 1846-1861.

Hu, W.J., Kawaoka, A., Tsai, C.J., Lung, J., Osakabe, K., Ebinuma, H. and Chiang, V.L. (1998) Compartmentalized expression of two structurally and functionally distinct 4-coumarate:CoA ligase genes in aspen (Populus tremuloides). Proc Natl Acad Sci USA, 95, 5407-5412.

Humphreys, J.M. and Chapple, C. (2002) Rewriting the lignin roadmap. Current Opinion in Plant Biology, 5, 224-229.

International Rice Genome Sequencing Project (2005) The map-based sequence of the rice genome. Nature, 436, 793-800.

Itoh, T., Tanaka, T., Barrero, R.A., Yamasaki, C., Fujii, Y., Hilton, P.B., Antonio, B.A., Aono, H., Apweiler, R., Bruskiewich, R., Bureau, T., Burr, F., Costa de Oliveira, A., Fuks, G., Habara, T., Haberer, G., Han, B., Harada, E., Hiraki, A.T., Hirochika, H., Hoen, D., Hokari, H., Hosokawa, S., Hsing, Y.I., Ikawa, H., Ikeo, K., Imanishi, T., Ito, Y., Jaiswal, P., Kanno, M., Kawahara, Y., Kawamura, T., Kawashima, H., Khurana, J.P., Kikuchi, S., Komatsu, S., Koyanagi, K.O., Kubooka, H., Lieberherr, D., Lin, Y.C., Lonsdale, D., Matsumoto, T., Matsuya, A., McCombie, W.R., Messing, J., Miyao, A., Mulder, N., Nagamura, Y., Nam, J., Namiki, N., Numa, H., Nurimoto, S., O'Donovan, C., Ohyanagi, H., Okido, T., Oota, S., Osato, N., Palmer, L.E., Quetier, F., Raghuvanshi, S., Saichi, N., Sakai, H., Sakai, Y., Sakata, K., Sakurai, T., Sato, F., Sato, Y., Schoof, H., Seki, M., Shibata, M., Shimizu, Y., Shinozaki, K., Shinso, Y., Singh, N.K., Smith-White, B., Takeda, J., Tanino, M., Tatusova, T., Thongjuea, S., Todokoro, F., Tsugane, M., Tyagi, A.K., Vanavichit, A., Wang, A., Wing, R.A., Yamaguchi, K., Yamamoto, M., Yamamoto, N., Yu, Y., Zhang, H., Zhao, Q., Higo, K., Burr, B., Gojobori, T. and Sasaki, T. (2007) Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana. Genome Res, 17, 175-183.

Jansson, S. and Douglas, C.J. (2007) Populus: a model system for plant biology. Annu Rev Plant Biol, 58, 435-458.

Jefferson, R.A., Kavanagh, T.A. and Bevan, M.W. (1987) GUS fusions: beta-glucuronidase as a sensitive and versatile gene fusion marker in higher plants. Embo J, 6, 3901-3907.

Jensen, A.B., Raventos, D. and Mundy, J. (2002) Fusion genetic analysis of jasmonate-signalling mutants in Arabidopsis. Plant J, 29, 595-606.

Ke, J., Behal, R.H., Back, S.L., Nikolau, B.J., Wurtele, E.S. and Oliver, D.J. (2000) The role of pyruvate dehydrogenase and acetyl-coenzyme A synthetase in fatty acid synthesis in developing Arabidopsis seeds. Plant Physiol, 123, 497-508.

Koo, A.J., Chung, H.S., Kobayashi, Y. and Howe, G.A. (2006) Identification of a peroxisomal acyl-activating enzyme involved in the biosynthesis of jasmonic acid in Arabidopsis. J Biol Chem, 281, 33511-33520.

Koopmann, E., Logemann, E. and Hahlbrock, K. (1999) Regulation and functional expression of cinnamate 4-hydroxylase from parsley. Plant Physiol, 119, 49-56.

Kunst, L., Klenz, J.E., Martinez-Zapater, J. and Haughn, G.W. (1989) AP2 gene determines the identity of perianth organs in flowers of Arabidopsis thaliana. Plant Cell, 1, 1195-1208.

Kurata, T., Kawabata-Awai, C., Sakuradani, E., Shimizu, S., Okada, K. and Wada, T. (2003) The YORE-YORE gene regulates multiple aspects of epidermal cell differentiation in Arabidopsis. Plant J, 36, 55-66.

Lauvergeat, V., Lacomme, C., Lacombe, E., Lasserre, E., Roby, D. and Grima-Pettenati, J. (2001) Two cinnamoyl-CoA reductase (CCR) genes from Arabidopsis thaliana are differentially expressed during development and in response to infection with pathogenic bacteria. Phytochemistry, 57, 1187-1195.

Lee, D. and Douglas, C.J. (1996) Two divergent members of a tobacco 4-coumarate:coenzyme A ligase (4CL) gene family. cDNA structure, gene inheritance and expression, and properties of recombinant proteins. Plant Physiol, 112, 193-205.

Li, C., Schilmiller, A.L., Liu, G., Lee, G.I., Jayanty, S., Sageman, C., Vrebalov, J., Giovannoni, J.J., Yagi, K., Kobayashi, Y. and Howe, G.A. (2005) Role of beta-oxidation in jasmonate biosynthesis and systemic wound signaling in tomato. Plant Cell, 17, 971-986.

Li, L., Zhao, Y., McCaig, B.C., Wingerd, B.A., Wang, J., Whalon, M.E., Pichersky, E. and Howe, G.A. (2004) The tomato homolog of CORONATINE-INSENSITIVE1 is required for the maternal control of seed maturation, jasmonate-signaled defense responses, and glandular trichome development. Plant Cell, 16, 126-143.

Liechti, R. and Farmer, E.E. (2002) The jasmonate pathway. Science, 296, 1649-1650.

Lindermayr, C., Mollers, B., Fliegmann, J., Uhlmann, A., Lottspeich, F., Meimberg, H. and Ebel, J. (2002) Divergent members of a soybean (Glycine max L.) 4-coumarate:coenzyme A ligase gene family. Eur J Biochem, 269, 1304-1315.

Lozoya, E., Hoffmann, H., Douglas, C., Schulz, W., Scheel, D. and Hahlbrock, K. (1988) Primary structures and catalytic properties of isoenzymes encoded by the two 4-coumarate:CoA ligase genes in parsley. Eur J Biochem, 176, 661-667.

Ma, H. (2005) Molecular genetic analyses of microsporogenesis and microgametogenesis in flowering plants. Annu Rev Plant Biol, 56, 393-434.

Martin, D., Tholl, D., Gershenzon, J. and Bohlmann, J. (2002) Methyl jasmonate induces traumatic resin ducts, terpenoid resin biosynthesis, and terpenoid accumulation in developing xylem of Norway spruce stems. Plant Physiol, 129, 1003-1018.

Matsuda, N., Tsuchiya, T., Kishitani, S., Tanaka, Y. and Toriyama, K. (1996) Partial male sterility in transgenic tobacco carrying antisense and sense PAL cDNA under the control of a tapetum-specific promoter. Plant and Cell Physiology, 37, 215-222.

Meyer, S., Nowak, K., Sharma, V.K., Schulze, J., Mendel, R.R. and Hansch, R. (2004) Vectors for RNAi technology in poplar. Plant Biology, 6, 100-103.

Moore, R.C. and Purugganan, M.D. (2005) The evolutionary dynamics of plant duplicate genes. Current Opinion in Plant Biology, 8, 122-128.

Morant, M., Jorgensen, K., Schaller, H., Pinot, F., Moller, B.L., Werck-Reichhart, D. and Bak, S. (2007) CYP703 is an ancient cytochrome P450 in land plants catalyzing in-chain hydroxylation of lauric acid to provide building blocks for sporopollenin synthesis in pollen. Plant Cell, tpc.106.045948.

Nielsen, R. (2006) Comparative genomics: difference of expression. Nature, 440, 161.

Nishiyama, T., Fujita, T., Shin, I.T., Seki, M., Nishide, H., Uchiyama, I., Kamiya, A., Carninci, P., Hayashizaki, Y., Shinozaki, K., Kohara, Y. and Hasebe, M. (2003) Comparative genomics of Physcomitrella patens gametophytic transcriptome and Arabidopsis thaliana: implication for land plant evolution. Proc Natl Acad Sci USA, 100, 8007-8012.

Noel, J.P., Austin, M.B. and Bomati, E.K. (2005) Structure-function relationships in plant phenylpropanoid biosynthesis. Curr Opin Plant Biol, 8, 249-253.

Nyathi, Y. and Baker, A. (2006) Plant peroxisomes as a source of signalling molecules. Biochim Biophys Acta, 1763, 1478-1495.

Paterson, A.H. (2006) Leafing through the genomes of our major crop plants: strategies for capturing unique information. Nat Rev Genet, 7, 174-184.

Paxson-Sowders, D.M., Dodrill, C.H., Owen, H.A. and Makaroff, C.A. (2001) DEX1, a novel plant protein, is required for exine pattern formation during pollen development in Arabidopsis. Plant Physiology, 127, 1739-1749.

Persson, S., Wei, H., Milne, J., Page, G.P. and Somerville, C.R. (2005) Identification of genes required for cellulose synthesis by regression analysis of public microarray data sets. Proc Natl Acad Sci USA, 102, 8633-8638.

Pighin, J.A., Zheng, H., Balakshin, L.J., Goodman, I.P., Western, T.L., Jetter, R., Kunst, L. and Samuels, A.L. (2004) Plant cuticular lipid export requires an ABC transporter. Science, 306, 702-704.

Raes, J., Rohde, A., Christensen, J.H., Van de Peer, Y. and Boerjan, W. (2003) Genome-wide characterization of the lignification toolbox in Arabidopsis. Plant Physiol, 133, 1051-1071.

Ralph, S., Oddy, C., Cooper, D., Yueh, H., Jancsik, S., Kolosova, N., Philippe, R.N., Aeschliman, D., White, R., Huber, D., Ritland, C.E., Benoit, F., Rigby, T., Nantel, A., Butterfield, Y.S.N., Kirkpatrick, R., Chun, E., Liu, J., Palmquist, D., Wynhoven, B., Stott, J., Yang, G., Barber, S., Holt, R.A., Siddiqui, A., Jones, S.J.M., Marra, M.A., Ellis, B.E., Douglas, C.J., Ritland, K. and Bohlmann, J. (2006) Genomics of hybrid poplar (Populus trichocarpa x deltoides) interacting with forest tent caterpillars (Malacosoma disstria): normalized and full-length cDNA libraries, expressed sequence tags (ESTs), and a cDNA microarray for the study of insect-induced defenses in poplar. Molecular Ecology, 15, 1275-1297.

Raven, P.H., Evert, R.F. and Curtis, H. (1996) Biology of Plants, 5th edn. New York, NY: Worth Publishers.

Reumann, S. (2004) Specification of the peroxisome targeting signals type 1 and type 2 of plant peroxisomes by bioinformatics analyses. Plant Physiol, 135, 783-800.

Reumann, S., Ma, C., Lemke, S. and Babujee, L. (2004) AraPerox. A database of putative Arabidopsis proteins from plant peroxisomes. Plant Physiol, 136, 2587-2608.

Ritter, H. and Schulz, G.E. (2004) Structural basis for the entrance into the phenylpropanoid metabolism catalyzed by phenylalanine ammonia-lyase. Plant Cell, 16, 3426-3436.

Rogers, L.A. and Campbell, M.M. (2004) The genetic control of lignin deposition during plant growth and development. New Phytologist, 164, 17-30.

Sablowski, R.W., Moyano, E., Culianez-Macia, F.A., Schuch, W., Martin, C. and Bevan, M. (1994) A flower-specific Myb protein activates transcription of phenylpropanoid biosynthetic genes. Embo J, 13, 128-137.

Sanchez-Fernandez, R., Davies, T.G., Coleman, J.O. and Rea, P.A. (2001) The Arabidopsis thaliana ABC protein superfamily, a complete inventory. J Biol Chem, 276, 30231-30244.

Sanders, P.M., Bui, A.Q., Weterings, K., McIntire, K.N., Hsu, Y.-C., Lee, P.Y., Truong, M.T., Beals, T.P. and Goldberg, R.B. (1999) Anther developmental defects in Arabidopsis thaliana male-sterile mutants. Sexual Plant Reproduction, 11, 297-322.

Sasaki-Sekimoto, Y., Taki, N., Obayashi, T., Aono, M.
, Matsumoto, F., Sakurai, N., Suzuki, H., Hirai, M.Y., Noji, M., Saito, K., Masuda, T., Takamiya, K., Shibata, D. and Ohta, H. (2005) Coordinated activation of metabolic pathways for antioxidants and defence compounds by jasmonates and their roles in stress tolerance in Arabidopsis. Plant J, 44, 653-668. Schaller, F., Schaller, A. and Stintzi, A. (2004) Biosynthesis and metabolism of jasmonates. Journal of Plant Growth Regulation, 23, 179-199. Schilmiller, A.L., Koo, A.J. and Howe, G.A. (2007) Functional diversification of acylcoenzyme a oxidases in jasmonic acid biosynthesis and action. Plant Physiol, 143, 812824. Schneider, K., Hovel, K., Witzel, K., Hamberger, B., Schomburg, D., Kombrink, E. and Stuible, H.P. (2003) The substrate specificity-determining amino acid code of 4coumarate:CoA ligase. Proc Natl Acad Sci U S A , 100, 8601-8606. Schneider, K., Kienow, L., Schmelzer, E., Colby, T., Bartsch, M . , Miersch, O., Wasternack, C , Kombrink, E . and Stuible, H.P. (2005) A new type of peroxisomal acyl-coenzyme A synthetase from Arabidopsis thaliana has the catalytic capacity to activate biosynthetic precursors of jasmonic acid. J Biol Chem, 280, 13962-13972. Scott, R.J., Spielman, M . and Dickinson, H.G. (2004) Stamen structure and function. Plant Cell, 16 Suppl, S46-60.  136  Shockey, J.M., Fulda, M.S. and Browse, J. (2003) Arabidopsis contains a large superfamily of acyl-activating enzymes. Phylogenetic and biochemical analysis reveals a new class of acyl-coenzyme a synthetases. Plant Physiol, 132, 1065-1076. Shockey, J.M., Fulda, M.S. and Browse, J.A. (2002) Arabidopsis contains nine longchain acyl-coenzyme a synthetase genes that participate in fatty acid and glycerolipid metabolism. Plant Physiol, 129, 1710-1722. Soltani, B.M., Ehlting, J. and Douglas, C.J. (2006) Genetic analysis and epigenetic silencing of A t 4 C L l and At4CL2 expression in transgenic Arabidopsis. Biotechnol J, 1, 1124-1136. Sommer, J.M., Cheng, Q.L., Keller, G.A. and Wang, C C . 
(1992) In vivo import of firefly luciferase into the glycosomes of Trypanosoma brucei and mutational analysis of the C-terminal targeting signal. Mol Biol Cell, 3, 749-759. Staswick, P.E., Su, W. and Howell, S.H. (1992) Methyl jasmonate inhibition of root growth and induction of a leaf protein are decreased in an Arabidopsis thaliana mutant. Proc Natl Acad Sci U S A , 89, 6837-6840. Staswick, P.E., Tiryaki, I. and Rowe, M.L. (2002) Jasmonate response locus JAR1 and several related Arabidopsis genes encode enzymes of the firefly luciferase superfamily that show activity on jasmonic, salicylic, and indole-3-acetic acids in an assay for adenylation. Plant Cell, 14, 1405-1415. Stenzel, I., Hause, B., Miersch, O., Kurz, T., Maucher, H., Weichert, H., Ziegler, J., Feussner, I. and Wasternack, C. (2003) Jasmonate biosynthesis and the allene oxide cyclase family of Arabidopsis thaliana. Plant Mol Biol, 51, 895-911. Steppuhn, A., Gase, K., Krock, B., Halitschke, R. and Baldwin, I.T. (2004) Nicotine's defensive function in nature. PLoS Biology, 2, e217. Stintzi, A . , Weber, H., Reymond, P., Browse, J. and Farmer, E.E. (2001) Plant defense in the absence of jasmonic acid: the role of cyclopentenones. Proc Natl Acad Sci U S A , 98, 12837-12842. Stuible, H.P. and Kombrink, E. (2001) Identification of the substrate specificityconferring amino acid residues of 4-coumarate:coenzyme A ligase allows the rational design of mutant enzymes with new catalytic properties. J Biol Chem, 276, 26893-26897. Taylor, P.E., Glover, J.A., Lavithis, M., Craig, S., Singh, M.B., Knox, R.B., Dennis, E.S. and Chaudhury, A . M . (1998) Genetic control of male fertility in Arabidopsis thaliana: structural analyses of postmeiotic developmental mutants. Planta, 205, 492-505.  137  The Arabidopsis Genome Initiative, A. (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature, 408, 796-815. Toufighi, K., Brady, S.M., Austin, R., Ly, E . and Provart, N J . 
(2005) The Botany Array Resource: e-Northerns, Expression Angling, and promoter analyses. Plant J, 43, 153-163. Tsai, C.J., Harding, S.A., Tschaplinski, T.J., Lindroth, R.L. and Yuan, Y. (2006) Genome-wide analysis of the structural genes regulating defense phenylpropanoid metabolism in Populus. New Phytol, 172, 47-62. Turner, G., Gershenzon, J., Nielson, E.E., Froehlich, J.E. and Croteau, R. (1999) Limonene synthase, the enzyme responsible for monoterpene biosynthesis in peppermint, is localized to leucoplasts of oil gland secretory cells. Plant Physiol., 120, 879-886. Turner, J.E., Greville, K., Murphy, E.C. and Hooks, M.A. (2005) Characterization of Arabidopsis fluoroacetate-resistant mutants reveals the principal mechanism of acetate activation for entry into the glyoxylate cycle. J. Biol. Chem., 280, 2780-2787. Tuskan, G.A., Difazio, S., Jansson, S., Bohlmann, J., Grigoriev, I., Hellsten, U., Putnam, N., Ralph, S., Rombauts, S., Salamov, A., Schein, J., Sterck, L., Aerts, A., Bhalerao, R.R., Bhalerao, R.P., Blaudez, D., Boerjan, W., Brun, A., Brunner, A., Busov, V., Campbell, M., Carlson, J., Chalot, M., Chapman, J., Chen, G.L., Cooper, D., Coutinho, P.M., Couturier, J., Covert, S., Cronk, Q., Cunningham, R., Davis, J., Degroeve, S., Dejardin, A., Depamphilis, C , Detter, J., Dirks, B., Dubchak, I., Duplessis, S., Ehlting, J., Ellis, B., Gendler, K., Goodstein, D., Gribskov, M . , Grimwood, J., Groover, A., Gunter, L., Hamberger, B., Heinze, B., Helariutta, Y., Henrissat, B., Holligan, D., Holt, R., Huang, W., Islam-Faridi, N., Jones, S., JonesRhoades, M . , Jorgensen, R., Joshi, C , Kangasjarvi, J., Karlsson, J., Kelleher, C , Kirkpatrick, R., Kirst, M . , Kohler, A., Kalluri, U., Larimer, F., Leebens-Mack, J., Leple, J . 
C , Locascio, P., Lou, Y., Lucas, S., Martin, F., Montanini, B., Napoli, C , Nelson, D.R., Nelson, C , Nieminen, K., Nilsson, O., Pereda, V., Peter, G., Philippe, R., Pilate, G., Poliakov, A., Razumovskaya, J., Richardson, P., Rinaldi, C , Ritland, K., Rouze, P., Ryaboy, D., Schmutz, J., Schrader, J., Segerman, B., Shin, H., Siddiqui, A., Sterky, F., Terry, A., Tsai, C J . , Uberbacher, E., Unneberg, P., Vahala, J., Wall, K., Wessler, S., Yang, G., Yin, T., Douglas, C , Marra, M., Sandberg, G., Van de Peer, Y. and Rokhsar, D. (2006) The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science, 313, 1596-1604. V. Batagelj,., A.M. and Springer, B. (2003) Pajek - Analysis and visualization of large networks in Jlinger, M . , Mutzel, P., (Eds.) Graph Drawing Software. Springer, Berlin. 77103. van der Meer, I.M., Stam, M.E., van Tunen, A.J., Mol, J.N.M. and Stuitje, A.R. (1992) Antisense inhibition of flavonoid biosynthesis in petunia anthers results in male sterility. Plant Cell, 4, 253-262.  138  Varbanova, M.P., Atanassov, A.L and Atanassov, I.I. (2003) Anther-specific coumarate CoA ligase-like gene from Nicotiana sylvestris expressed during uninucleate microspore development. Plant Science, 164, 525-530. Vizcay-Barrena, G. and Wilson, Z.A. (2006) Altered tapetal P C D and pollen wall development in the Arabidopsis msi mutant. J Exp Bot, 57, 2709-2717. Walden, A.R., Walter, C. and Gardner, R.C. (1999) Genes expressed in Pinus radiata male cones include homologs to anther-specific and pathogenesis response genes. Plant Physiol, 121, 1103-1116. Waterhouse, P.M. and Helliwell, C A . (2003) Exploring plant genomes by R N A induced gene silencing. Nat Rev Genet, 4, 29-38. Yau, C P . , Zhuang, C X . , Zee, S.Y. and Yip, W.K. (2005) Expression of a microsporocyte-specific gene encoding dihydroflavonol 4-reductase-like protein is developmentally regulated during early microsporogenesis in rice. Sexual Plant Reproduction, 18, 65-74. Ylstra, B., Muskens, M . 
APPENDIX 1

Genes co-expressed with ACLLs in Arabidopsis public microarray experiment datasets (

CLADE A

all data v3 (1388)

0.908  At4g14080  glycosyl hydrolase family 17 protein / anther-specific protein (A6) identical to probable glucan endo-1,3-beta-glucosidase A6...
0.905  At4g20420  tapetum-specific protein-related similar to SaTAP 35 [Sinapis alba] GI:408108
0.891  At5g07230  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein identical to tapetum-specific protein A9...
0.879  At3g07450  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein similar to cysteine-rich 5B protein - Lycopersicon...
0.878  At4g34850  chalcone and stilbene synthase family protein similar to chalcone synthase homolog PrChS1, Pinus radiata, gb:U90341; similar to...
0.876  At3g42960  alcohol dehydrogenase (ATA1) identical to alcohol dehydrogenase (ATA1) GI:2501781 from [Arabidopsis thaliana]
0.873  At5g62080  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein similar to tapetum-specific protein a9 precursor...
0.868  At3g52130  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein similar to cysteine-rich 5B protein - Lycopersicon...
0.863  At1g01280  cytochrome P450 family protein similar to cytochrome P450 GB:BAA92894 GI:7339658 from [Petunia hybrida]
0.862  At5g16920  expressed protein
0.861  At3g11980  male sterility protein 2 (MS2) identical to male sterility protein 2 (MS2) SP:Q08891 (Arabidopsis thaliana)
0.854  At1g69500  cytochrome P450 family protein similar to Cytochrome P450 86A2 (SP:O23066) [Arabidopsis thaliana] contains Pfam profile:...
0.851  At1g61070  plant defensin-fusion protein, putative (PDF2.4) plant defensin protein family member, personal communication, Bart Thomma...
0.84   At3g13220  ABC transporter family protein contains Pfam profile: PF00005 ABC transporter; similar to white protein GB:Q27256 [Anopheles...
0.821  At3g23770  glycosyl hydrolase family 17 protein similar to A6 anther-specific protein SP:Q06915 [Arabidopsis thaliana]
0.788  At2g42940  DNA-binding family protein contains a AT hook motif (DNA binding motifs with a preference for A/T rich regions), Pfam:PF02178
0.781  At1g02813  expressed protein contains Pfam profile PF04398: Protein of unknown function, DUF538
0.773  At4g29980  expressed protein
0.771  At2g16910  basic helix-loop-helix (bHLH) family protein
0.77   At1g02050  chalcone and stilbene synthase family protein Similar to rice chalcone synthase homolog, gp|U90341|2507617 and anther specific...
0.77   At1g33430  galactosyltransferase family protein contains Pfam profile: PF01762 galactosyltransferase
0.763  At1g20150  subtilase family protein similar to subtilisin-type protease precursor GI:14150446 from [Glycine max]
0.742  At5g52160  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein contains Pfam protease inhibitor/seed storage/LTP...
0.739  At3g57620  glyoxal oxidase-related contains similarity to glyoxal oxidase precursor [Phanerochaete chrysosporium] gi|1050302|gb|AAA87594
0.727  At4g28395  lipid transfer protein, putative identical to anther-specific gene ATA7 [gi:2746339]; contains Pfam protease inhibitor/seed...
0.717  At4g12920  aspartyl protease family protein low similarity to CND41, chloroplast nucleoid DNA binding protein [Nicotiana tabacum]...
0.713  At1g75790  multi-copper oxidase type I family protein contains Pfam profile: PF00394 Multicopper oxidase
0.707  At1g06170  basic helix-loop-helix (bHLH) family protein contains Pfam profile:PF00010 helix-loop-helix DNA-binding domain
0.702  At3g52160  beta-ketoacyl-CoA synthase family protein beta-ketoacyl-CoA synthase - Simmondsia chinensis,PID:g1045614
0.685  At1g03390  transferase family protein similar to anthranilate N-hydroxycinnamoyl/benzoyltransferase from Dianthus caryophyllus...
0.68   At5g13380  auxin-responsive GH3 family protein similar to auxin-responsive GH3 product [Glycine max] GI:18591; contains Pfam profile...
0.678  At1g67990  caffeoyl-CoA 3-O-methyltransferase, putative similar to GI:2960356 [Populus balsamifera subsp. trichocarpa], GI:684942...
0.669  At5g24820  aspartyl protease family protein low similarity to CND41, chloroplast nucleoid DNA binding protein [Nicotiana tabacum]...
0.655  At5g48210  expressed protein
0.652  At1g71160  beta-ketoacyl-CoA synthase family protein similar to fatty acid elongase 3-ketoacyl-CoA synthase 1 GB:AAC99312, very-long-chain...
0.651  At1g30020  expressed protein contains Pfam profile PF04398: Protein of unknown function, DUF538
0.65   At1g13140  cytochrome P450 family protein similar to Cytochrome P450 86A2 (SP:O23066) [Arabidopsis thaliana]; contains Pfam PF|00067...
0.646  At4g35420  dihydroflavonol 4-reductase family / dihydrokaempferol 4-reductase family similar to dihydroflavonol 4-reductase (Rosa hybrid...
0.637  At4g14815  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein contains Pfam protease inhibitor/seed storage/LTP...
0.629  At5g60500  undecaprenyl pyrophosphate synthetase family protein / UPP synthetase family protein contains putative undecaprenyl diphosphate...
0.625  At5g62320  myb family transcription factor (MYB99) contains PFAM profile: myb DNA binding domain PF00249
0.615  At1g66850  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein similar to GP|3062791 Lipid transfer protein...
0.615  At2g19070  transferase family protein similar to anthranilate N-hydroxycinnamoyl/benzoyltransferase from Dianthus caryophyllus...
0.61   At1g74540  cytochrome P450, putative similar to cytochrome P450 GB:048922 [Glycine max]; contains Pfam profile: PF00067 cytochrome P450
0.61   At3g51590  lipid transfer protein, putative similar to lipid transfer protein E2 precursor, Brassica napus, PIR:T07984 [GI:899224];...
0.61   At5g49070  beta-ketoacyl-CoA synthase family protein similar to very-long-chain fatty acid condensing enzyme CUT1 [GI:5001734],...
0.608  At1g22015  galactosyltransferase family protein contains Pfam profile: PF01762 galactosyltransferase
0.6    At4g29250  transferase family protein low similarity to CER2 Arabidopsis thaliana GI:1213594, anthocyanin 5-aromatic acyltransferase...

ArathACLL5 (At1g62940)

tissue and development (237 data)

0.986  At4g34850  chalcone and stilbene synthase family protein similar to chalcone synthase homolog PrChS1, Pinus radiata, gb:U90341; similar to...
0.98   At3g42960  alcohol dehydrogenase (ATA1) identical to alcohol dehydrogenase (ATA1) GI:2501781 from [Arabidopsis thaliana]
0.975  At1g01280  cytochrome P450 family protein similar to cytochrome P450 GB:BAA92894 GI:7339658 from [Petunia hybrida]
0.975  At4g14080  glycosyl hydrolase family 17 protein / anther-specific protein (A6) identical to probable glucan endo-1,3-beta-glucosidase A6...
0.974  At4g20420  tapetum-specific protein-related similar to SaTAP 35 [Sinapis alba] GI:408108
0.966  At3g11980  male sterility protein 2 (MS2) identical to male sterility protein 2 (MS2) SP:Q08891 (Arabidopsis thaliana)
0.966  At5g16920  expressed protein
0.964  At3g57620  glyoxal oxidase-related contains similarity to glyoxal oxidase precursor [Phanerochaete chrysosporium] gi|1050302|gb|AAA87594
0.959  At3g23770  glycosyl hydrolase family 17 protein similar to A6 anther-specific protein SP:Q06915 [Arabidopsis thaliana]
0.958  At5g07230  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein identical to tapetum-specific protein A9...
0.953  At3g13220  ABC transporter family protein contains Pfam profile: PF00005 ABC transporter; similar to white protein GB:Q27256 [Anopheles...
0.951  At1g61070  plant defensin-fusion protein, putative (PDF2.4) plant defensin protein family member, personal communication, Bart Thomma...
0.95   At1g02813  expressed protein contains Pfam profile PF04398: Protein of unknown function, DUF538
0.946  At1g20150  subtilase family protein similar to subtilisin-type protease precursor GI:14150446 from [Glycine max]
0.944  At5g62080  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein similar to tapetum-specific protein a9 precursor...
0.942  At3g07450  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein similar to cysteine-rich 5B protein - Lycopersicon...
0.939  At3g52130  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein similar to cysteine-rich 5B protein - Lycopersicon...
0.937  At1g02050  chalcone and stilbene synthase family protein Similar to rice chalcone synthase homolog, gp|U90341|2507617 and anther specific...
0.934  At2g42940  DNA-binding family protein contains a AT hook motif (DNA binding motifs with a preference for A/T rich regions), Pfam:PF02178
0.928  At5g62320  myb family transcription factor (MYB99) contains PFAM profile: myb DNA binding domain PF00249
0.924  At1g69500  cytochrome P450 family protein similar to Cytochrome P450 86A2 (SP:O23066) [Arabidopsis thaliana] contains Pfam profile:...
0.92   At4g29980  expressed protein
0.914  At4g29250  transferase family protein low similarity to CER2 Arabidopsis thaliana GI:1213594, anthocyanin 5-aromatic acyltransferase...
0.91   At3g06100  major intrinsic family protein / MIP family protein contains Pfam profile: PF00230 major intrinsic protein; contains...
0.91   At4g35420  dihydroflavonol 4-reductase family / dihydrokaempferol 4-reductase family similar to dihydroflavonol 4-reductase (Rosa hybrid...
0.909  At2g16910  basic helix-loop-helix (bHLH) family protein
0.905  At4g28395  lipid transfer protein, putative identical to anther-specific gene ATA7 [gi:2746339]; contains Pfam protease inhibitor/seed...
0.905  At5g52160  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein contains Pfam protease inhibitor/seed storage/LTP...
0.89   At5g13380  auxin-responsive GH3 family protein similar to auxin-responsive GH3 product [Glycine max] GI:18591; contains Pfam profile...
0.883  At3g52160  beta-ketoacyl-CoA synthase family protein beta-ketoacyl-CoA synthase - Simmondsia chinensis,PID:g1045614
0.882  At3g50580  proline-rich family protein contains proline-rich extensin domains, INTERPRO:IPR002965
0.874  At1g67990  caffeoyl-CoA 3-O-methyltransferase, putative similar to GI:2960356 [Populus balsamifera subsp. trichocarpa], GI:684942...
0.867  At1g06170  basic helix-loop-helix (bHLH) family protein contains Pfam profile:PF00010 helix-loop-helix DNA-binding domain
0.866  At1g74140  hypothetical protein
0.861  At5g24820  aspartyl protease family protein low similarity to CND41, chloroplast nucleoid DNA binding protein [Nicotiana tabacum]...
0.857  At5g49070  beta-ketoacyl-CoA synthase family protein similar to very-long-chain fatty acid condensing enzyme CUT1 [GI:5001734],...
0.85   At4g12920  aspartyl protease family protein low similarity to CND41, chloroplast nucleoid DNA binding protein [Nicotiana tabacum]...
0.849  At1g30020  expressed protein contains Pfam profile PF04398: Protein of unknown function, DUF538
0.844  At5g16960  NADP-dependent oxidoreductase, putative similar to probable NADP-dependent oxidoreductase (zeta-crystallin homolog) PI...
0.842  At5g60090  protein kinase family protein contains protein kinase domain, Pfam:PF00069
0.836  At5g48210  expressed protein
0.834  At1g03390  transferase family protein similar to anthranilate N-hydroxycinnamoyl/benzoyltransferase from Dianthus caryophyllus...
0.829  At1g75790  multi-copper oxidase type I family protein contains Pfam profile: PF00394 Multicopper oxidase
0.827  At1g33430  galactosyltransferase family protein contains Pfam profile: PF01762 galactosyltransferase
0.827  At2g19070  transferase family protein similar to anthranilate N-hydroxycinnamoyl/benzoyltransferase from Dianthus caryophyllus...
0.826  At4g34210  E3 ubiquitin ligase SCF complex subunit SKP1/ASK1 (At11), putative E3 ubiquitin ligase; similar to Skp1 homolog Skp1a...
0.818  At5g43340  inorganic phosphate transporter identical to inorganic phosphate transporter [Arabidopsis thaliana] GI:3869190
0.817  At5g41890  GDSL-motif lipase/hydrolase family protein similar to family II lipase EXL3 (GI:15054386), EXL1 (GI:15054382), EXL2...
0.815  At1g68540  oxidoreductase family protein similar to cinnamoyl CoA reductase [Eucalyptus gunnii, gi:2058311], cinnamyl-alcohol...
0.812  At2g31210  basic helix-loop-helix (bHLH) family protein contains Pfam profile: PF00010 helix-loop-helix DNA-binding domain; PMID: 12679534
0.812  At4g22080  pectate lyase family protein similar to pectate lyase 2 GP:6606534 from [Musa acuminata]
0.812  At5g40940  hypothetical protein
0.81   At5g61110  hypothetical protein
0.805  At4g30040  aspartyl protease family contains Pfam domain, PF00026: eukaryotic aspartyl protease
0.803  At3g58290  meprin and TRAF homology domain-containing protein / MATH domain-containing protein similar to ubiquitin-specific protease 12...
0.802  At5g17200  glycoside hydrolase family 28 protein / polygalacturonase (pectinase) family protein similar to polygalacturonase [Lycopersicon...
0.799  At1g71160  beta-ketoacyl-CoA synthase family protein similar to fatty acid elongase 3-ketoacyl-CoA synthase 1 GB:AAC99312, very-long-chain...
0.796  At5g43110  pumilio/Puf RNA-binding domain-containing protein, contains similarity to RNA-binding protein
0.794  At1g13140  cytochrome P450 family protein similar to Cytochrome P450 86A2 (SP:O23066) [Arabidopsis thaliana]; contains Pfam PF|00067...
0.79   At5g60500  undecaprenyl pyrophosphate synthetase family protein / UPP synthetase family protein contains putative undecaprenyl diphosphate...
0.787  At4g14815  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein contains Pfam protease inhibitor/seed storage/LTP...
0.785  At1g22015  galactosyltransferase family protein contains Pfam profile: PF01762 galactosyltransferase
0.785  At1g36150  protease inhibitor/seed storage/lipid transfer protein (LTP) family protein low similarity to glucoamylase S1/S2 [Precursor]...
0.782  At1g64030  serpin family protein / serine protease inhibitor family protein similar to phloem serpin-1 [Cucurbita maxima] GI:9937311,...
0.78   At5g65205  short-chain dehydrogenase/reductase (SDR) family protein contains INTERPRO family IPR002198 short chain dehydrogenase/reductase.,
0.771  At1g22090  expressed protein contains Pfam profile PF04776: Protein of unknown function (DUF626)
0.771  At3g51590  lipid transfer protein, putative similar to lipid transfer protein E2 precursor, Brassica napus, PIR:T07984 [GI:899224];...
0.769  At1g74540  cytochrome P450, putative similar to cytochrome P450 GB:048922 [Glycine max]; contains Pfam profile: PF00067 cytochrome P450
0.767  At1g79780  integral membrane protein, putative contains 1 transmembrane domain; contains plant integral membrane protein domain,...
0.762  At1g08065  carbonic anhydrase family protein similar to storage protein (dioscorin) [Dioscorea cayenensis] GI:433463; contains Pfam...
0.758  At5g14980  esterase/lipase/thioesterase family protein low similarity to monoglyceride lipase from [Homo sapiens] GI:14594904, [Mus...
0.757  At3g63100  glycine-rich protein
0.756  At2g03170  E3 ubiquitin ligase SCF complex subunit SKP1/ASK1 (At14), putative E3 ubiquitin ligase; similar to Skp1 homolog Skp1b...
0.75   At4g27330  sporocyteless (SPL) identical to sporocyteless SPL (MADS-box related protein) [Arabidopsis thaliana] gi|5566240|gb|AAD45344
0.747  At3g15870  fatty acid desaturase family protein similar to delta 9 acyl-lipid desaturase (ADS1) GI:2970034 from [Arabidopsis thaliana]
0.746  At3g23840  transferase family protein low similarity to hypersensitivity-related gene [Nicotiana tabacum] GI:1171577,...
0.746  At3g28470  myb family transcription factor (MYB35) similar to Atmyb103 GB:AAD40692 from [Arabidopsis thaliana]; contains PFAM profile: myb...
0.745  At1g56360  calcineurin-like phosphoesterase family protein contains Pfam profile: PF00149 calcineurin-like phosphoesterase
0.745  At5g17340  expressed protein weak similarity to M3.4 protein [Brassica napus] GI:4574746
0.744  At5g60080  protein kinase family protein contains protein kinase domain, Pfam:PF00069
0.742  At1g23810  paired amphipathic helix repeat-containing protein low similarity to transcriptional repressor SIN3B [Mus musculus] GI:2921547;...
0.742  At4g28580  magnesium transporter CorA-like family protein (MRS2-6) weak similarity to SP|Q01926 RNA splicing protein MRS2, mitochondrial...
0.741  At1g68875  expressed protein
0.733  At4g36350  calcineurin-like phosphoesterase family protein contains Pfam profile: PF00149 calcineurin-like phosphoesterase
0.733  At5g41090  no apical meristem (NAM) family protein contains Pfam PF02365: No apical meristem (NAM) domain; similar to unknown protein...
0.732  At1g44222  hypothetical protein
0.726  At1g07340  hexose transporter, putative similar to hexose transporter [Lycopersicon esculentum] GI:5734440; contains Pfam profile PF00083:...
0.724  At5g17830  hypothetical protein contains Pfam domain, PF04515: Protein of unknown function, DUF580
0.719  At4g33870  peroxidase, putative similar to peroxidase [Spinacia oleracea] gi|1781334|emb|CAA71494
0.716  At1g75030  pathogenesis-related thaumatin family protein identical to thaumatin-like protein [Arabidopsis thaliana] GI:2435406; contains...
0.713  At2g03740  late embryogenesis abundant domain-containing protein / LEA domain-containing protein similar to cold-regulated gene cor15b...
0.711  At1g23700  protein kinase family protein contains protein kinase domain, Pfam:PF00069
0.707  At1g28375  expressed protein
0.706  At1g75940  glycosyl hydrolase family 1 protein / anther-specific protein ATA27 contains Pfam PF00232: Glycosyl hydrolase family 1 domain;...
0.705  At5g53510  oligopeptide transporter OPT family protein similar to SP|P40900 Sexual differentiation process protein isp4...
0.704  At1g61630  equilibrative nucleoside transporter, putative (ENT7) identical to putative equilibrative nucleoside transporter ENT7...
0.702  At4g24890  calcineurin-like phosphoesterase family protein contains Pfam profile: PF00149 calcineurin-like phosphoesterase
0.701  At1g73050  (R)-mandelonitrile lyase, putative / (R)-oxynitrilase, putative similar to mandelonitrile lyase from Prunus serotina...
0.7    At1g48940  plastocyanin-like domain-containing protein

CLADE B

ArathACLL6 (At4g05160)

0.877 At1g03950 0.873 At3g48990 0.871 At2g23430 0.861 At1g21000 0.86 At3g51580 0.86 At4g24220 0.859 At1g53560 0.858 At4g17170 0.857 At1g53570 0.855 At1g72800 0.855 At4g20260 0.854 At4g36760 0.853 At1g70810 0.852 At3g03890 0.848 At5g07220 0.848 At5g56180 0.847 At5g53330 0.846 At1g26920 0.844 At3g11930 0.842 At3g01170 0.841 At4g26060 0.841 At5g63800 0.839 At1g02170 0.839 At1g80180 0.838 At1g04440 0.838 At4g39140 0.838 At5g04080 0.836 At5g56150 0.833 At5g51070 0.832 At1g18470 0.832 At2g46030 0.832 At3g11660 0.832 At5g18490 0.832 At5g33290 0.831 At1g14000 0.831 At5g54940 0.83 At1g15860 0.829 At1g08320 0.829 At5g17650 0.828 At1g20440 0.826 At1g58180 0.826 At3g05970 0.826 At5g47560 0.825 At5g39590 0.823 At3g51730 0.822 At1g30000 0.822 At5g11680 0.821 At1g47960 0.821 At2g25450 0.819 At4g29160 0.818 At1g27290 0.818 At2g23450 0.818 At3g57090 0.818 At5g55850 0.817 At1g49670 0.816 At1g32700 0.816 At2g30140 0.816 At5g18630 0.814 At1g10150 0.814 At2g02360 0.814 At3g13720 0.813 At1g49470 0.812 At1g27000 0.812 At2g26670 0.812 At4g24990 0.811 At2g16710 0.811 At4g17830 0.81 At5g60360 0.807 At1g04970 0.807 At3g11780 0.806 At2g39710 0.805 At1g49300 0.805 At1g76070 0.805 At4g08930 0.805 At4g30270 0.805 At4g36400 0.805
At5g24460 0.805 At5g40690 0.804 At3g54140 0.804 At4g32760 0.803 Atlg72510 • 0.803 At2g27310 0.803 At5g40670 0.803 At5g51640 0.802 At4gl6520 0.801 Atlg04960 0.801 At2g30550 0.8 Atlgl2140 0.8 Atlgl3990 0.8 Atlg80310 0.799 At2g38480 0.798 At3g23280 0.798 At5g45410 0.797 At3gl4050 0.797 At4g32870 0.797 At5g02600 0.797 At5g45550 0.796 Atlg32410 0.796 At2g26230 0.796 At3g02910  tissue and develoment (237 data) SNF7 family protein contains Pfam domain, PF03357: SNF7 family AMP-dependent synthetase and ligase family protein similar to peroxisomal-coenzyme A synthetase (FAT2) [gi:586339] from... kip-related protein 1 (KRP1) / cyclin-dependent kinase inhibitor 1 (ICK1) identical to cyclin-dependent kinase inhibitor (ICK1)... zinc-binding family protein similar to zinc-binding protein [Pisum sativum] GI:16117799; contains Pfam profile PF04640 expressed protein expressed protein protein induced upon wounding - Arabidopsis thaliana, PID:e257749 expressed protein Rab2-like GTP-binding protein (RAB2) identical to Rab2-like protein (At-RAB2) GI: 1765896 from [Arabidopsis thaliana] mitogen-activated protein kinase kinase kinase (MAPKKK), putative (MAP3Ka) identical to MEK kinase (MAP3Ka)[Arabidopsis... nuMl-related contains similarity with nuMl GI: 1279563 from [Medicago sativa] DREPP plasma membrane polypeptide family protein contains Pfam profile: PF05558 DREPP plasma membrane polypeptide aminopeptidase P similar to Xaa-Pro aminopeptidase 2 [Lycopersicon esculentum] GI: 15384991; contains Pfam profile PF00557:... C2 domain-containing protein similar to zinc finger and C2 domain protein GI:9957238 from [Arabidopsis thaliana] expressed protein BAG domain-containing protein contains Pfam:PF02179 BAG domain actin-related protein, putative (ARP8) strong similarity to actln-related protein 8A (ARP8) [Arabidopsis thaliana] GI:21427473;... 
expressed protein expressed protein Location of EST 228A16T7A, gb|N65686 universal stress protein (USP) family protein similar to ER6 protein GB:AAD46412 GI:5669654 from [Lycopersicon esculentum];... expressed protein expressed protein glycosyl hydrolase family 35 protein similar to beta-galactosidase GI:7939621 from [Lycopersicon esculentum]; contains Pfam... latex-abundant family protein (AMC1) / caspase family protein contains similarity to latex-abundant protein [Hevea... expressed protein casein kinase, putative similar to casein kinase I [Arabidopsis thaliana] gi|1103318|emb|CAA55395; contains protein kinase... expressed protein expressed protein ubiquitin-conjugating enzyme, putative strong similarity to ubiquitin-conjugating enzyme UBC2 [Mesembryanthemum crystallinum]... ATP-dependent Clp protease ATP-binding subunit (ClpD), (ERD1) SAG15/ERD1; identical to ERD1 protein GI:497629, SP:P42762 from... zinc finger (C3HC4-type RING finger) family protein contains Pfam profile: PF00097 zinc finger, C3HC4 type ubiquitin-conjugating enzyme 6 (UBC6) E2; identical to gi|431267, SP:P42750, PIR:S52661; contains a ubiquitin-conjugating... harpin-induced family protein / HIN1 family protein / harpin-responsive family protein similar to harpin-induced protein hin1... expressed protein exostosin family protein contains Pfam profile: PF03016 Exostosin family protein kinase family protein / ankyrin repeat family protein contains Pfam profiles: PF00069 protein kinase domain, PF00023... eukaryotic translation initiation factor SUI1, putative similar to SP|P32911 Protein translation factor SUI1 {Saccharomyces... 
expressed protein bZIP family transcription factor contains Pfam profile: PF00170 bZIP transcription factor glycine/proline-rich protein glycine/proline-rich protein GPRP - Arabidopsis thaliana, EMBL:X84315 dehydrin (COR47) identical to dehydrin COR47 (Cold-induced COR47 protein) [Arabidopsis thaliana] SWISS-PROT:P31168 carbonic anhydrase family protein / carbonate dehydratase family protein similar to SP|P46512 Carbonic anhydrase 1 (EC long-chain-fatty-acid--CoA ligase / long-chain acyl-CoA synthetase (LACS6) strong similarity to AMP-binding protein (MF39P)... sodium/dicarboxylate cotransporter, putative similar to SWISS-PROT:Q13183 renal sodium/dicarboxylate cotransporter [Human]{Homo... expressed protein saposin B domain-containing protein contains Pfam profiles: PF00026 eukaryotic aspartyl protease, PF03489 surfactant protein B,... glycoside hydrolase family 47 protein similar to GI:5579331 from [Homo sapiens]; contains Pfam profile PF01532: Glycosyl... expressed protein predicted proteins, Arabidopsis thaliana invertase/pectin methylesterase inhibitor family protein low similarity to SP|P83326 Pectinesterase inhibitor (Pectin... 2-oxoglutarate-dependent dioxygenase, putative similar to 2A6 (GI:599622) and tomato ethylene synthesis regulatory protein E8... SNF7 family protein contains Pfam domain, PF03357: SNF7 family expressed protein protein kinase family protein contains protein kinase domain, Pfam:PF00069 expressed protein nitrate-responsive NOI protein, putative similar to nitrate-induced NOI protein [Zea mays] GI:2642213 ARP protein (REF) identical to ARP protein GB:CAA89858 GI:886434 from [Arabidopsis thaliana]; contains Pfam profile PF00107:... zinc-binding family protein similar to zinc-binding protein [Pisum sativum] GI:16117799; contains Pfam profile PF04640:... UDP-glucoronosyl/UDP-glucosyl transferase family protein contains Pfam profile: PF00201 UDP-glucoronosyl and UDP-glucosyl... 
lipase class 3 family protein low similarity to Triacylglycerol Acylhydrolase (E.C. [Rhizomucor miehei] GI:230348;... expressed protein similar to ESTs gb|T20511, gb|T45308, gb|H36493, and gb|AA651176 F-box family protein / SKP1 interacting partner 3-related contains similarity to SKP1 interacting partner 3 GI:10716951 from... prenylated rab acceptor (PRA1) family protein contains Pfam profile PF03208: Prenylated rab acceptor (PRA1) expressed protein contains Pfam profile PF04819: Family of unknown function (DUF716) (Plant viral-response family) bZIP family transcription factor heme oxygenase 1 (HO1) (HY1) identical to plastid heme oxygenase (HY1) [Arabidopsis thaliana] GI:4877362, heme oxygenase 1... ubiquitin family protein contains INTERPRO:IPR000626 ubiquitin domain hesB-like domain-containing protein similar to IscA (putative iron-sulfur cluster assembly protein) [Azotobacter vinelandii]... peptidase M20/M25/M40 family protein similar to acetylornithine deacetylase (Acetylornithinase, AO; N-acetylornithinase, NAO)... cysteine proteinase, putative / AALP protein (AALP) identical to AALP protein GI:7230640 from [Arabidopsis thaliana]; similar... lipid-binding serum glycoprotein family protein low similarity to SP|P17213 Bactericidal permeability-increasing protein... MD-2-related lipid recognition domain-containing protein / ML domain-containing protein weak similarity to... aspartyl protease family protein contains profile Pfam PF00026: Eukaryotic aspartyl protease; contains Prosite PS00141:... Ras-related GTP-binding protein, putative contains Pfam profile: PF00071 Ras family expressed protein thioredoxin-related contains weak similarity to Swiss-Prot:Q39239 thioredoxin H-type 4 (TRX-H-4). [Mouse-ear cress] MERI-5 protein (MERI-5) (MERI5B) / endo-xyloglucan transferase / xyloglucan endo-1,4-beta-D-glucanase (SEN4) identical to... FAD linked oxidase family protein low similarity to SP|Q12627 from Kluyveromyces lactis and SP|P32891 from Saccharomyces... 
expressed protein expressed protein proton-dependent oligopeptide transport (POT) family protein contains Pfam profile: PF00854 POT family VHS domain-containing protein / GAT domain-containing protein weak similarity to hepatocyte growth factor-regulated tyrosine... expressed protein F-box family protein contains Pfam PF00646: F-box domain; similar to SKP1 interacting partner 2 (SKIP2) TIGR_Ath1:At5g67250 PQ-loop repeat family protein / transmembrane family protein similar to SP|O60931 Cystinosin {Homo sapiens}; contains Pfam... leaf senescence protein-related (YLS7) annotation temporarily based on supporting cDNA gi|13122291|dbj|AB047810.1|; identical... autophagy 8f (APG8f) identical to autophagy 8f [Arabidopsis thaliana] GI:19912161; contains Pfam profile PF02991: Microtubule... expressed protein lipase class 3 family protein similar to DEFECTIVE IN ANTHER DEHISCENCE1 [Arabidopsis thaliana] GI:16215706; contains Pfam... flavin-containing monooxygenase family protein / FMO family protein similar to flavin-containing monooxygenase [Cavia... expressed protein expressed protein integral membrane protein, putative contains 4 transmembrane domains; contains plant integral membrane protein domain,... zinc finger (C3HC4-type RING finger) family protein / ankyrin repeat family protein contains Pfam profile: PF00097 zinc finger,... expressed protein similar to unknown protein (pir||T05524) RelA/SpoT protein, putative (RSH2) nearly identical to RelA/SpoT homolog RSH2 [Arabidopsis thaliana] GI:7141306; contains Pfam... expressed protein hypothetical protein F17H15.20 Arabidopsis thaliana chromosome II BAC F17H15, PID:g3643606 heavy-metal-associated domain-containing protein low similarity to gi:3168840 copper homeostasis factor; contains Pfam... 
mob1/phocein family protein contains Pfam profile: PF03637 Mob1/phocein family vacuolar protein sorting 55 family protein / VPS55 family protein contains Pfam domain PF04133: Vacuolar protein sorting 55 uricase / urate oxidase / nodulin 35, putative identical to uricase SP:O04420 from [Arabidopsis thaliana] expressed protein contains Pfam domain PF03674: Uncharacterised protein family (UPF0131)

CLADE C ArathACLL7 (At4g19010)
no coexpressed gene

CLADE D ArathACLL1 (At1g20480)
Tissue and development (237 data)
0.801 At2g17370 3-hydroxy-3-methylglutaryl-CoA reductase 2 / HMG-CoA reductase 2 (HMGR2) identical to SP|P43256...
0.781 At4g31340 myosin heavy chain-related contains weak similarity to Myosin heavy chain, nonmuscle type A (Cellular myosin heavy chain, type...
0.766 At3g05020 acyl carrier protein 1, chloroplast (ACP-1) identical to SP|P11829 Acyl carrier protein 1, chloroplast precursor (ACP)...
0.764 At5g16510 reversibly glycosylated polypeptide, putative similar to reversibly glycosylatable polypeptide (RGP1) [Pisum sativum]...
0.762 At5g15530 biotin carboxyl carrier protein 2 (BCCP2) identical to biotin carboxyl carrier protein isoform 2 [Arabidopsis thaliana]...
0.762 At5g50390 pentatricopeptide (PPR) repeat-containing protein contains INTERPRO:IPR002885 PPR repeats
0.758 At1g22170 phosphoglycerate/bisphosphoglycerate mutase family protein similar to SP|P31217 Phosphoglycerate mutase 1 (EC
0.753 At4g11820 hydroxymethylglutaryl-CoA synthase / HMG-CoA synthase / 3-hydroxy-3-methylglutaryl coenzyme A synthase identical to... 
0.75 At1g78050 0.746 At3g20920 0.741 At2g43040 0.727 At2g35160 0.725 At1g22800 0.723 At1g76550 0.721 At5g04620 0.718 At2g29050 0.715 At1g70770 0.713 At5g42780 0.712 At1g18180 0.711 At2g20840 0.71 At5g66310 0.709 At4g23490 0.709 At5g62500 0.708 At1g09870 0.707 At3g54250 0.705 At2g46000 0.705 At3g08910 0.702 At4g12700 0.7 At1g67680 0.7 At3g09570 0.699 At2g29390 0.699 At5g06830 0.699 At5g42280 0.698 At2g22900 0.696 At1g29310 0.693 At3g22845 0.69 At3g56640 0.688 At1g54830 0.686 At1g19970 0.686 At1g79750 0.686 At3g02230 0.685 At1g76270 0.685 At2g16760 0.684 At5g18550 0.684 At5g59740 0.683 At2g01140 0.682 At1g75110 0.681 At5g17770 0.681 At5g22940 0.68 At4g38480 0.679 At5g66460 0.677 At3g50960 0.676 At3g48410 0.675 At1g74030 0.674 At3g25110 0.674 At4g14695 0.673 At1g06450 0.673 At2g19600 0.673 At3g45090 0.673 At5g49460 0.672 At3g20560 0.672 At3g57650 0.671 At4g13710 0.669 At1g11680 0.669 At1g54630 0.669 At4g35560 0.668 At1g14970 0.668 At5g01340 0.667 At3g14000 0.667 At3g48440 0.663 At2g03120 0.662 At1g25510 0.661 At3g54930 0.659 At2g14835 0.657 At1g05360 0.656 At1g79360 0.656 At5g47520 0.656 At5g48230 0.656 At5g58190 0.654 At1g76510 0.654 At2g21520 0.654 At3g54320 0.654 At5g23530 0.653 At5g10550 0.652 At1g62020 0.651 At3g16950 0.651 At4g22250 0.65 At5g42630 0.649 At1g23890 0.649 At2g40620 0.648 At1g60810 0.648 At5g11230 0.647 At1g21070 0.647 At4g39860 0.646 At1g10670 0.645 At5g22740 0.644 At3g04980 0.644 At3g46200 0.644 At5g43060 0.643 At2g38700
phosphoglycerate/bisphosphoglycerate mutase family protein similar to SP|P31217 Phosphoglycerate mutase 1 (EC translocation protein-related contains weak similarity to Drosophila translocation protein 1 (GI:558181) [Drosophila melanogaster] calmodulin-binding protein similar to pollen-specific calmodulin-binding protein MPCBP GI:10086260 from [Zea mays]; contains... 
SET domain-containing protein (SUVH5) identical to SUVH5 [Arabidopsis thaliana] GI:13517751; contains Pfam profiles PF00856:... expressed protein similar to Biotin synthesis protein bioC. {Serratia marcescens} (SP:P36571); ESTs gb|Z34075, gb|Z34835 and... pyrophosphate--fructose-6-phosphate 1-phosphotransferase alpha subunit, putative / pyrophosphate-dependent... aminotransferase class I and II family protein similar to 8-amino-7-oxononanoate synthase, Bacillus sphaericus, PIR:JQ0512... rhomboid family protein contains PFAM domain PF01694, Rhomboid family expressed protein zinc finger homeobox family protein / ZF-HD homeobox family protein similar to unknown protein (pir||T05568) expressed protein secretory carrier membrane protein (SCAMP) family protein contains Pfam domain, PF04144: SCAMP family kinesin motor family protein contains Pfam domain, PF00225: Kinesin motor domain fringe-related protein weak similarity to Fringe [Schistocerca gregaria] (GI:6573138); Fringe encodes an extracellular protein... microtubule-associated EB1 family protein similar to EBF3-S (Microtubule-associated protein) [Homo sapiens] GI:12751131;... histidine acid phosphatase family protein contains Pfam profile PF00328: Histidine acid phosphatase; similar to multiple... mevalonate diphosphate decarboxylase, putative similar to mevalonate diphosphate decarboxylase [Arabidopsis thaliana]... 
expressed protein DNAJ heat shock protein, putative similar to SP|P25685 DnaJ homolog subfamily B member 1 (Heat shock 40 kDa protein 1) {Homo... expressed protein expressed protein expressed protein sterol 4-alpha-methyl-oxidase 1 (SMO1) nearly identical to sterol 4-alpha-methyl-oxidase GI:16973469 from [Arabidopsis... expressed protein contains Pfam profile: PF05600 protein of unknown function (DUF773) DC1 domain-containing protein contains Pfam profile PF03107: DC1 domain galactosyl transferase GMA12/MNN10 family protein very low similarity to alpha-1,2-galactosyltransferase, Schizosaccharomyces... protein transport protein sec61, putative similar to PfSec61 [Plasmodium falciparum] GI:3057044; contains Pfam profile PF00344:... emp24/gp25L/p24 protein-related contains weak similarity to transmembrane protein (GI:1212965) [Homo sapiens] exocyst complex subunit Sec15-like family protein contains Pfam profile PF04091: Exocyst complex subunit Sec15-like CCAAT-box binding transcription factor Hap5a, putative similar to heme activated protein GI:6289057 from (Arabidopsis thaliana)... ER lumen protein retaining receptor family protein similar to SP|P33946 ER lumen protein retaining receptor 1 (KDEL receptor 1)... malate oxidoreductase, putative similar to malate oxidoreductase (NADP-dependent malic enzyme) GB:P34105 (Populus balsamifera... 
reversibly glycosylated polypeptide-1 (RGP1) identical to reversibly glycosylated polypeptide-1 (AtRGP) [Arabidopsis thaliana]... expressed protein contains Pfam PF03138: Plant protein family. The function of this family of plant proteins is unknown;... expressed protein zinc finger (CCCH-type) family protein contains Pfam domain, PF00642: Zinc finger C-x8-C-x5-C-x3-H type (and similar) UDP-galactose/UDP-glucose transporter-related weak similarity to UDP-galactose/UDP-glucose transporter [Arabidopsis thaliana]... fructose-bisphosphate aldolase, putative similar to plastidic aldolase NPALDP1 from Nicotiana paniculata [GI:4827251]; contains... expressed protein NADH-cytochrome b5 reductase identical to NADH-cytochrome b5 reductase [Arabidopsis thaliana] GI:4240116 exostosin family protein contains Pfam profile: PF03016 exostosin family transducin family protein / WD-40 repeat family protein contains Pfam PF00400: WD domain, G-beta repeat (7 copies, 3... (1-4)-beta-mannan endohydrolase, putative similar to (1-4)-beta-mannan endohydrolase [Coffea arabica] GI:10178872; contains... expressed protein hydrolase, alpha/beta fold family protein low similarity to 2-hydroxy-6-oxo-6-phenylhexa-2,4-dienoate hydrolase [Rhodococcus... enolase, putative similar to Swiss-Prot:P15007 enolase (EC (2-phosphoglycerate dehydratase) (2-phospho-D-glycerate... acyl-[acyl carrier protein] thioesterase / acyl-ACP thioesterase / oleoyl-[acyl-carrier protein] hydrolase / S-acyl fatty acid... 
expressed protein contains Pfam domain, PF03650: Uncharacterized protein family (UPF0041) CCR4-NOT transcription complex protein, putative similar to SWISS-PROT:Q9UFF9 CCR4-NOT transcription complex, subunit 8... K+ efflux antiporter, putative (KEA4) similar to glutathione-regulated potassium-efflux system protein KEFB, Escherichia coli,... 2-phosphoglycerate kinase-related contains weak similarity to 2-phosphoglycerate kinase (GI:467751) [Methanothermus fervidus] ATP-citrate synthase, putative / ATP-citrate (pro-S-)-lyase, putative / citrate cleavage enzyme, putative strong similarity to... thioredoxin family protein contains Pfam profile PF00085: Thioredoxin acyl-CoA:1-acylglycerol-3-phosphate acyltransferase, putative similar to acyl-CoA:1-acylglycerol-3-phosphate acyltransferase... pectate lyase family protein obtusifoliol 14-demethylase (CYP51) identical to obtusifoliol 14-demethylase (GI:14624983) [Arabidopsis thaliana] acyl carrier protein 3, chloroplast (ACP-3) nearly identical to SP|P25702 Acyl carrier protein 3, chloroplast precursor (ACP)... expressed protein expressed protein contains Pfam PF03138: Plant protein family. The function of this family of plant proteins is unknown;... 
mitochondrial substrate carrier family protein contains Pfam profile: PF00153 mitochondrial carrier protein expressed protein zinc finger (CCCH-type) family protein contains Pfam domain, PF00642: Zinc finger C-x8-C-x5-C-x3-H type (and similar) signal peptide peptidase family protein contains Pfam domain PF04258: Membrane protein of unknown function (DUF435) aspartyl protease family protein contains Pfam domain, PF00026: eukaryotic aspartyl protease serine/threonine protein phosphatase 2A (PP2A) regulatory subunit B', putative similar to SWISS-PROT:Q28653 serine/threonine... zinc finger (C3HC4-type RING finger) family protein contains Pfam profile: PF00097 zinc finger, C3HC4 type (RING finger) expressed protein Similar to Arabidopsis hypothetical protein PID:e326839 (gb|Z97337) contains transmembrane domains transporter-related low similarity to SP|O76082 Organic cation/carnitine transporter 2 (Solute carrier family 22, member 5)... Ras-related GTP-binding protein, putative similar to GTP-binding protein RAB1D GI:1370160 from [Lotus japonicus] acetyl-CoA C-acyltransferase, putative / 3-ketoacyl-CoA thiolase, putative strong similarity to Acetoacetyl-coenzyme A thiolase... expressed protein contains Pfam profile PF04146: YT521-B-like family ARID/BRIGHT DNA-binding domain-containing protein contains Pfam profile PF01388: ARID/BRIGHT DNA binding domain SEC14 cytosolic factor, putative / phosphoglyceride transfer protein, putative contains Pfam PF00650: CRAL/TRIO domain;... 
ovule development protein, putative similar to ovule development protein aintegumenta (GI:1209099) [Arabidopsis thaliana] expressed protein contains similarity to PrMC3 [Pinus radiata] GI:5487873 DNA-binding bromodomain-containing protein low similarity to kinase [Gallus gallus] GI:1370092; contains Pfam profile PF00439:... coatomer protein complex, subunit alpha, putative contains Pfam PF00400: WD domain, G-beta repeat; similar to Coatomer alpha... dihydrolipoamide dehydrogenase 1, plastidic / lipoamide dehydrogenase 1 (PTLPD1) identical to plastidic lipoamide dehydrogenase... zinc finger (C3HC4-type RING finger) family protein contains Pfam profile: PF00097 zinc finger, C3HC4 type (RING finger) myb family transcription factor (KAN4) contains Pfam profile: PF00249 myb-like DNA-binding domain; identical to cDNA GARP-like... NHL repeat-containing protein contains Pfam profile PF01436: NHL repeat bZIP transcription factor family protein identical to b-Zip DNA binding protein GI:2246376 from [Arabidopsis thaliana];... ATP citrate-lyase-related similar to ATP citrate-lyase GI:949989 from [Rattus norvegicus] phosphate translocator-related low similarity to phosphoenolpyruvate/phosphate translocator precursor [Mesembryanthemum... transporter-related low similarity to GDP-Mannose transporter [Arabidopsis thaliana] GI:15487237; contains Pfam profile... 
expressed protein expressed protein glycosyl transferase family 2 protein similar to beta-(1-3)-glucosyl transferase GB:AAC62210 GI:3687658 from [Bradyrhizobium... DNAJ heat shock N-terminal domain-containing protein contains Pfam profile PF00226 DnaJ domain MutT/nudix family protein similar to head organizer protein P17F11 GI:17976973 from [Xenopus laevis]; contains a NUDIX... cysteine proteinase, putative / thiol protease, putative similar to cysteine proteinase RD21A precursor (thiol protease)... mevalonate diphosphate decarboxylase (MVD1) identical to mevalonate diphosphate decarboxylase [Arabidopsis thaliana]...

ArathACLL2 (At1g20490)
0.732 At5g58700 0.721 At1g08920 0.718 At5g58690 0.695 At1g58270 0.694 At3g29575 0.691 At1g07430 0.688 At1g50110 0.685 At1g69260 0.684 At1g01650 0.68 At5g65280 0.665 At1g67300 0.665 At3g48510 0.662 At3g02480 0.658 At5g04250 0.657 At5g06760 0.657 At5g15960 0.655 At2g47780 0.653 At3g03170 0.651 At1g05100 0.648 At1g62570 0.646 At2g39050 0.646 At5g50720 0.642 At5g57050 0.641 At2g15970 0.638 At2g34850 0.638 At5g59220 0.637 At1g56600 0.637 At3g50970 0.63 At3g62700 0.63 At5g20900 0.629 At2g42540 0.628 At2g47770 0.627 At4g33905 0.625 At1g66830 0.624 At1g52890 0.618 At1g52690 0.618 At1g54830 0.618 At2g33380 0.618 At5g65990 0.616 At5g13750 0.615 At1g17550 0.612 At3g28007 0.612 At3g55610 0.611 At1g58360 0.611 At2g19810 0.611 At2g41190 0.611 At4g16760 0.61 At4g19390 0.61 At4g26080 0.61 At4g27410 0.609 At3g09910 0.609 At5g52310 0.608 At3g11410 0.607 At1g16850 0.607 At4g10960 0.606 At3g27870 0.605 At3g25870 0.604 At4g27840 0.603 At2g47600
Stress treatments v.1 (298 data)
phosphoinositide-specific phospholipase C family protein contains Pfam profile: PF00388 phosphatidylinositol-specific... 
sugar transporter, putative similar to ERD6 protein {Arabidopsis thaliana} GI:3123712, sugar-porter family proteins 1 and 2... phosphoinositide-specific phospholipase C family protein contains Pfam profile: PF00388 phosphatidylinositol-specific... meprin and TRAF homology domain-containing protein / MATH domain-containing protein similar to ubiquitin-specific protease 12... expressed protein protein phosphatase 2C, putative / PP2C, putative similar to GB:CAB90633 from [Fagus sylvatica] expressed protein contains similarity to SKP1 interacting partner 3 [Arabidopsis thaliana] GI:10716951 expressed protein protease-associated (PA) domain-containing protein contains protease associated (PA) domain, Pfam:PF02225 lanthionine synthetase C-like family protein contains Pfam domain, PF05147: Lanthionine synthetase C-like protein hexose transporter, putative similar to hexose transporters from Solanum tuberosum [GI:8347246], Nicotiana tabacum... expressed protein ABA-responsive protein-related similar to ABA-inducible protein [Fagus sylvatica] GI:3901016, cold-induced protein kin1... OTU-like cysteine protease family protein contains Pfam profile PF02338: OTU-like cysteine protease late embryogenesis abundant group 1 domain-containing protein / LEA group 1 domain-containing protein low similarity to... stress-responsive protein (KIN1) / stress-induced protein (KIN1) identical to SP|P18612 Stress-induced KIN1 protein... rubber elongation factor (REF) protein-related similar to Small rubber particle protein (SRPP) (22 kDa rubber particle protein)... expressed protein protein kinase family protein contains protein kinase domain, Pfam:PF00069 flavin-containing monooxygenase family protein / FMO family protein low similarity to flavin-containing monooxygenase FMO3... 
hydroxyproline-rich glycoprotein family protein contains QXW lectin repeat domain, Pfam:PF00652 ABA-responsive protein (HVA22e) identical to AtHVA22e [Arabidopsis thaliana] GI:11225589 protein phosphatase 2C ABI2 / PP2C ABI2 / abscisic acid-insensitive 2 (ABI2) identical to SP|O04719 Protein phosphatase 2C ABI2... cold-acclimation protein, putative (FL3-5A3) similar to cold acclimation WCOR413-like protein gamma form [Hordeum vulgare]... NAD-dependent epimerase/dehydratase family protein similar to UDP-galactose 4-epimerase from Cyamopsis tetragonoloba... protein phosphatase 2C, putative / PP2C, putative ABA induced protein phosphatase 2C, Fagus sylvatica, EMBL:FSY277743 galactinol synthase, putative similar to galactinol synthase, isoform GolS-1 GI:5608497 from [Ajuga reptans] dehydrin xero2 (XERO2) / low-temperature-induced protein LTI30 (LTI30) identical to dehydrin Xero 2 (Low-temperature-induced... glutathione-conjugate transporter, putative similar to glutathione-conjugate transporter AtMRP4 GI:2959767 from [Arabidopsis... expressed protein cold-responsive protein / cold-regulated protein (cor15a) identical to cold-regulated protein cor15a [Arabidopsis thaliana]... benzodiazepine receptor-related contains weak similarity to Peripheral-type benzodiazepine receptor (PBR) (PKBS) (Mitochondrial... peroxisomal membrane protein 22 kDa, putative similar to 22 kDa peroxisomal membrane protein PMP22 [Mus musculus]... leucine-rich repeat transmembrane protein kinase, putative contains Pfam profiles: PF00069: Eukaryotic protein kinase domain,... no apical meristem (NAM) family protein contains Pfam PF02365: No apical meristem (NAM) domain; similar to NAM (no apical... late embryogenesis abundant protein, putative / LEA protein, putative similar to SP|P13934 Late embryogenesis abundant protein... CCAAT-box binding transcription factor Hap5a, putative similar to heme activated protein GI:6289057 from (Arabidopsis thaliana)... 
calcium-binding RD20 protein (RD20) induced by abscisic acid during dehydration PMID:10965948; putative transmembrane channel... amino acid transporter family protein similar to proton/amino acid transporter 1 [Mus musculus] GI:21908024; contains Pfam... transporter-related protein phosphatase 2C-related / PP2C-related similar to protein phosphatase 2C GI:3242077 from (Arabidopsis thaliana) nodulin MtN3 family protein contains Pfam PF03083 MtN3/saliva family; similar to LIM7 GI:431154 (induced in meiotic prophase in... delta 1-pyrroline-5-carboxylate synthetase B / P5CS B (P5CS2) identical to SP|P54888 amino acid permease I (AAP1) identical to amino acid permease I GI:22641 from [Arabidopsis thaliana] zinc finger (CCCH-type) family protein contains Pfam domain, PF00642: Zinc finger C-x8-C-x5-C-x3-H type (and similar) amino acid transporter family protein low similarity to vesicular GABA transporter [Rattus norvegicus] GI:2587061; belongs to... acyl-CoA oxidase (ACX1) identical to acyl-CoA oxidase [Arabidopsis thaliana] GI:3044214 expressed protein protein phosphatase 2C ABI1 / PP2C ABI1 / abscisic acid-insensitive 1 (ABI1) nearly identical to SP|P49597 Protein phosphatase... no apical meristem (NAM) family protein (RD26) contains Pfam PF02365: No apical meristem (NAM) domain; Arabidopsis thaliana nap... Ras-related GTP-binding protein, putative similar to GTP-binding protein GI:2723477 from [Arabidopsis thaliana]; contains Pfam... low-temperature-responsive protein 78 (LTI78) / desiccation-responsive protein 29A (RD29A) protein phosphatase 2C, putative / PP2C, putative identical to protein phosphatase 2C (PP2C) GB:P49598 [Arabidopsis thaliana];... expressed protein UDP-glucose 4-epimerase, putative / UDP-galactose 4-epimerase, putative / Galactowaldenase, putative similar to UDP-galactose... haloacid dehalogenase-like hydrolase family protein similar to Potential phospholipid-transporting ATPase (EC from... 
expressed protein expressed protein magnesium/proton exchanger (MHX1) identical to magnesium/proton exchanger AtMHX [Arabidopsis thaliana] gi|6492237|gb|AAF14229;.

ArathACLL3 (At1g20500)
0.99 At5g07200 0.988 At1g62070 0.986 At5g50750 0.985 At4g10490 0.984 At1g04380 0.984 At5g07260 0.983 At5g38170 0.981 At1g60970 0.981 At1g62060 0.98 At3g58740 0.979 At5g07210 0.979 At5g38160 0.978 At5g08460 0.976 At4g34520 0.974 At1g25410 0.974 At3g63040 0.974 At5g38180 0.974 At5g40420 0.972 At1g47540 0.972 At1g71250 0.972 At4g27160 0.971 At4g36700 0.971 At5g44120 0.969 At1g28590 0.969 At2g44470 0.969 At4g28520 0.968 At1g78390 0.968 At3g03230 0.967 At2g27380 0.967 At3g12203 0.967 At4g25140 0.967 At4g27150 0.965 At1g65090 0.965 At4g37360 0.964 At1g03880 0.963 At2g23580 0.963 At5g47670 0.962 At4g27170 0.961 At3g01570 0.961 At5g07500 0.961 At5g22810 0.96 At2g45420 0.959 At1g68380 0.958 At2g28490 0.958 At5g38195 0.958 At5g49190 0.956 At2g42860 0.955 At2g23220 0.955 At3g04200 0.955 At5g44360 0.954 At2g23260 0.953 At5g51490 0.951 At5g48100 0.95 At1g21970 0.95 At3g22640 0.95 At5g03810 0.947 At1g67100 0.947 At5g62800 0.946 At3g28360 0.945 At1g28030 0.945 At5g09640 0.944 At4g32490 0.943 At3g24250 0.942 At1g15150 0.942 At3g04280 0.941 At5g39130 0.94 At1g05280 0.94 At2g46960 0.94 At4g27140 0.94 At5g50770 0.939 At1g18100 0.939 At2g28650 0.938 At3g44460 0.937 At4g26740 0.935 At3g24650 0.934 At1g28650 0.934 At5g57260 0.933 At1g17810 0.932 At3g02590 0.932 At3g13540 0.932 At4g00220 0.932 At5g54740 0.931 At1g27080 0.931 At3g27785 0.927 At1g16980 0.927 At1g48130 0.927 At1g48910 0.927 At1g80330 0.927 At3g54940 0.927 At5g59170 0.925 At5g07190 0.924 At5g03800 0.924 At5g12460 0.924 At5g57920 0.922 At5g51210 0.922 At5g55370 0.921 At1g71120 0.918 At2g15325 0.916 At1g04560 0.915 At2g23550
Tissue and development (237 data)
gibberellin 20-oxidase identical to GI:1109699 expressed protein reversibly glycosylated polypeptide, putative strong similarity to reversibly glycosylated polypeptide-1 (AtRGP) 
[Arabidopsis... oxidoreductase, 2OG-Fe(II) oxygenase family protein similar to naringenin,2-oxoglutarate 3-dioxygenase [Dianthus... 2-oxoglutarate-dependent dioxygenase, putative Strong similarity to Arabidopsis 2A6 (gb|X83096), tomato ethylene synthesis... homeobox protein-related contains weak similarity to Homeobox protein FWA (Swiss-Prot:Q9FVI6) [Arabidopsis thaliana] protease inhibitor/seed storage/lipid transfer protein (LTP) family protein contains Pfam profile: PF00234 protease... clathrin adaptor complex small chain family protein contains Pfam profile: PF01217 clathrin adaptor complex small chain expressed protein citrate synthase, glyoxysomal, putative strong similarity to SP|P49299 Citrate synthase, glyoxysomal precursor {Cucurbita... two-component responsive regulator family protein / response regulator family protein contains Pfam profile: PF00072 response... protease inhibitor/seed storage/lipid transfer protein (LTP) family protein contains Pfam profile: PF00234 protease... GDSL-motif lipase/hydrolase family protein similar to family II lipase EXL3 (GI:15054386), EXL1 (GI:15054382), EXL2... fatty acid elongase 1 (FAE1) identical to fatty acid elongase 1 [GI:881615] adenylate isopentenyltransferase 6 / adenylate dimethylallyltransferase / cytokinin synthase (IPT6) identical to adenylate... expressed protein predicted protein, C.elegans protease inhibitor/seed storage/lipid transfer protein (LTP) family protein contains Pfam profile: PF00234 protease... glycine-rich protein / oleosin trypsin inhibitor, putative similar to SP|P26780 Trypsin inhibitor 2 precursor (MTI-2) {Sinapis alba} GDSL-motif lipase/hydrolase family protein similar to family II lipases EXL3 GI:15054386, EXL1 GI:15054382, EXL2 GI:15054384... 2S seed storage protein 3 / 2S albumin storage protein / NWMU2-2S albumin 3 identical to SP|P15459 cupin family protein low similarity to preproMP27-MP32 from Cucurbita cv. Kurokawa Amakuri [GI:691752]; contains Pfam profile... 
12S seed storage protein (CRA1) nearly identical to SP|P15455 [Plant Mol Biol 11:805-820 (1988)]; contains Pfam profile PF00190... lipase, putative similar to lipase [Arabidopsis thaliana] GI:1145627; contains InterPro Entry IPR001087 Lipolytic enzyme,... glycosyl hydrolase family 1 protein contains Pfam PF00232: Glycosyl hydrolase family 1 domain; TIGRFAM TIGR01233:... 12S seed storage protein, putative / cruciferin, putative strong similarity to SP|P33525 Cruciferin CRU1 precursor (11S... 9-cis-epoxycarotenoid dioxygenase, putative / neoxanthin cleavage enzyme, putative / carotenoid cleavage dioxygenase, putative... esterase/lipase/thioesterase family protein contains Interpro entry IPR000379 proline-rich family protein contains proline-rich extensin domains, INTERPRO:IPR002965 serine carboxypeptidase S10 family protein contains Pfam profile: PF00450 serine carboxypeptidase; similar to serine... glycine-rich protein / oleosin 2S seed storage protein 2 / 2S albumin storage protein / NWMU2-2S albumin 2 identical to SP|P15458 expressed protein cytochrome P450 family protein cytochrome P450 monooxygenase, Arabidopsis thaliana, PID:d1029478 12S seed storage protein (CRB) identical to 12S seed storage protein, gi|808937 [SP|P15456] [Plant Mol Biol 11:805-820 (1988)];... hydrolase, alpha/beta fold family protein similar to ethylene-induced esterase [Citrus sinensis] GI:14279437, polyneuridine... CCAAT-box binding transcription factor family protein / leafy cotyledon 1-related (L1L) supporting cDNA...
2S seed storage protein 4 / 2S albumin storage protein / NWMU2-2S albumin 4 identical to SP|P15460 glycine-rich protein / oleosin similar to oleosin GB:AAB58402 [Sesamum indicum] zinc finger (CCCH-type) family protein contains Pfam domain, PF00642: Zinc finger C-x8-C-x5-C-x3-H type (and similar) GDSL-motif lipase, putative similar to EXL3 (GP:15054386) [Arabidopsis thaliana] LOB domain protein 18 / lateral organ boundaries domain protein 18 (LBD18) identical to LOB DOMAIN 18 [Arabidopsis thaliana]... expressed protein contains Pfam profile PF03267: Arabidopsis protein of unknown function, DUF266 cupin family protein similar to preproMP27-MP32 [Cucurbita cv. Kurokawa Amakuri] GI:691752, allergen Gly m Bd 28K [Glycine max]... protease inhibitor/seed storage/lipid transfer protein (LTP) family protein contains Pfam protease inhibitor/seed storage/LTP... sucrose synthase / sucrose-UDP glucosyltransferase (SUS2) nearly identical to SP|Q00917 Sucrose synthase (EC expressed protein cytochrome P450, putative germin-like protein, putative contains Pfam profile: PF01072 germin family; similar to germin type2 GB:CAA63023 [SP|P92996]... FAD-binding domain-containing protein similar to SP|P30986 reticuline oxidase precursor (Berberine-bridge-forming enzyme)... UDP-glucoronosyl/UDP-glucosyl transferase family protein contains Pfam profile: PF00201 UDP-glucoronosyl and UDP-glucosyl... pectinesterase family protein contains Pfam profile: PF01095 pectinesterase laccase family protein / diphenol oxidase family protein similar to laccase [Pinus taeda][GI:13661197] CCAAT-box binding transcription factor (LEC1) similar to CAAT-box DNA binding protein subunit B (NF-YB) (SP:P25209) (GI:22380)... cupin family protein contains similarity to vicilin-like protein precursor [Juglans regia] GI:6580762, vicilin precursor... GDSL-motif lipase/hydrolase family protein similar to family II lipase EXL3 (GI:15054386), EXL1 (GI:15054382), EXL2...
LOB domain protein 40 / lateral organ boundaries domain protein 40 (LBD40) identical to SP|Q9ZW96 LOB domain protein 40... seven in absentia (SINA) family protein similar to SIAH1 protein [Brassica napus var. napus] GI:7657876; contains Pfam profile... ABC transporter family protein similar to P-glycoprotein homologue GI:2292907 from [Hordeum vulgare subsp. vulgare] oxidoreductase, 2OG-Fe(II) oxygenase family protein similar to GS-AOP loci [GI:16118889, GI:16118887, GI:16118891,... sinapoylglucose:choline sinapoyltransferase (SNG2) GC donor splice site at exon 11 and 13; TA donor splice site at exon 10;... plastocyanin-like domain-containing protein glycine-rich protein MATE efflux family protein similar to ripening regulated protein DDTFR18 [Lycopersicon esculentum] GI:12231296; contains Pfam... two-component responsive regulator family protein / response regulator family protein contains Pfam profile: PF00072 response... germin-like protein, putative identical to germin-like protein subfamily 1 member 16 (SP|Q9FIC8) fringe-related protein Similar to hypothetical protein PID|e327464 (gb|Z97338) various hypothetical proteins from Arabidopsis... cytochrome P450 family protein similar to cytochrome P450 72A1 (SP:Q05047) [Catharanthus roseus]; contains Pfam profile:... 2S seed storage protein 1 / 2S albumin storage protein / NWMU1-2S albumin 1 identical to SP|P15457 short-chain dehydrogenase/reductase (SDR) family protein similar to sterol-binding dehydrogenase steroleosin GI:15824408 from... mother of FT and TF1 protein (MFT) identical to SP|Q9XFK7 MOTHER of FT and TF1 protein {Arabidopsis thaliana}; contains Pfam... exocyst subunit EXO70 family protein contains Pfam domain PF03081: Exo70 exocyst complex subunit basic leucine zipper transcription factor (BZIP67) identical to basic leucine zipper transcription factor GI:18656053 from...
embryo-specific protein 1 (ATS1) identical to embryo-specific protein 1 [Arabidopsis thaliana] GI:3335169 abscisic acid-insensitive protein 3 (ABI3) identical to abscisic acid-insensitive protein 3 GI:16146 SP:Q01593 from... lipase, putative strong similarity to lipase [Arabidopsis thaliana] GI:1145627 cytochrome P450 71B10 identical to cytochrome P450 71B10 (SP:Q9LVD2) [Arabidopsis thaliana] major intrinsic family protein / MIP family protein contains Pfam profile: MIP PF00230 delta 7-sterol-C5-desaturase, putative similar to delta7 sterol C-5 desaturase GI:5031219 from [Arabidopsis thaliana] myb family transcription factor contains Pfam profile: PF00249 myb-like DNA-binding domain LOB domain protein 30 / lateral organ boundaries domain protein 30 (LBD30) identical to LOB DOMAIN 30 [Arabidopsis thaliana]... protease inhibitor/seed storage/lipid transfer protein (LTP) family protein similar to 2S seed storage proteins from Arabidopsis... proton-dependent oligopeptide transport (POT) family protein similar to nitrate transporter NRT1-5 [Glycine max] GI:11933414;... myb family transcription factor (MYB118) contains PFAM profile: PF00249 myb-like DNA binding domain alpha, alpha-trehalose-phosphate synthase, UDP-forming, putative / trehalose-6-phosphate synthase, putative /... peroxiredoxin (PER1) / rehydrin, putative identical to peroxiredoxin (Rehydrin homolog) [Arabidopsis thaliana]... flavin-containing monooxygenase family protein / FMO family protein similar to flavin monoxygenase-like protein floozy [Petunia... gibberellin 3-beta-dioxygenase, putative / gibberellin 3 beta-hydroxylase, putative similar to gibberellin 3 beta-hydroxylase...
cysteine proteinase, putative contains similarity to cysteine proteinase GI:479060 from [Glycine max] proline-rich family protein contains proline-rich extensin domains, INTERPRO:IPR002965 embryo-specific protein 3, putative similar to embryo-specific protein 3 GI:3335171 from [Arabidopsis thaliana] exostosin family protein / pentatricopeptide (PPR) repeat-containing protein contains Pfam profiles: PF03016 exostosin family,... fringe-related protein similarity to predicted proteins + similar to hypothetical protein GB:AAC23643 [Arabidopsis thaliana] +... plastocyanin-like domain-containing protein glycine-rich protein / oleosin long-chain-alcohol O-fatty-acyltransferase family protein / wax synthase family protein contains similarity to wax synthase... GDSL-motif lipase/hydrolase family protein similar to family II lipases EXL3 GI:15054386 from [Arabidopsis thaliana]; contains... protease inhibitor/seed storage/lipid transfer protein (LTP) family protein contains Pfam protease inhibitor/seed storage/LTP... AWPM-19-like membrane family protein contains Pfam PF05512: AWPM-19-like family; similar to late embryogenesis abundant... hydrolase, alpha/beta fold family protein similar to ethylene-induced esterase [Citrus sinensis] GI:14279437, polyneuridine...
ArathACLL4 (At1g20510)
0.861 At1g17380 0.851 At3g25780 0.847 At2g06050 0.831 At1g72520 0.83 At1g17420 0.827 At1g73080 0.798 At1g17750 0.793 At3g09830 0.791 At1g74950 0.79 At3g51450 0.788 At4g34410 0.787 At5g13220 0.784 At3g44860 0.777 At1g28480 0.776 At1g32640 0.768 At5g12340 0.766 At2g44840 0.754 At1g19180 0.751 At1g30135 0.751 At1g76040 0.744 At3g08720 0.741 At1g28380
0.735 At1g06620 0.731 At4g24380 0.731 At5g42650 0.723 At2g13790 0.721 At1g44350 0.721 At4g23180 0.72 At3g01830 0.72 At4g17230 0.719 At1g70700 0.718 At2g14290 0.711 At5g61900 0.71 At3g23250 0.709 At3g13050 0.707 At2g22880 0.707 At2g33580 0.706 At1g74430 0.706 At5g44070 0.705 At3g11820 0.702 At2g32140 0.7 At3g17690 0.698 At1g22810 0.696 At1g72280 0.696 At1g72450 0.694 At5g53050 0.689 At1g07000 0.689 At2g46510 0.687 At2g27690 0.686 At4g12720 0.684 At4g30430 0.683 At4g23190 0.683 At4g34390 0.683 At5g57890 0.681 At1g80840 0.681 At5g25930 0.68 At4g10390 0.679 At2g27310 0.679 At5g47220 0.678 At2g25460 0.676 At3g19970 0.676 At5g13190 0.675 At1g16370 0.675 At1g69840 0.673 At2g34600 0.673 At5g14700 0.671 At1g20310 0.67 At1g27770 0.67 At3g44400 0.67 At4g39890 0.67 At5g66640 0.669 At3g16860 0.669 At5g05300 0.668 At4g14680 0.667 At1g19210 0.665 At3g55950 0.665 At4g23220 0.664 At1g26730 0.664 At2g24850 0.664 At3g02840 0.664 At4g39030 0.663 At2g26530 0.662 At1g03370 0.661 At5g05140 0.659 At1g12610 0.658 At1g53885 0.658 At2g22860 0.658 At3g11840 0.657 At4g21390 0.654 At1g51780 0.654 At4g24160 0.651 At1g71697 0.651 At5g52050 0.648 At3g15210 0.648 At4g14365 0.647 At3g06500 0.647 At5g19110 0.646 At3g21070 0.646 At5g01540 0.646 At5g63450
stress treatments (298 data)
expressed protein allene oxide cyclase, putative / early-responsive to dehydration protein, putative / ERD protein, putative similar to allene... 12-oxophytodienoate reductase (OPR3) / delayed dehiscence1 (DDE1) nearly identical to DELAYED DEHISCENCE1 [GI:7688991] and to... lipoxygenase, putative similar to lipoxygenase gi:1495804 [Solanum tuberosum], gi:1654140 [Lycopersicon esculentum],... lipoxygenase, putative similar to lipoxygenase gi:1495804 [Solanum tuberosum], gi:1654140 [Lycopersicon esculentum] leucine-rich repeat transmembrane protein kinase, putative similar to receptor protein kinase GI:1389566 from [Arabidopsis...
leucine-rich repeat transmembrane protein kinase, putative similar to receptor-like protein kinase INRPK1 GI:1684913 from... protein kinase, putative similar to protein kinase [Lophopyrum elongatum] gi|13022177|gb|AAK11674 expressed protein strictosidine synthase family protein similar to hemomucin [Drosophila melanogaster][GI:1280434], strictosidine synthase... AP2 domain-containing transcription factor, putative ethylene-responsive element binding protein homolog, Stylosanthes hamata,... expressed protein S-adenosyl-L-methionine:carboxyl methyltransferase family protein similar to defense-related protein cjs1 [Brassica... glutaredoxin family protein contains INTERPRO Domain IPR002109, Glutaredoxin (thioltransferase) basic helix-loop-helix (bHLH) protein (RAP-1) identical to bHLH protein GB:CAA67885 GI:1465368 from [Arabidopsis thaliana] expressed protein; expression supported by MPSS ethylene-responsive element-binding protein, putative expressed protein expressed protein calcium-dependent protein kinase, putative / CDPK, putative similar to calcium-dependent protein kinase GB:AAC25423 GI:3283996... serine/threonine protein kinase (PK19) identical to serine/threonine-protein kinase AtPK19 (Ribosomal-protein S6 kinase... expressed protein 2-oxoglutarate-dependent dioxygenase, putative similar to 2A6 (GI:599622) and tomato ethylene synthesis regulatory protein E8... expressed protein contains Pfam profile: PF03959 domain of unknown function (DUF341) allene oxide synthase (AOS) / hydroperoxide dehydrase / cytochrome P450 74A (CYP74A) identical to Allene oxide synthase,... leucine-rich repeat family protein / protein kinase family protein IAA-amino acid hydrolase 6, putative (ILL6) / IAA-Ala hydrolase, putative virtually identical to gr1-protein from [Arabidopsis... receptor-like protein kinase 4, putative (RLK4) nearly identical to receptor-like protein kinase 4 [Arabidopsis thaliana]...
calmodulin-related protein, putative similar to regulator of gene silencing calmodulin-related protein GI:12963415 from... scarecrow-like transcription factor 13 (SCL13) expressed protein F-box family protein contains F-box domain Pfam:PF00646 copine BONZAI1 (BON1) nearly identical to BONZAI1 [Arabidopsis thaliana] GI:15487382; contains Pfam profile PF00168: C2 domain myb family transcription factor (MYB15) similar to myb-related transcription factor GB:CAA66952 from [Lycopersicon esculentum] transporter-related low similarity to apical organic cation transporter [Sus scrofa] GI:2062135, SP|Q02563 Synaptic vesicle... VQ motif-containing protein contains PF05678: VQ motif protein kinase family protein / peptidoglycan-binding LysM domain-containing protein protein kinase [Arabidopsis thaliana]... myb family transcription factor (MYB95) contains Pfam profile: PF00249 myb-like DNA-binding domain phytochelatin synthase 1 (PCS1) identical to phytochelatin synthase [Arabidopsis thaliana] gi|18254401|gb|AAL66747; identical... syntaxin 121 (SYP121) / syntaxin-related protein (SYR1) contains Pfam profiles: PF00804 syntaxin and PF05739: SNARE domain;... disease resistance protein (TIR class), putative domain signature TIR exists, suggestive of a disease resistance protein, cyclic nucleotide-binding transporter 2 / CNBT2 (CNGC19) identical to cyclic nucleotide-binding transporter 2 (CNBT2)... AP2 domain-containing transcription factor, putative Contains similarity to transcription factor (TINY) isolog T02O04.22... endoplasmic reticulum oxidoreductin 1 (ERO1) family protein contains Pfam domain, PF04137: Endoplasmic Reticulum Oxidoreductin... expressed protein hydrolase, alpha/beta fold family protein contains Pfam profile PF00561: hydrolase, alpha/beta fold family exocyst subunit EXO70 family protein similar to leucine zipper protein GI:10177020 from [Arabidopsis thaliana] contains Pfam...
basic helix-loop-helix (bHLH) family protein contains Pfam profile: PF00010 helix-loop-helix DNA-binding domain cytochrome P450, putative similar to Cytochrome P450 94A1 (P450-dependent fatty acid omega-hydroxylase) (SP:O81117) {Vicia... MutT/nudix family protein similar to SP|P53370 Nucleoside diphosphate-linked moiety X motif 6 {Homo sapiens}; contains Pfam... senescence-associated family protein similar to senescence-associated protein 5 [Hemerocallis hybrid cultivar]... protein kinase family protein contains Pfam PF00069: Protein kinase domain extra-large guanine nucleotide binding protein, putative / G-protein, putative similar to extra-large G-protein (XLG)... anthranilate synthase beta subunit, putative strong similarity to anthranilate synthase beta chain GI:403434 [Arabidopsis... WRKY family transcription factor similar to WRKY transcription factor GB:BAA87058 GI:6472585 from [Nicotiana tabacum] leucine-rich repeat family protein / protein kinase family protein contains similarity to Swiss-Prot:P47735 receptor-like... protein kinase family protein contains protein kinase domain, Pfam:PF00069 F-box family protein contains Pfam PF00646: F-box domain; similar to SKP1 interacting partner 2 (SKIP2) TIGR_Ath1:At5g67250 ethylene-responsive element-binding factor 2 (ERF2) identical to SP|O80338 Ethylene responsive element binding factor 2... expressed protein expressed protein expressed protein transporter-related low similarity to organic cation transporter OCTN1 from [Homo sapiens] GI:2605501, [Mus musculus]... band 7 family protein strong similarity to hypersensitive-induced response protein [Zea mays] GI:7716466; contains Pfam profile... expressed protein cinnamoyl-CoA reductase-related similar to cinnamoyl-CoA reductase from Pinus taeda [GI:17978649], Saccharum officinarum... expressed protein calcium-transporting ATPase 1, plasma membrane-type / Ca(2+)-ATPase isoform 1 (ACA1) / plastid envelope ATPase 1 (PEA1)...
disease resistance protein (TIR-NBS-LRR class), putative domain signature TIR-NBS-LRR exists, suggestive of a disease... Ras-related GTP-binding family protein contains Pfam profile: PF00071 Ras family LIM domain-containing protein-related contains low similarity to Pfam profile PF00412: LIM domain phytochelatin synthetase-related contains Pfam PF04833: Phytochelatin synthetase-like conserved region expressed protein similar to unknown protein (gb|AAF01528.1); expression supported by MPSS sulfate adenylyltransferase 3 / ATP-sulfurylase 3 (APS3) identical to ATP sulfurylase (APS3) [Arabidopsis thaliana] GI:1575327 AP2 domain-containing transcription factor, putative similar to AP2 domain transcription factor GI:4567204 from [Arabidopsis... protein kinase family protein contains protein kinase domain, Pfam:PF00069; similar to cytokinin-regulated kinase 1 [Nicotiana... protein kinase family protein contains Pfam PF00069: Protein kinase domain EXS family protein / ERD1/XPR1/SYG1 family protein similar to PHO1 protein [Arabidopsis thaliana] GI:20069032; contains Pfam... aminotransferase, putative similar to nicotianamine aminotransferase from Hordeum vulgare [GI:6498122, GI:6469087]; contains... immediate-early fungal elicitor family protein similar to immediate-early fungal elicitor protein CMPG1 (GI:14582200)... enhanced disease susceptibility 5 (EDS5) / salicylic acid induction deficient 1 (SID1) identical to SP|Q945F0; contains Pfam... expressed protein C2 domain-containing protein / GRAM domain-containing protein contains Pfam profiles PF00168: C2 domain; contains PF02893: GRAM... transcription elongation factor-related low similarity to transcription elongation factor TFIIS.h [Mus musculus] GI:3288547,... DRE-binding protein, putative / CRT/DRE-binding factor, putative similar to DREB1A GI:3738224 from [Arabidopsis thaliana] and...
senescence-associated protein-related similar to senescence-associated protein SAG102 (GI:22331931) [Arabidopsis thaliana]; phytosulfokines 2 (PSK2) identical to phytosulfokines 2 (PSK2) from [Arabidopsis thaliana] U-box domain-containing protein low similarity to immediate-early fungal elicitor protein CMPG1 [Petroselinum crispum]... S-locus lectin protein kinase family protein contains Pfam profiles: PF00954 S-locus glycoprotein family, PF00069 protein... IAA-amino acid hydrolase 5 / auxin conjugate hydrolase (ILL5) identical to auxin conjugate hydrolase ILL5 [Arabidopsis... hydrolase, alpha/beta fold family protein contains Pfam profile PF00561: hydrolase, alpha/beta fold family choline kinase, putative similar to GmCK2p choline kinase gi|1438881|gb|AAC49375 MATE efflux protein-related contains Pfam profile PF01554: Uncharacterized membrane protein family ethylene-responsive element-binding factor 4 (ERF4) identical to ethylene responsive element binding factor 4 SP:O80340 from... zinc finger (C3HC4-type RING finger) family protein / ankyrin repeat family protein contains Pfam profile: PF00097 zinc finger,... beta-fructofuranosidase, putative / invertase, putative / saccharase, putative / beta-fructosidase, putative similar to neutral... extracellular dermal glycoprotein-related / EDGP-related similar to extracellular dermal glycoprotein EDGP precursor [Daucus... ATP-NAD kinase family protein contains Pfam domain, PF01513: ATP-NAD kinase lectin protein kinase, putative similar to receptor lectin kinase 3 [Arabidopsis thaliana] gi|4100060|gb|AAD00733; contains...
cytochrome P450, putative
ArathACLL8 (At5g38120)
0.846 At1g70720 0.811 At3g51410 0.806 At2g17470 0.798 At4g14750 0.796 At1g59640 0.789 At1g15360 0.781 At1g75880 0.777 At2g34340 0.761 At1g75300 0.741 At4g10240 0.73 At3g16170 0.727 At1g32780 0.726 At4g16590 0.719 At2g44770 0.714 At5g20240 0.712 At1g16750 0.707 At2g38110 0.706 At1g35310 0.705 At1g78960 0.703 At3g10570 0.7 At4g13790 0.696 At5g50335 0.692 At3g44610 0.689 At2g42900 0.688 At3g50630 0.685 At1g61680 0.683 At4g32460 0.68 At1g66350 0.68 At2g41830 0.676 At4g13000 0.675 At1g11000 0.675 At5g17540 0.673 At5g40350 0.672 At4g22730 0.669 At5g45960 0.668 At5g49130 0.664 At1g78680 0.664 At5g16440 0.656 At1g01600 0.655 At2g24210 0.654 At1g63710 0.654 At2g22960 0.654 At3g48460 0.653 At5g50710 0.652 At1g35180 0.646 At3g01980 0.645 At3g01750 0.644 At1g23600 0.644 At2g16260 0.644 At4g01080 0.643 At1g22460 0.642 At2g47050 0.641 At3g01140 0.639 At3g23450 0.639 At4g33390 0.638 At3g28007 0.634 At3g27810 0.633 At3g51150 0.63 At3g55310 0.629 At1g10060 0.629 At1g16705 0.625 At1g19650 0.625 At2g27920 0.622 At1g08510 0.622 At1g11410 0.622 At2g40475 0.622 At5g55720 0.621 At3g18850 0.619 At3g55700 0.617 At2g42540 0.617 At2g42990 0.616 At1g11850 0.615 At2g04570 0.614 At2g42620 0.614 At4g27790 0.613 At3g14380 0.612 At4g34940 0.611 At3g08990 0.61 At3g29370 0.609 At4g15980 0.608 At1g61350 0.608 At4g27840 0.608 At5g15780 0.604 At2g20870 0.602 At5g49330 0.601 At2g13570 0.601 At3g11210 0.601 At5g48900 0.6 At5g23970
Tissue and development (237 data)
invertase/pectin methylesterase inhibitor family protein low similarity to pectinesterase from Lycopersicon esculentum...
expressed protein contains Pfam profile PF03087: Arabidopsis protein of unknown function expressed protein contains Pfam profile PF01027: Uncharacterized protein family UPF0005 calmodulin-binding family protein contains Pfam profile PF00612: IQ calmodulin-binding motif basic helix-loop-helix (bHLH) family protein AP2 domain-containing transcription factor family protein Similar to SP|P16146 PPLZ02 protein {Lupinus polyphyllus}; contains... family II extracellular lipase 1 (EXL1) EXL1 (PMID:11431566); similar to anther-specific proline-rich protein (APG) SP:P40602... expressed protein contains Pfam profile PF04520: Protein of unknown function, DUF584 isoflavone reductase, putative identical to SP|P52577 Isoflavone reductase homolog P3 (EC 1.3.1.-) {Arabidopsis thaliana};... zinc finger (B-box type) family protein zinc-finger protein R2931, Oryza sativa, PIR3:JE0116 acyl-activating enzyme 13 (AAE13) similar to malonyl CoA synthetase GB:AAF28840 from [Bradyrhizobium japonicum]; contains Pfam... alcohol dehydrogenase, putative similar to alcohol dehydrogenase GB:CAA37333 GI:297178 from [Solanum tuberosum]; contains Pfam... glucosyltransferase-related low similarity to beta-(1-3)-glucosyl transferase [Bradyrhizobium japonicum] GI:3687658 phagocytosis and cell motility protein ELMO1-related contains weak similarity to ELMO1 [Mus musculus] gi|16118551|gb|AAL14464 floral homeotic protein PISTILLATA (PI) contains Pfam profiles PF01486: K-box region and PF00319: SRF-type transcription factor... expressed protein contains Pfam profile PF04784: Protein of unknown function, DUF547 phospholipid/glycerol acyltransferase family protein low similarity to SP|O87707 CicA protein {Caulobacter crescentus};... Bet v I allergen family protein similar to Csf-2 [Cucumis sativus][GI:5762258][J Am Soc Hortic Sci 124, 136-139 (1999)];... lupeol synthase, putative / 2,3-oxidosqualene-triterpenoid cyclase, putative similar to lupeol synthase GI:1762150 from...
cytochrome P450, putative similar to cytochrome P450 77A3 GB:O48928 [Glycine max] auxin-responsive protein, putative similar to small auxin up RNA (SAUR-AC1) (SP:S70188) [Arabidopsis thaliana] expressed protein protein kinase family protein similar to viroid symptom modulation protein (protein kinase)[Lycopersicon esculentum]... expressed protein kip-related protein 2 (KRP2) / cyclin-dependent kinase inhibitor 2 (ICK2) / cdc2a-interacting protein identical to... terpene synthase/cyclase family protein similar to 1,8-cineole synthase [GI:3309117][Salvia officinalis]; contains Pfam... expressed protein contains Pfam profile PF04862: Protein of unknown function, DUF642 gibberellin regulatory protein (RGL1) similar to GB:CAA75492 from [Arabidopsis thaliana]; contains Pfam profile PF03514: GRAS... cyclin-related contains Pfam profile PF02984: Cyclin, C-terminal domain protein kinase family protein contains protein kinase domain, Pfam:PF00069 seven transmembrane MLO family protein / MLO-like protein 4 (MLO4) identical to membrane protein Mlo4 [Arabidopsis thaliana]... transferase family protein similar to hypersensitivity-related gene product HSR201 - Nicotiana tabacum, EMBL:X95343; contains... myb family transcription factor (MYB24) similar to Myb26 GI:1841475 from [Pisum sativum] leucine-rich repeat transmembrane protein kinase, putative leucine rich repeat receptor-like kinase, Oryza sativa, PATCHX:E267533 GDSL-motif lipase/hydrolase family protein MATE efflux family protein contains Pfam profile PF01554: MatE Uncharacterized membrane protein family gamma-glutamyl hydrolase (GGH1) / gamma-Glu-X carboxypeptidase / conjugase identical to SP|O65355 Gamma-glutamyl hydrolase... isopentenyl-diphosphate delta-isomerase I / isopentenyl diphosphate:dimethylallyl diphosphate isomerase I (IPP1) identical to...
cytochrome P450, putative similar to cytochrome P450 GI:10442763 from [Triticum aestivum] myrcene/ocimene synthase (TPS10) nearly identical to GI:9957293; contains Pfam profile: PF01397 terpene synthase family cytochrome P450, putative similar to cytochrome P450 GB:O23066 [Arabidopsis thaliana] serine carboxypeptidase S10 family protein contains Pfam profile: PF00450 serine carboxypeptidase; similar to... GDSL-motif lipase/hydrolase family protein similar to lipase [Arabidopsis thaliana] GI:1145627; contains InterPro Entry... hypothetical protein expressed protein similar to hypothetical protein GB:AAF69173 GI:7767676 from [Arabidopsis thaliana] short-chain dehydrogenase/reductase (SDR) family protein contains Pfam profiles: PF00106 short chain dehydrogenase, PF00678... ankyrin repeat family protein contains ankyrin repeats, Pfam:PF00023 expressed protein contains Pfam profile PF02713: Domain of unknown function DUF220; expression supported by MPSS glycine-rich RNA-binding protein, putative similar to Glycine-rich RNA-binding protein from {Daucus carota} SP|Q03878, {Sinapis... expressed protein expressed protein similar to axi 1 [Nicotiana tabacum] GI:559921; contains Pfam profile PF03138: Plant protein family invertase/pectin methylesterase inhibitor family protein low similarity to pollen-specific pectin esterase [Brassica rapa... myb family transcription factor (MYB106) similar to transforming protein (myb) homolog GB:S26605 from [Petunia x hybrida] pseudogene, similar to unnamed protein product blastp match of 19% identity and 2.8e-15 P-value to... hypothetical protein contains Pfam profile PF05701: Plant protein of unknown function (DUF827) nodulin MtN3 family protein contains Pfam PF03083 MtN3/saliva family; similar to LIM7 GI:431154 (induced in meiotic prophase in... myb family transcription factor (MYB3) (MYB21) contains Pfam profile: PF00249 myb-like DNA-binding domain; identical to ATMYB3...
kinesin motor family protein contains Pfam domain, PF00225: Kinesin motor domain short-chain dehydrogenase/reductase (SDR) family protein contains similarity to 3-oxoacyl-[acyl-carrier protein] reductase... branched-chain amino acid aminotransferase 1 / branched-chain amino acid transaminase 1 (BCAT1) nearly identical to SP|Q93Y32... p300/CBP acetyltransferase-related protein-related similar to p300/CBP acetyltransferase-related protein 2 [Arabidopsis... SEC14 cytosolic factor, putative / phosphoglyceride transfer protein, putative similar to SP:P24859 from [Kluyveromyces... serine carboxypeptidase S10 family protein similar to retinoid-inducible serine carboxypeptidase precursor (GI:15146429) [Mus... acyl-[acyl carrier protein] thioesterase / acyl-ACP thioesterase / oleoyl-[acyl-carrier protein] hydrolase / S-acyl fatty acid... S-locus protein kinase, putative similar to receptor-like protein kinase [Arabidopsis thaliana] gi|4008008|gb|AAC95352;... expressed protein pectate lyase family protein similar to pectate lyase 1 GP:6606532 from [Musa acuminata] phospholipid/glycerol acyltransferase family protein contains Pfam profile: PF01553 Acyltransferase UDP-glucoronosyl/UDP-glucosyl transferase family protein glucuronosyl transferase homolog, Lycopersicon esculentum, PIR:S39507... cold-responsive protein / cold-regulated protein (cor15a) identical to cold-regulated protein cor15a [Arabidopsis thaliana]... GDSL-motif lipase/hydrolase family protein similar to family II lipase EXL3 (GI:15054386), EXL1 (GI:15054382), EXL2... expressed protein GDSL-motif lipase/hydrolase family protein similar to family II lipase EXL3 (GI:15054386), EXL1 (GI:15054382), EXL2... F-box family protein (ORE9) E3 ubiquitin ligase SCF complex F-box subunit; identical to F-box containing protein ORE9...
calcium-binding EF hand family protein contains INTERPRO:IPR002048 calcium-binding EF-hand domain integral membrane family protein similar to unknown protein GB:AAD50013 from [Arabidopsis thaliana]; contains TIGRFAM TIGR01569... armadillo/beta-catenin repeat family protein contains Pfam profile: PF00514 armadillo/beta-catenin-like repeat yippee family protein similar to qdgl-1 [Coturnix coturnix] GI:10441650, Yippee protein [Drosophila melanogaster] GI:5713279;... expressed protein pectinesterase family protein contains Pfam profile: PF01095 pectinesterase armadillo/beta-catenin repeat family protein armadillo/beta-catenin-like repeats, Pfam:PF00514 expressed protein pollen Ole e 1 allergen and extensin family protein contains Pfam profile PF01190: Pollen proteins Ole e I family cell wall protein precursor, putative identical to Putative cell wall protein precursor (Swiss-Prot:P47925) [Arabidopsis... myb family transcription factor contains Pfam profile: PF00249 myb-like DNA binding domain; identical to cDNA putative... CCAAT-box binding transcription factor, putative similar to CAAT-box DNA binding protein subunit B (NF-YB) (SP:P25209)... GDSL-motif lipase/hydrolase family protein contains Pfam profile PF00657: Lipase/Acylhydrolase with GDSL-like motif pectate lyase family protein similar to pectate lyase GP: 14531296 from [Fragaria x ananassa]; non-consensus AG donor splice... transferase family protein similar to acetyl CoA: benzylalcohol acetyltransferase; BEAT [Clarkia...  
CLADE E
0.827 At4g33530 0.814 At5g40890 0.812 At3g23820 0.811 At5g13710 0.808 At2g04780 0.807 At1g45688 0.801 At1g79340 0.801 At5g03760 0.797 At1g01620 0.795 At2g39900 0.795 At5g47770 0.79 At3g23810 0.789 At4g22010 0.788 At1g24170 0.787 At1g48480 0.787 At5g49720 0.785 At4g29220 0.785 At4g34640 0.784 At1g70370 0.783 At3g01810 0.783 At5g15350 0.782 At1g04430 0.782 At5g20700 0.781 At5g46700 0.78 At1g66200 0.78 At5g44020 0.78 At5g67420 0.778 At1g27930 0.778 At5g17820 0.777 At1g67330 0.777 At3g23000 0.777 At4g12730 0.776 At1g04040 0.776 At1g05210 0.776 At1g62660 0.775 At1g12500 0.775 At1g20010 0.774 At1g09780 0.774 At3g16460 0.773 At1g70410 0.773 At2g32380 0.773 At4g37450 0.771 At1g02500 0.771 At3g05890 0.77 At1g45130 0.77 At2g15970 0.77 At2g36880 0.77 At4g11290 0.77 At4g23850 0.77 At4g25570 0.769 At1g23480 0.768 At1g03870 0.767 At1g10670 0.767 At5g67400 0.766 At1g50430 0.765 At2g36870 0.765 At4g26010 0.764 At5g17330 0.763 At5g49460 0.762 At1g04680 0.762 At1g65960 0.762 At3g13520 0.762 At3g52470 0.762 At4g19120 0.762 At5g15230 0.761 At2g36830 0.76 At1g75680 0.759 At1g55330 0.759 At3g49670 0.759 At3g49940 0.759 At4g08685 0.759 At5g64100 0.757 At2g44160 0.755 At5g55730 0.754 At5g43830 0.753 At2g26730 0.753 At5g56870 0.752 At1g01430 0.751 At3g51670 0.751 At4g12420 0.751 At5g19250 0.75 At2g37180 0.75 At3g16390 0.75 At4g12390 0.748 At1g28290 0.748 At1g30120 0.748 At1g66280 0.748 At3g12710 0.748 At5g05960 0.748 At5g60920 0.747 At1g28130 0.747 At2g38390 0.747 At3g56240 0.746 At4g37800 0.746 At5g18500 0.746 At5g43060 0.745 At3g27170 0.745 At5g11890 0.745 At5g44130 0.744 At1g53290
hormone treatments (236 data)
potassium transporter family protein similar to K+ transporter HAK5 [Arabidopsis thaliana] GI:7108597; KUP/HAK/KT Transporter... chloride channel protein (CLC-a) identical to GI:1742952 (gb|AAC05742.1) NAD-dependent epimerase/dehydratase family protein similar to nucleotide sugar epimerase from Vibrio vulnificus GI:3093975...
sterol 24-C-methyltransferase, putative similar to SP:P25087 Sterol 24-C-methyltransferase, Delta(24)-sterol C-... fasciclin-like arabinogalactan-protein (FLA7) identical to gi_13377782_gb_AAK20860 expressed protein latex-abundant protein, putative (AMC7) / caspase family protein similar to latex-abundant protein [Hevea brasiliensis]... glycosyl transferase family 2 protein similar tq beta-(l-3)-glucosyl transferase GB:AAC62210 GI:3687658 from [Bradyrhizobium... plasma membrane intrinsic protein 1C (PIP1C) / aquaporin PIP1.3 (PIP1.3) / transmembrane protein B (TMPB) identical to plasma... UM domain-containing protein similar to pollen specific LIM domain protein lb [Nicotiana tabacum] GI:6467905, PGPS/D1 [Petunia... farnesyl pyrophosphate synthetase 1, mitochondrial (FPS1) / FPP synthetase 1 / farnesyl diphosphate synthase 1 identical to.., adenosylhomocysteinase, putative / S-adenpsyl-L-homocysteine hydrolase, putative / AdoHcyase, putative strong similarity to... multi-copper oxidase type I family protein similar to pollen-specific BP10 protein [SP|Q00624][Brassica napus]; contains Pfam... glycosyl transferase family 8 protein contains Pfam profile: PF01501 glycosyl transferase family 8 leucine-rich repeat transmembrane protein kinase, putative contains similarity to many predicted protein kinases endo-l,4-beta-glucanase KORRIGAN (KOR) / cellulase (OR16pep) identical to endo-l,4-beta-D-glucanase KORRIGAN [Arabidopsis... phosphofructokinase family protein similar to phosphofructokinase [Amycolatopsis methanolica] GI:17432243; contains Pfam... famesyl-diphosphate farnesyltransferase 1 / squalene synthase 1 (SQS1) identical to SP|P53799 Farnesyl-diphosphate... BURP domain-containing protein / polygalacturonase, putative similar to polygalacturonase isoenzyme 1 beta subunit... 
expressed protein plastocyanin-like domain-containing protein contains plastocyanin-like domain Pfam:PF02298 dehydration-responsive protein-related similar to early-responsive to dehydration stress ERD3 protein [Arabidopsis thaliana]... senescence-associated protein-related similar to senescence-associated protein SAG102 (GI:22331931) [Arabidopsis thaliana]; senescence-associated protein, putative similar to senescence-associated protein 5 [Hemerocallis hybrid cultivar]... glutamine synthetase, putative similar to glutamine synthetase, cytosolic isozyme (Glutamate— ammonia ligase, GS1) [Lotus... acid phosphatase dass B family protein similar to SP|P15490 STEM 28 kDa glycoprotein precursor (Vegetative storage protein A)... LOB domain protein 37 / lateral organ bqundaries domain protein 37 (LBD37) identical to LOB DOMAIN 37 [Arabidopsis thaliana]... expressed protein contains Pfam profile PF04669: Protein of unknown function (DUF579) peroxidase 57 (PER57) (P57) (PRXR10) identical to SPIQ43729 Peroxidase 57 precursor (EC (Atperox P57) (PRXR10)... expressed protein contains Pfam profile PF04669: Protein of unknown function (DUF579) CBL-interacting protein kinase 7 (CIPK7) identical to CBL-interacting protein kinase 7 [Arabidopsis thaliana]... fascidin-like arabinogalactan-protein (FLA2) identical to gi_13377778_gb_AAK20858 acid phosphatase dass B family protein similar to SPIP15490 STEM 28 kDa glycoprotein precursor (Vegetative storage protein A)... expressed protein beta-fructosidase (BFRUCT3) / beta-fructofuranosidase / invertase, vacuolar identical to beta-fructosidase GB:CAA67560... phosphate translocator-related low similarity to glucose-6-phosphate/phosphate-translocator precursor [Zea mays] GI:2997589,... tubulin beta-5 chain (TUB5) nearly identical to SP|P29513 Tubulin beta-5 chain {Arabidopsis thaliana} 2,3-biphosphoglycerate-independent phosphoglycerate mutase, putative / phosphoglyceromutase, putative strong similarity to... 
jacalin lectin family protein contains Pfam profile: PF01419 jacalin-like lectin domain; similar to myrosinase binding protein... carbonic anhydrase, putative / carbonate dehydratase, putative similar to SPIP42737 Carbonic anhydrase 2 (EC expressed protein arabinogalactan-protein (AGP18) identical to gi_11935088_gb_AAG41964 S-adenosylmethionine synthetase 1 (SAM1) identical to S-adenosylmethionine synthetase 1 (Methionine adenosyltransferase 1,... hydrophobic protein (RCI2B) / low temperature and salt responsive protein (LTI6B) identical to SPIQ9ZNS6 Hydrophobic protein... beta-galactosidase, putative / lactase, putative similar to beta-galactosidase [Lycopersicon esculentum] GI:7939619,... cold-acclimation protein, putative (FL3-5A3) similar to cold acclimation WCOR413-like protein gamma form [Hordeum vulgare]... S-adenosylmethionine synthetase, putative similar to S-adenosylmethionine synthetase 3 (Methionine adenosyltransferase 3,... peroxidase, putative identical to peroxidase ATP19a [Arabidopsis thaliana] gi| 1546692|emb|CAA67337 long-chain-fatty-acid—CoA ligase / long-chain acyl-CoA synthetase nearly identical to acyl-CoA synthetase (MF7P) from Brassica... cytochrome B561 family protein contains Pfam domain, PF03188: Cytochrome b561 glycosyl transferase family 2 protein similar to cellulose synthase from Agrobacterium tumeficiens [gi:710492] and... fasddin-like arabinogalactan-protein (FLA9) identical to gi_13377784_gb_AAK20861 expressed protein peroxidase 73 (PER73) (P73) (PRXR11) identical to SPIQ43873 Peroxidase 73 precursor (EC (Atperox P73) (PRXR11)... 7-dehydrocholesterol reductase / 7-DHC reductase / sterol delta-7-reductase (ST7R) / dwarfs protein (DWF5) identical to... xytoglucan:xyloglucosyl transferase, putative / xyloglucan endotransglycosylase, putative / endo-xyloglucan transferase,... peroxidase, putative peroxidase ATP13a - Arabidopsis thaliana, PID:e264765; identical to cDNA class III peroxidase ATP35,... 
glutamate decarboxylase 1 (GAD 1) sp|Q42521 ATP-citrate synthase, putative / ATP-citrate (pro-S-)-lyase, putative / citrate cleavage enzyme, putative strong similarity to... pectate lyase family protein similar to pectate lyase GP: 14531296 from [Fragaria x ananassa] glutamate decarboxylase 2 (GAD 2) similar to glutamate decarboxylase (gad) GI:294111 from [Petunia hybrida] arabinogalactan-protein (AGP12) identical to gi|10880501|gb|AAG24280 harpin-induced family protein / HIN1 family protein / harpin-responsive family protein similar to harpin-induced protein hinl (... early-responsive to dehydration stress protein (ERD3) identical to ERD3 protein [Arabidopsis thaliana] GI:15320410; contains... gibberellin-regulated protein 4 (GASA4) / gibberetlin-responsive protein 4 identical to SP|P46690 Gibberellin-regulated protein... major intrinsic family protein / MIP family protein contains Pfam profile: MIP PF00230 glycosyl hydrolase family 9 protein similar to endo-beta-l,4-glucanase GB:AAC12685 GI: 3025470 from [Pinus radiata] arabinogalactan-protein (AGP21) leudne-rlch repeat transmembrane protein kinase, putative CLAVATA1 receptor kinase, Arabidopsis thaliana, EMBL:ATU96879 LOB domain protein 38 / lateral organ boundaries domain protein 38 (LBD38) identical to SP1Q9SN23 LOB domain protein 38... pollen Ole e 1 allergen and extensin family protein contains Pfam domain, PF01190: Pollen proteins Ole e I family peroxidase, putative identical to peroxidase ATP3a [Arabidopsis thaliana] gi|1546698|emb|CAA67340 methylenetetrahydrofolate reductase 2 (MTHFR2) identical to SPIO80585 Methylenetetrahydrofolate reductase (EC fascidin-like arabinogalactan-protein (FLA1) Identical to gi|13377776||AAK20857|13377775|gb|AF333970 expressed protein similar to auxin down-regulated protein ARG10 [Vigna radiata] GI:2970051, wali7 (aluminum-induced protein)... 
leudne-rich repeat transmembrane protein kinase, putative beta-galactosidase, putative / lactase, putative similar to beta-galactosidase precursor GI:3869280 from [Carica papaya] expressed protein similar to hypothetical protein GB:CAB80917 GI:7267605 from [Arabidopsis thaliana] SEC14 cytosolic factor family protein / phosphoglyceride transfer family protein similar to polyphosphoinositide binding... multi-copper oxidase, putative (SKU5) identical to multi-copper oxidase-related protein (SKU5)(GI: 18158154) [Arabidopsis... expressed protein plasma membrane intrinsic protein 2C (PIP2C) / aquaporin PIP2.3 (PIP2.3) / water-stress induced tonoplast intrinsic protein... jacalin lectin family protein similar to myrosinase-binding protein homolog [Arabidopsis thaliana] GI:2997767, epithiospecifier... invertase/pectin methylesterase inhibitor family protein low similarity to pectinesterase from Arabidopsis thaliana SPIQ42534,... pollen Ole e 1 allergen and extensin family protein similar to arabinogalactan protein [Daucus carota] GI: 11322245; contains... pyruvate dehydrogenase E l component beta subunit, chloroplast identical to pyruvate dehydrogenase E l beta subunit [Arabidopsis... glycosyl hydrolase family 1 protein contains Pfam PF00232 : Glycosyl hydrolase family 1 domain; TIGRFAM TIGR01233:... methyladenine glycosylase family protein similar to SP|P05100 DNA-3-methyladenine glycosylase I (EC protease inhibitor/seed storage/lipid transfer protein (LTP) family protein contains Pfam protease inhibitor/seed storage/LTP... phytochelatin synthetase, putative / COBRA cell expansion protein COB, putative similar to phytochelatin synthetase... auxin-responsive GH3 family protein similar to auxin-responsive GH3 product [Glycine max] GI: 18591; contains Pfam profile... peroxidase, putative similar to peroxidase isozyme [Armoracia rusticana] gi|217934|dbj|BAA14144; identical to cDNA class III... 
copper homeostasis factor / copper chaperone (CCH) (ATX1) identical to gi:3168840 Pfam profile PF00403: Heavy-metal-associated... xyloglucamxyloglucosyl transferase, putative / xyloglucan endotransglycosylase, putative / endo-xyloglucan transferase,... protein kinase family protein contains protein kinase domain, Pfam: PF00069 cysteine proteinase, putative/ thiol protease, putative similar to cysteine proteinase RD21A precursor (thiol protease)... chloride channel protein (CLC-b) identical to CLC-b chloride channel protein GB:CAA96058 from [Arabidopsis thaliana] (J. Biol.... expressed protein fasddin-like arabinogalactan-protein, putative similar to gi_13377784_gb_AAK20861 galactosyltransferase family protein contains Pfam profile: PF01762 galactosyltransferase ;contains similarity to Avr9 elicitor...  

