UBC Faculty Research and Publications

The family of Deg/HtrA proteases in plants Schuhmann, Holger; Huesgen, Pitter F; Adamska, Iwona Apr 20, 2012

Full Text

RESEARCH ARTICLE Open Access

The family of Deg/HtrA proteases in plants

Holger Schuhmann1,2, Pitter F Huesgen1,3 and Iwona Adamska1*

Abstract

Background: The Deg/HtrA family of ATP-independent serine endopeptidases is present in nearly all organisms from bacteria to humans and vascular plants. In recent years, multiple deg/htrA protease genes were identified in various plant genomes. During genome annotations most proteases were named according to the order of discovery, hence the same names were sometimes given to different types of Deg/HtrA enzymes in different plant species. This can easily lead to false inference of individual protease functions based solely on a shared name. Therefore, the existing names and classification of these proteolytic enzymes do not meet our current needs and a phylogeny-based standardized nomenclature is required.

Results: Using phylogenetic and domain arrangement analysis, we improved the nomenclature of the Deg/HtrA protease family, standardized protease names based on their well-established nomenclature in Arabidopsis thaliana, and clarified the evolutionary relationship between orthologous enzymes from various photosynthetic organisms across several divergent systematic groups, including dicots, a monocot, a moss and a green alga. Furthermore, we identified a “core set” of eight proteases shared by all organisms examined here that might provide all the proteolytic potential of Deg/HtrA proteases necessary for a hypothetical plant cell.

Conclusions: In our proposed nomenclature, the evolutionarily closest orthologs have the same protease name, simplifying scientific communication when comparing different plant species and allowing for more reliable inference of protease functions. Furthermore, we propose that the high number of Deg/HtrA proteases in plants is mainly due to gene duplications unique to the respective organism.

Background

Proteolysis, the enzyme-mediated hydrolysis of peptide bonds, is a vital process for every organism.
It is associated with many intracellular and extracellular events, e.g. the removal of damaged proteins, nutrient uptake, processing of protein precursors, and signaling [1,2]. A huge variety of proteolytic enzymes, utilizing several different catalytic mechanisms, mediate these processes.

The family of Deg proteases (for degradation of periplasmic proteins) [3], also known as HtrA proteases (for high temperature requirement A) [4], is one important group of these proteolytic enzymes. They are ATP-independent serine endopeptidases found in all domains of life, including Bacteria, Archaea and Eukarya. Deg/HtrA proteases belong to the S1B subfamily of the clan PA according to MEROPS nomenclature [5], which features a catalytic domain of the trypsin type, with His-Asp-Ser as catalytic triad. Most Deg/HtrA family members contain one to four PDZ protein-protein interaction domains [6], but members without PDZ domains have been described in plants [7-9] and animals [8,10]. Deg/HtrA proteases are best studied in Escherichia coli and mammals, where three (DegP, DegQ and DegS) or five (HtrA1-4 and Tysnd1) Deg/HtrA paralogs are present, respectively. DegP from E. coli is a protein quality control enzyme in the periplasm, acting as a protease and degrading irreversibly damaged proteins, or as a chaperone, thereby assisting with refolding of denatured proteins [11]. A second E. coli protease, DegS, acts in a stress signaling cascade sensing misfolded proteins in the periplasm and transducing the signal to the cytoplasm [12]. Human Deg/HtrA proteases have been shown to play critical roles in severe diseases, such as Alzheimer's disease, age-related macular degeneration and several cancers (reviewed in [13]).

* Correspondence: iwona.adamska@uni-konstanz.de
1 Department of Plant Physiology and Biochemistry, University of Konstanz, Universitätsstr. 10, 78457 Konstanz, Germany
Full list of author information is available at the end of the article
© 2012 Schuhmann et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Schuhmann et al. BMC Plant Biology 2012, 12:52
http://www.biomedcentral.com/1471-2229/12/52

Compared to the vast literature on prokaryotic and mammalian Deg/HtrA proteases, relatively little is known about members of this family in plants. Searches in genomic databases revealed 16 genes encoding putative Deg/HtrA proteases in Arabidopsis thaliana [14], 15 in Oryza sativa [15] and 20 in Populus trichocarpa [16]. However, to date only a few Deg/HtrA proteases from A. thaliana have been studied in detail. It was experimentally shown that six AtDeg proteases are located in chloroplasts [17-22], one in peroxisomes [8], one in mitochondria [E. Zeiser, C. Huber, P. Huesgen, H. Schuhmann, I. Adamska, unpublished], and one in the nucleus [23]. Two more Deg proteases are predicted to reside in chloroplasts, five in mitochondria (one of them with a possible dual chloroplastidial/mitochondrial localization), and the subcellular location of one protein is uncertain (reviewed in [24]). The chloroplast-located Deg/HtrA proteases were reported to be involved in the degradation of damaged photosynthetic proteins, especially the photosystem II (PSII) reaction center D1 protein under light stress conditions (reviewed in [24]). Additionally, the thylakoid lumen-located AtDeg1 protease acts as a chaperone, assisting in the assembly of PSII dimers and supercomplexes [25].

Little is known about Deg/HtrA proteases targeted to compartments other than the chloroplast.
However, it was demonstrated that the peroxisomal AtDeg15 protease is a processing enzyme, cleaving the N-terminal peroxisomal targeting signal 2 that is present in some nuclear-encoded peroxisomal proteins [7,8].

Based on the evolutionary relationship of the conserved trypsin domain, Deg/HtrA proteases from Archaea, Bacteria and Eukarya cluster into four distinct clades, whereby plants are the only organisms containing proteases from all four clades [7]. The relatively high number of Deg/HtrA proteases and their diversity in plants, together with the observation that some of them localize to the same compartment and have similar domain arrangements and comparable sizes [7,14,16], carries a high risk of confusion. This is compounded by the fact that during genome annotation of vascular plants (e.g. A. thaliana and O. sativa), Deg/HtrA proteases were numbered according to the order of their discovery, thus giving orthologous proteins different numbers and names depending on the organism. For rice, the situation is even more complex, with two independent genome annotation databases for O. sativa ssp. japonica, i.e. the Rice Annotation Database [26] and the MSU Rice Genome Annotation Project Database [27]. Therefore, one gene might occur in the literature under more than one identifier or name.

In the study presented here, we reassessed the number of Deg/HtrA proteases in several photosynthetic eukaryotic model organisms from the Viridiplantae lineage, such as the dicots A. thaliana and P. trichocarpa, the monocot O. sativa, the moss Physcomitrella patens and the unicellular green alga Chlamydomonas reinhardtii, whose genomes are completely sequenced. Using phylogenetic comparison and domain structure analysis, we propose a unified nomenclature for Deg/HtrA proteases in green plants (including green algae) based on the long-established nomenclature reported for A. thaliana [28].
Furthermore, we were able to identify a “core set” of eight Deg/HtrA proteases shared by all organisms examined here and postulate that the high number of Deg/HtrA proteases in plants is mainly due to gene duplications unique to the respective organism.

Results and discussion

An inventory of Deg/HtrA proteases

To establish a standardized nomenclature, we reassessed the number of Deg/HtrA proteases in the vascular plants O. sativa ssp. japonica and P. trichocarpa, the moss P. patens and the green alga C. reinhardtii by searching annotated genome databases for the presence of deg/htrA sequences (see Methods for details). The secondary structure of these sequences was analyzed using the HHpred platform [29] in order to confirm the presence of a Deg/HtrA protease domain, thereby excluding false positives from the database searches (data not shown). This approach also yielded the domain architecture of confirmed Deg/HtrA proteases, which is included in Tables 1, 2, 3, 4, 5.

Table 1 summarizes the Deg/HtrA proteases from A. thaliana, which were reported before based on amino acid (aa) sequence alignments [14] (Table 1, columns 1–3). Using the HHpred platform [29], the presence of a Deg/HtrA-like protease domain could be confirmed for all of these proteins (Table 1, column 5), although two proteins seem to be proteolytically inactive. In AtDeg6 the protease domain is truncated, and the protease domain of AtDeg16 lacks the Asp residue of the catalytic triad (Table 1, column 5 and Additional file 1 showing all protease sequences analyzed in this study). The remaining 14 Deg/HtrA proteases contain the conserved catalytic triad of His, Asp and Ser required for proteolytic activity (Table 1, column 5). Of the potentially active proteases, AtDeg5 and AtDeg15 (the latter with an elongated N-terminus) do not contain any recognizable PDZ domain.
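The activity call described above (a protease domain counts as potentially active only when the complete His-Asp-Ser triad is present and the domain is not truncated) can be written down as a small rule. The following Python sketch is purely illustrative: the function names `classify_protease_domain` and `domain_arrangement`, the tuple representation of the triad and the example inputs are assumptions made for this illustration, and they do not reproduce the HHpred-based workflow used in the study.

```python
# Illustrative sketch only: names and inputs are hypothetical and are not
# taken from the HHpred-based workflow used in the study.

CATALYTIC_TRIAD = ("H", "D", "S")  # His, Asp, Ser in one-letter code


def classify_protease_domain(triad_residues, truncated=False):
    """Return "PD" (potentially active) or "PDia" (inactive).

    triad_residues: residues observed at the three aligned triad positions,
                    with None marking a missing residue.
    truncated:      True if the protease domain itself is incomplete.
    """
    if truncated:
        return "PDia"
    if tuple(triad_residues) == CATALYTIC_TRIAD:
        return "PD"
    return "PDia"


def domain_arrangement(triad_residues, n_pdz, truncated=False,
                       elongated_n_terminus=False):
    """Build an arrangement string in the style of Table 1, e.g. "PD-PDZ"."""
    parts = []
    if elongated_n_terminus:
        parts.append("NT")
    parts.append(classify_protease_domain(triad_residues, truncated))
    parts.extend(["PDZ"] * n_pdz)
    return "-".join(parts)


# Intact triad plus one PDZ domain (the pattern listed for AtDeg1).
print(domain_arrangement(("H", "D", "S"), n_pdz=1))
# Asp missing from the triad, no PDZ domain (the pattern listed for AtDeg16).
print(domain_arrangement(("H", None, "S"), n_pdz=0))
# Truncated protease domain (the pattern listed for AtDeg6).
print(domain_arrangement((None, None, None), n_pdz=0, truncated=True))
# Elongated N-terminus, intact triad, no PDZ domain (the pattern for AtDeg15).
print(domain_arrangement(("H", "D", "S"), n_pdz=0, elongated_n_terminus=True))
```

The four calls yield arrangement strings in the same notation as column 5 of Table 1; the residue tuples passed in are placeholders, not residues read from the actual sequences in Additional file 1.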
AtDeg7 possesses two predicted protease domains, one potentially active and a second, degenerate one with a mutated catalytic triad [6,30], as well as four PDZ domains arranged in tandem (Table 1, column 5). Considering the domain arrangement and length of AtDeg7, which is twice as long as most other Deg/HtrA family members, it was proposed that this protease arose from a gene duplication and fusion event, after which the second protease domain lost its proteolytic activity and acquired a new function in protein-protein interaction [30].

Table 1 The family of Deg/HtrA proteases in Arabidopsis thaliana

Gene model (a) | Protein name (b) | UniProtKB acc. no. (c) | aa | Domain arrangement (d) | Orthologs in other plants (this study) | Protein name used in this study
At3g27925 | DEG1 | O22609 | 439 | PD-PDZ | Cre02.g088400, Cre14.g630550, Cre12.g498500, Os05g0568900, Pp1s160_79V6, Pp1s198_100V6, POPTR_0001s34960 | AtDeg1
At2g47940 | DEG2 | O82261 | 607 | PD-PDZ-PDZ | Cre19.g752200, Os05g0147500, Pp1s8_140V6, POPTR_0014s12970, POPTR_0020s00220 | AtDeg2
At1g65630 | DEG3 | Q9SHZ1 | 559 | PD-PDZ-PDZ | Deg10 Subgroup | AtDeg3
At1g65640 | DEG4 | Q9SHZ0 | 518 | PD-PDZ-PDZ | Deg10 Subgroup | AtDeg4
At4g18370 | DEG5 | Q9SEL7 | 323 | PD | Cre02.g110600, Os12g0616600, Pp1s63_95V6, POPTR_0011s02330 | AtDeg5
At1g51150 | DEG6 | Q9C691 | 219 | PDia | n.a. | AtDeg6
At3g03380 | DEG7 | Q8RY22 | 1097 | PD-PDZ-PDZ-PDia-PDZ-PDZ | Cre03.g180650, Os02g0712000, Pp1s237_5V6, Pp1s21_327V6, POPTR_0017s03050, POPTR_0004s08740, POPTR_0004s08720 | AtDeg7
At5g39830 | DEG8 | Q9LU10 | 448 | PD-PDZ | Cre01.g028350, Os04g0459900, Pp1s31_50V6, POPTR_0004s13440 | AtDeg8
At5g40200 | DEG9 | Q9FL12 | 592 | PD-PDZ-PDZ | Cre19.g752200, Os02g0742500, Os06g0234100, Pp1s176_87V6, Pp1s1_203V6, POPTR_0015s08440, POPTR_0004s13440 | AtDeg9
At5g36950 | DEG10 | Q9FIV6 | 586 | PD-PDZ-PDZ | Cre14.g617600, Cre01.g013300, Os05g0417100, Pp1s55_7V5.1, POPTR_0008s07940 | AtDeg10
At3g16540 | DEG11 | Q9LK71 | 555 | PD-PDZ-PDZ | Deg10 Subgroup | AtDeg11
At3g16550 | DEG12 | Q9LK70 | 499 | PD-PDZ-PDZ | Deg10 Subgroup | AtDeg12
At5g40560 | DEG13 | Q9FM41 | 486 | PD-PDZ-PDZ | Deg10 Subgroup | AtDeg13
At5g27660 | DEG14 | Q3E6S8 | 429 | PD-PDZ | Os11g0246600, Pp1s180_15V6, POPTR_0013s01900 | AtDeg14
At1g28320 | DEG15 | Q8VZD4 | 709 | NT-PD | Cre12.g548200, Os05g0497700, Pp1s196_28V6, POPTR_0004s04650, POPTR_0011s05510 | AtDeg15
At5g54745 | DEG16 | Q3E8B4 | 198 | PDia | n.a. | AtDeg16

a According to TAIR database.
b According to [14].
c If more than one protein entry was present, the different versions were analyzed by the HHpred platform (http://toolkit.tuebingen.mpg.de/hhpred/), and the one with intact protease domain and (if present) PDZ domain(s) was considered here. Sequences used in this study are supplied as Supplementary material (Additional file 1).
d According to the HHpred platform.
Abbreviations: aa, amino acids; n.a., not available; NT, elongated N-terminus; PD, potentially active protease domain; PDia, inactive protease domain (i.e. at least one residue of the catalytic triad is mutated or missing); PDZ, PDZ domain.

Table 2 The family of Deg/HtrA proteases in Populus trichocarpa

Gene model (a) | Protein name (b) | UniProtKB acc. no. (c) | aa | Domain arrangement (d) | Orthologs in other plants (this study) | Proposed protein name
POPTR_0001s34960, Pt706718 | PtDeg1 | A9PI52 | 429 | PD-PDZ | At3g27925, Cre02.g088400, Cre14.g630550, Cre12.g498500, Os05g0568900, Pp1s160_79V6, Pp1s198_100V6 | PtDeg1
POPTR_0014s12970, Pt572750 | PtDeg2.1 | B9I9X1 | 592 | PD-PDZ-PDZ | At2g47940, Cre19.g752200, Os05g0147500, Pp1s8_140V6 | PtDeg2.1
POPTR_0020s00220, Pt775566 | PtDeg2.2 | B9IBU0 | 624 | PD-PDZ-PDZ | At2g47940, Cre19.g752200, Os05g0147500, Pp1s8_140V6 | PtDeg2.2
POPTR_0011s02330, Pt771291 | PtDeg5.1 | B9HYW4 | 316 | PD | At4g18370, Cre02.g110600, Os12g0616600, Pp1s63_95V6 | PtDeg5
POPTR_0017s03050, Pt816849 | PtDeg7.1 | B9GV35 | 1128 | PD-PDZ-PDZ-PDia-PDZ-PDZ | At3g03380, Cre03.g180650, Os02g0712000, Pp1s237_5V6, Pp1s21_327V6 | PtDeg7.1
POPTR_0004s08740, Pt555951 | PtDeg7.2 | B9H390 | 1080 | PD-PDZ-PDZ-PDia-PDZ-PDZ | At3g03380, Cre03.g180650, Os02g0712000, Pp1s237_5V6, Pp1s21_327V6 | PtDeg7.2
POPTR_0004s08720, Pt714140 | PtDeg7.3 | B9H391 | 1117 | PD-PDZ-PDZ-PDia-PDZ-PDZ | At3g03380, Cre03.g180650, Os02g0712000, Pp1s237_5V6, Pp1s21_327V6 | PtDeg7.3
POPTR_0004s13440, Pt199267 | PtDeg8 | B9H3X7 | 465 | PD-PDZ | At5g39830, Cre01.g028350, Os04g0459900, Pp1s31_50V6 | PtDeg8
POPTR_0015s08440, Pt251989 | PtDeg9.1 | B9IEN8 | 556 | PD-PDZ-PDZ | At5g40200, Cre19.g752200, Os02g0742500, Os06g0234100, Pp1s176_87V6, Pp1s1_203V6 | PtDeg9.1
POPTR_0012s07930, Pt728836/Pt823359 | PtDeg9.2 | B9I375 | 559 | PD-PDZ-PDZ | At5g40200, Cre19.g752200, Os02g0742500, Os06g0234100, Pp1s176_87V6, Pp1s1_203V6 | PtDeg9.2
POPTR_0008s07940 | - | B9HI10 | 587 | PD-PDZ-PDZ | At5g36950, Cre01.g013300, Cre14.g617600, Os05g0417100, Pp1s55_7V5.1 | PtDeg10
POPTR_0013s01900, Pt662713/Pt662714 | PtDeg14.1, PtDeg14.2 | B9I7J6 (partial) | 422 | PD-PDZ | At5g27660, Os11g0246600, Pp1s180_15V6 | PtDeg14
POPTR_0004s04650, Pt555773 | PtDeg15.1 | B9H2S3 | 752 | NT-PD | At1g28320, Cre12.g548200, Os05g0497700, Pp1s196_28V6 | PtDeg15.1

For the poplar tree P. trichocarpa, 20 deg/htrA genes were identified in an initial survey [16]. However, only 17 of those genes could be confirmed by this work (Table 2, columns 1–3). The discrepancy between the two studies is due to improved gene models provided by the more recent release of the P. trichocarpa Phytozome 7.0 database (http://www.phytozome.net).
Previously described PtDeg5.2, PtDeg10.1 and PtDeg10.2 (gene models Pt792125, Pt430673 and Pt567140, respectively) [16] are obsolete, while PtDeg14.1 and PtDeg14.2 (Pt662713 and Pt662714, respectively) are parts of a single open reading frame (ORF), designated as POPTR_0013s01900 (Table 2, columns 1–3). Additionally, a new gene model, similar to the former Pt430673 (PtDeg10.1), was identified (POPTR_0008s07940). Therefore, the genome of P. trichocarpa contains fewer deg/htrA genes than described before.

The 15 deg/htrA protease genes that were reported earlier for O. sativa [15] were confirmed in this study (Table 3, columns 1–3). However, the protease previously reported as OsDegP4 (LOC_Os03g62900) was only found in the MSU Rice Genome Annotation Project Database [27], but not in the Rice Annotation Database [26], and an additional potential OsDeg protease was identified (Os03g0608600/LOC_Os03g41170) by BLAST search and homology prediction (Table 3, columns 1–3). Both proteases lack recognizable PDZ domains. The protein Os02g0712000 (LOC_Os02g48180), originally named OsDegP2, possesses a similar domain arrangement to AtDeg7, since it contains two protease domains (a putatively active one and a second with mutated catalytic triad residues) and four PDZ domains (Table 3, column 5). Proteins Os01g0278600 (OsDegP1, LOC_Os01g17070), Os08g0144400 (OsDegP11, LOC_Os08g04920), and Os12g0141600 (OsDegP14, LOC_Os12g04750) appear to be proteolytically inactive due to mutated active site residues, with the latter containing two inactive protease domains and lacking a PDZ domain (Table 3, column 5, and Additional file 1).

Seventeen genes encoding Deg/HtrA proteins are present in the genome of the moss P. patens (Table 4, columns 1 and 2).
Two of these proteins, Pp1s176_111V6 and Pp1s67_44V6, have mutated active site residues in their protease domain and are predicted to be proteolytically inactive (see Additional file 1 for aa sequences), while Pp1s63_95V6 and Pp1s196_28V6 do not contain any detectable PDZ domain. Two other proteins, Pp1s237_5V6 and Pp1s21_327V6, have, similarly to AtDeg7, a potentially active and an inactive protease domain (Table 4, column 4).

In the genome of C. reinhardtii, 15 deg/htrA genes were identified (Table 5, columns 1–3). Three of these genes, Cre38.g785300, Cre03.g203700, and Cre13.g579900.t1, encode proteolytically inactive enzymes, since at least one residue of the catalytic triad is missing in each of these proteins (Table 5, column 5; see Additional file 1 for aa sequences). Cre19.g752200 contains, in addition to a Deg/HtrA protease domain, a beta-glycanhydrolase domain in the same ORF, but at present it is not clear whether this constitutes a new type of domain combination or is the result of an erroneous gene annotation. During the analysis of the Deg/HtrA sequences from C. reinhardtii, the occurrence of long (i.e. 10–20 aa) single aa repeats reduced the quality of sequence alignments and hints at a general problem with the assembly of the C. reinhardtii genome. Therefore, the number of Deg/HtrA proteases might change with future genome database updates, similar to the situation in P. trichocarpa.

As mentioned earlier, the number of Deg/HtrA proteases present in non-plant organisms is much lower. A general trend towards an increased number of protein family members in plants has also been observed for other serine protease families [31]. However, the reasons for this phenomenon remain elusive.
Compared to other organisms, plants have acquired an additional, highly structured and complex compartment, the chloroplast, and perform oxygenic photosynthesis, a process that is connected to the generation of reactive oxygen species. It is tempting to speculate that this might contribute to an increased need for proteolytic capabilities, and therefore higher protease numbers. On the other hand, although land plants are sessile and therefore cannot escape from stress conditions, the high number of genes encoding Deg/HtrA proteases is unlikely to reflect an adaptation to this life style, since the motile green alga C. reinhardtii possesses a comparable number of Deg/HtrA-encoding genes.

Table 2 The family of Deg/HtrA proteases in Populus trichocarpa (Continued)

Gene model (a) | Protein name (b) | UniProtKB acc. no. (c) | aa | Domain arrangement (d) | Orthologs in other plants (this study) | Proposed protein name
POPTR_0011s05510, Pt266544 | PtDeg15.2 | B9N3H9 | 729 | NT-PD | At1g28320, Cre12.g548200, Os05g0497700, Pp1s196_28V6 | PtDeg15.2
POPTR_0018s04140, Pt787034 | PtDeg17.1 | B9NA38 | 356 | PDia-PDZ | n.a. | PtDeg17.1
POPTR_0394s00220, Pt586371 | PtDeg17.2 | B9NA39 (fragment) | 298 | PDia-PDZ | n.a. | PtDeg17.2
POPTR_0018s04150, Pt577788 | PtDeg17.3 | B9INA2 | 364 | PDia-PDZ | n.a. | PtDeg17.3

a First model identifier is from Phytozome v7.0 (http://www.phytozome.net), the second identifier is the corresponding identifier according to [16]. Discrepancies between the suggested gene model and the UniProtKB entry were resolved by analyzing the EST data (if present) and the genomic sequence for the presence of ORFs yielding aa sequences similar to orthologous or paralogous proteins, with respect to potential splicing sites.
b According to [16].
c If more than one protein entry was present, the different versions were analyzed by the HHpred platform (http://toolkit.tuebingen.mpg.de/hhpred/), and the one with intact protease domain and (if present) PDZ domain(s) was considered here. Sequences used in this study are supplied as Supplementary material (Additional file 1).
d According to the HHpred platform.
Abbreviations: aa, amino acids; n.a., not available; NT, elongated N-terminus; PD, potentially active protease domain; PD(1/2), truncated protease domain, probably proteolytically inactive; PDia, inactive protease domain (i.e. at least one residue of the catalytic triad is mutated, or protease domain is incomplete); PDZ, PDZ domain.

Table 3 The family of Deg/HtrA proteases in Oryza sativa

Gene model (a) | Previous protein name (b) | UniProtKB acc. no. (c) | aa | Domain arrangement (d) | Orthologs in other plants (this study) | Proposed protein name
Os01g0278600, LOC_Os01g17070 | Os01g0278600, OsDegP1 | Q5NBK7 | 470 | PDia-PDZ | n.a. | OsDeg-like 1
Os02g0712000, LOC_Os02g48180 | Os02g0712000, OsDegP2 | Q6ZIR2/B9F2C1 | 1092 (e) | PD-PDZ-PDZ-PDia-PDZ-PDZ | At3g03380, Cre03.g180650, Pp1s237_5V6, Pp1s21_327V6, POPTR_0017s03050, POPTR_0004s08740, POPTR_0004s08720 | OsDeg7
Os02g0742500, LOC_Os02g50880 | Os02g0742500, OsDegP3 | Q6Z806 | 567 | PD-PDZ-PDZ | At5g40200, Cre19.g752200, Pp1s176_87V6, Pp1s1_203V6, POPTR_0015s08440, POPTR_0004s13440 | OsDeg9.1
-, LOC_Os03g62900 | -, OsDegP4 | Q84SQ1 | 299 | PD | n.a. – not a Deg? | OsDeg-like 6
Os04g0459900, LOC_Os04g38640 | Os04g0459900, OsDegP5 | B7EBF9 | 445 | PD-PDZ | At5g39830, Cre01.g028350, Pp1s31_50V6, POPTR_0004s13440 | OsDeg8
Os05g0147500, LOC_Os05g05480 | Os05g0147500, OsDegP6 | Q6ASR0 | 596 | PD-PDZ-PDZ | At2g47940, Cre19.g752200, Pp1s8_140V6, POPTR_0014s12970, POPTR_0020s00220 | OsDeg2
Os05g0417100, LOC_Os05g34460 | Os05g0417100, OsDegP7 | Q6AT72 | 614 | PD-PDZ-PDZ | At5g36950, Cre01.g013300, Cre14.g617600, Pp1s55_7V5.1, POPTR_0008s07940 | OsDeg10
Os05g0497700, LOC_Os05g41810 | Os05g0497700, OsDegP8 | Q0DH14 | 722 (e) | NT-PD | At1g28320, Cre12.g548200, Pp1s196_28V6, POPTR_0004s04650, POPTR_0011s05510 | OsDeg15
Os05g0568900, LOC_Os05g49380 | Os05g0568900, OsDegP9 | Q6AUN5 | 437 | PD-PDZ | At3g27925, Cre02.g088400, Cre14.g630550, Cre12.g498500, Pp1s160_79V6, Pp1s198_100V6, POPTR_0001s34960 | OsDeg1
Os06g0234100, LOC_Os06g12780 | Os06g0234100, OsDegP10 | Q67VA4 | 628 | PD-PDZ-PDZ | At5g40200, Cre19.g752200, Pp1s176_87V6, Pp1s1_203V6, POPTR_0015s08440, POPTR_0004s13440 | OsDeg9.2
Os08g0144400, LOC_Os08g04920 | Os08g0144400, OsDegP11 | Q7EYD8 | 496 | NT-PDia-PDZ (f) | n.a. | OsDeg-like 2
Os11g0246600, LOC_Os11g14170 | Os11g0246600, OsDegP12 | Q0ITK5 | 472 (e) | PD-PDZ | At5g27660, Pp1s180_15V6, POPTR_0013s01900 | OsDeg14
Os12g0141500, LOC_Os12g04740 | Os12g0141500, OsDegP13 | Q2QXV8 | 228 | PD | n.a. – not a Deg? | OsDeg-like 3
Os12g0141600, LOC_Os12g04750 | Os12g0141600, OsDegP14 | Q2QXV6 | 593 | PDia-PDia | n.a. | OsDeg-like 4
Os12g0616600, LOC_Os12g42210 | Os12g0616600, OsDegP15 | Q2QM57 | 313 | PD | At4g18370, Cre02.g110600, Pp1s63_95V6, POPTR_0011s02330 | OsDeg5

Phylogenetic analysis of “green” Deg/HtrA proteases – proposal of a standardized nomenclature

To establish a nomenclature system based on homologies, we next examined the evolutionary relationship of the Deg/HtrA proteases retrieved from the database searches. The aa sequences of protease domains containing an intact catalytic triad as identified by the sequence alignment were phylogenetically analyzed using the maximum likelihood (ML) method. Proteases HtrA [UniProt: P73354], HhoA [UniProt: P72780], and HhoB [UniProt: P73940] from the cyanobacterium Synechocystis sp. PCC6803 [32] were included in the tree for comparison, due to the cyanobacterial origin of chloroplasts [33]. As the focus of this study is on green plants, no sequences from other photosynthetic eukaryotes (e.g. red algae, diatoms) were included. Proteins lacking the catalytic triad or with an incomplete protease domain (Tables 1, 2, 3, 4, 5) were not included in this analysis to avoid misleading positions in the resulting phylogenetic tree. The presence of such inactive protease variants in plant genomes suggests that they might have acquired roles other than proteolysis, resulting in altered evolutionary pressure on the protease domain and the potential for higher mutation rates.

Initial phylogenetic analysis showed that four proteins, namely LOC_Os03g62900, Os12g0141500 (LOC_Os12g04740) and Os03g0608600 (LOC_Os03g41170) from O.
sativa and Cre07.g332050 from C. reinhardtii (Tables 3 and 5) did not cluster with any other analyzed Deg/HtrA protease and seemed to be only distant relatives of this protease family (see Additional file 2 for the respective ML tree). Hence these proteases were excluded from the further analysis for clarity (see Additional file 3 for final input data).

The Deg/HtrA proteases investigated here form four distinct clades (Figure 1; see Additional file 4 for a tree containing the original gene model names), similar to an earlier study that included Deg/HtrA proteases from evolutionarily very distant taxa and only a few plant orthologs [7]. Clade I is further split into two subgroups, where subgroup IA includes orthologs of Deg1, Deg5 and Deg8 (Figure 1, Additional file 4). Subgroup IB comprises the prokaryotic (cyanobacterial) Deg/HtrA proteases, and one protease each from the land plants A. thaliana (AtDeg14, Table 1), P. trichocarpa (PtDeg14, Table 2), O. sativa (OsDeg14, originally called OsDegP12, Table 3) and P. patens (PpDeg14, Table 4). Notably, the Deg14 protease is missing in the green alga C. reinhardtii (Table 5).

PpDeg1-group-like (Pp1s152_166V5.1), which passed all validation procedures as described above and in the 'Methods' section, seems to be more distantly related to Deg/HtrA proteases from groups IA and IB (Figure 1). Based on its position in the tree and the comparatively low bootstrap support, it was not possible to decide whether it can be included in subgroup IA or IB. Alternatively, the gene model and the respective protein sequence might require improvement. Clade II includes AtDeg2-AtDeg4 and AtDeg9-AtDeg13 and their orthologs (Figure 1, Additional file 4).
Clades III and IV gather AtDeg7 and AtDeg15 and their orthologs, respectively (Figure 1, Additional file 4).

Based on the phylogenetic tree, we grouped all orthologous Deg/HtrA proteases from the analyzed plant species and propose a common name for enzymes from the same group in order to unify the nomenclature between different plant species (Tables 1, 2, 3, 4, 5, last two columns). Since the majority of detailed studies on plant Deg/HtrA proteases focused on A. thaliana enzymes, we used their well-established nomenclature [14,28] as a guideline for renaming Deg/HtrA orthologs in the other organisms analyzed here (Tables 2, 3, 4, 5, last columns).

In P. trichocarpa, we renamed PtDeg5.1 (Pt771291) to PtDeg5 since only one isoform of this protein is present in this organism, and combined PtDeg14.1 (Pt662713) and PtDeg14.2 (Pt662714), encoded by the same ORF (see above), under the common name PtDeg14 (Table 2). A new gene model (POPTR_0008s07940) similar to AtDeg10 was named PtDeg10.

Table 3 The family of Deg/HtrA proteases in Oryza sativa (Continued)

Gene model (a) | Previous protein name (b) | UniProtKB acc. no. (c) | aa | Domain arrangement (d) | Orthologs in other plants (this study) | Proposed protein name
Os03g0608600, LOC_Os03g41170 | Os03g0608600, expr. protein | Q75HK9 | 271 | PD | n.a. – not a Deg? | OsDeg-like 5

a First model identifier from the Rice Annotation Project (Build5), second identifier according to the TIGR/MSU nomenclature (Osa1 Release 6.1).
b First name according to GenBank/UniProtKB, second identifier according to the TIGR/MSU nomenclature.
c If more than one protein entry was present, the different versions were analyzed by the HHpred platform (http://toolkit.tuebingen.mpg.de/hhpred/), and the one with intact protease domain and (if present) PDZ domain(s) was considered here. Sequences used in this study are supplied as Supplementary material (Additional file 1).
d According to the HHpred platform.
e Sequence was modified based on the EST data (http://compbio.dfci.harvard.edu/tgi/plant.html) and comparison with orthologs from other species.
f The HHpred platform detects secondary structures similar to RNA polymerase II large subunit from Saccharomyces cerevisiae in the N-terminal part of the protein – this is an indication that the predicted transcription start is incorrectly annotated.
Abbreviations: aa, amino acids; n.a., not available; NT, elongated N-terminus; PD, potentially active protease domain; PDia, inactive protease domain (i.e. at least one residue of the catalytic triad is mutated); PDZ, PDZ domain.

Table 4 The family of Deg/HtrA proteases in Physcomitrella patens

Gene model (a) | UniProtKB acc. no. (b) | aa | Domain arrangement (c) | Orthologs in other plants (this study) | Proposed protein name
Pp1s160_79V6 | A9T3R3 | 500 | PD-PDZ | At3g27925, Cre02.g088400, Cre14.g630550, Cre12.g498500, Os05g0568900, POPTR_0001s34960 | PpDeg1.1
Pp1s198_100V6 | A9TBD2 | 475 | PD-PDZ | At3g27925, Cre02.g088400, Cre14.g630550, Cre12.g498500, Os05g0568900, POPTR_0001s34960 | PpDeg1.2
Pp1s79_92V6 | A9SHE2 | 501 | PD-PDZ | At3g27925, Cre02.g088400, Cre14.g630550, Cre12.g498500, Os05g0568900, POPTR_0001s34960 | PpDeg1.3
Pp1s21_138V6 | A9RQ01 | 486 | PD-PDZ | At3g27925, Cre02.g088400, Cre14.g630550, Cre12.g498500, Os05g0568900, POPTR_0001s34960 | PpDeg1.4
Pp1s8_140V6 | A9RGN6 | 618 | PD-PDZ-PDZ | At2g47940, Cre19.g752200, Os05g0147500, POPTR_0014s12970, POPTR_0020s00220 | PpDeg2
Pp1s63_95V6 | A9SBN1 | 362 | PD | At4g18370, Cre02.g110600, Os12g0616600, POPTR_0011s02330 | PpDeg5
Pp1s237_5V6 | A9TIB2 | 1076 | PD-PDZ-PDZ-PDia-PDZ-PDZ | At3g03380, Cre03.g180650, Os02g0712000, POPTR_0017s03050, POPTR_0004s08740, POPTR_0004s08720 | PpDeg7.1
Pp1s21_327V6 | A9RQ61 | 1072 | PD-PDZ-PDZ-PDia-PDZ-PDZ | At3g03380, Cre03.g180650, Os02g0712000, POPTR_0017s03050, POPTR_0004s08740, POPTR_0004s08720 | PpDeg7.2
Pp1s31_50V6 | A9RVV4 | 493 | PD-PDZ | At5g39830, Cre01.g028350, Os04g0459900, POPTR_0004s13440 | PpDeg8
Pp1s176_87V6 | A9T734 | 612 | PD-PDZ-PDZ | At5g40200, Cre19.g752200, Os02g0742500, Os06g0234100, POPTR_0015s08440, POPTR_0004s13440 | PpDeg9.1
Pp1s1_203V6 | A9RB23 | 540 | PD-PDZ | At5g40200, Cre19.g752200, Os02g0742500, Os06g0234100, POPTR_0015s08440, POPTR_0004s13440 | PpDeg9.2

For Deg/HtrA proteases from O. sativa, we propose to change the existing nomenclature present in the TIGR/MSU database [27], and we also provide preliminary new names for the more distantly related Deg/HtrA-like proteases or proteins without an intact protease domain (Table 3). For these proteins, we suggest using the names “OsDeg-like 1” to “OsDeg-like 6”, in order to prevent confusion between e.g. OsDeg1 (Os05g0568900, LOC_Os05g49380) and the more distantly related protein formerly known as “OsDegP1”, now OsDeg-like 1 (Os01g0278600, LOC_Os01g17070) (Table 3). Since no names were given for annotated Deg/HtrA proteases in P. patens, we propose to name them based on phylogeny as suggested in Table 4 (last column).

For C. reinhardtii, the proposed nomenclature of Deg/HtrA proteases partially matches that present in the Phytozome 7.0 and UniProt databases (Table 5). However, we suggest changing the names of Deg1 (Cre02.g088400), Deg11 (Cre12.g498500) and Deg13 (Cre14.g630550) to CrDeg1.1, CrDeg1.2, and CrDeg1.3 (Table 5) since all three proteases are more closely related to AtDeg1 than to AtDeg11 or AtDeg13 (Figure 1, Additional file 4). For Cre19.g752200, we propose the name CrDeg9.1, since its protease domain seems to be evolutionarily related to AtDeg9, although the domain arrangement of this protease (it contains a beta-glycanhydrolase domain in the C-terminal half of the protein) is unusual for these enzymes (Table 5). The protease domain of Cre14.g617600, described as Deg9 in both the Phytozome 7.0 and UniProt databases, seems to be more closely related to those of Deg10 proteases, but the bootstrap support is insufficient to justify its renaming. For this reason we suggest the name CrDeg9.2 for this protein (Table 5).
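The naming pattern visible in the proposed names (a species prefix, “Deg”, the number of the closest A. thaliana ortholog, and a “.n” suffix only when a species has several paralogs in that group) can be sketched in a few lines of Python. This is an illustration rather than the procedure used in the study: the function `propose_names` and the prefix table are hypothetical, and numbering paralogs in input order is an assumption, since the text does not state how the suffixes were assigned.

```python
# Illustrative sketch only: propose_names and SPECIES_PREFIX are hypothetical.
# Paralogs are numbered in input order, which is an assumption.

SPECIES_PREFIX = {
    "Arabidopsis thaliana": "At",
    "Populus trichocarpa": "Pt",
    "Oryza sativa": "Os",
    "Physcomitrella patens": "Pp",
    "Chlamydomonas reinhardtii": "Cr",
}


def propose_names(species, ortholog_number, gene_models):
    """Map each gene model of one ortholog group to a proposed protein name.

    A single member gets the plain name (e.g. PtDeg5); several members get
    numbered suffixes (e.g. PtDeg7.1, PtDeg7.2, PtDeg7.3).
    """
    base = f"{SPECIES_PREFIX[species]}Deg{ortholog_number}"
    if len(gene_models) == 1:
        return {gene_models[0]: base}
    return {
        model: f"{base}.{index}"
        for index, model in enumerate(gene_models, start=1)
    }


# One poplar ortholog of AtDeg5: no suffix is needed.
print(propose_names("Populus trichocarpa", 5, ["POPTR_0011s02330"]))

# Three poplar orthologs of AtDeg7: numbered suffixes.
print(propose_names(
    "Populus trichocarpa", 7,
    ["POPTR_0017s03050", "POPTR_0004s08740", "POPTR_0004s08720"],
))
```

With the gene models listed in this order, the sketch reproduces the PtDeg5 and PtDeg7.1 to PtDeg7.3 entries of Table 2. It does not cover the “Deg-like” names given to distantly related or inactive proteins.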
A new gene model, Cre12.g548200, was named CrDeg15 (Table 5), since its protease domain was most closely related to that of AtDeg15 (Figure 1, Additional file 4).

Analysis of domain arrangement supports proposed nomenclature

Analysis of the protein aa sequences with the HHpred platform yielded predictions for the number and arrangement of protease and PDZ domains in each Deg/HtrA protease (Figure 1 and Tables 1, 2, 3 and 5, column 5; Table 4, column 4). These data support the presence of four major Deg/HtrA clades (Figure 1), as reported before [7]. Proteases from clade I contain one protease domain and one PDZ domain (with the exception of all Deg5 orthologs, where the PDZ domain is missing), whereas proteases from clade II contain one protease domain and two PDZ domains (Figure 1). Clades III and IV contain Deg/HtrA proteases with non-canonical domain arrangements. Clade III consists of very large proteins (approximately 1,000 aa), which are predicted to contain one active and one inactive protease domain and four PDZ domains (Figure 1). Recently, it was shown that the inactive protease domain in AtDeg7 is involved in trimerization of this enzyme [30]. Whether this holds true for other Deg7 orthologs remains to be examined. Proteins from clade IV do not contain any detectable PDZ domain, and their protease domain is shifted towards the C-terminus (Figure 1). Since this domain arrangement is unusual for Deg/HtrA proteases [6], proteins from this group are sometimes not classified as members of this family, e.g. the mammalian ortholog of plant Deg15, called Tysnd1 [10]. However, due to the presence of a Deg/HtrA protease domain, we classified Deg15 orthologs as Deg/HtrA family members (Tables 1, 2, 3, 4, 5).

Although the phylogenetic tree and, as a consequence, the standardized protease nomenclature are built on the aa sequences of the protease domains alone, they are supported by the analysis of the domain arrangements, which uses the aa sequence of the full-length protein. All proteases share the same domain arrangement with their nearest ortholog, e.g. all Deg1 proteins from the five analyzed organisms possess one PDZ domain, all Deg5 proteins contain none, and all Deg7 proteins contain two protease and four PDZ domains (Tables 1, 2, 3 and 5, column 5; Table 4, column 4).

Table 4 The family of Deg/HtrA proteases in Physcomitrella patens (Continued)

| Gene model (a) | UniProtKB acc. no. (b) | aa | Domain arrangement (c) | Orthologs in other plants (this study) | Proposed protein name |
| Pp1s55_7V5 | | 651 | PD-PDZ-PDZ | At5g36950, Cre01.g013300, Cre14.g617600, Os05g0417100, POPTR_0008s07940 | PpDeg10 |
| Pp1s180_15V6 | A9T7W1 | 473 | PD-PDZ | At5g27660, Os11g0246600, POPTR_0013s01900 | PpDeg14 |
| Pp1s196_28V6 | A9TAV2 | 784 | NT-PD | At1g28320, Cre12.g548200, Os05g0497700, POPTR_0004s04650, POPTR_0011s05510 | PpDeg15 |
| Pp1s152_166V5.1 | | 339 (d) | PD-PDZ | Group 1a | PpDeg1-group-like |
| Pp1s176_111V6 | | 527 | PDia-PDZ | n.a. | |
| Pp1s67_44V6 | A9SD45 | 408 | PDia-PDZ | n.a. | |

(a) Model identifier according to Phytozome v7.0 (http://www.phytozome.net). Discrepancies between the suggested gene model and the UniProtKB entry were resolved by analyzing EST data (if present) and by analyzing the genomic sequence for the presence of ORFs yielding aa sequences similar to ortholog and paralog proteins, with respect to potential splicing sites.
(b) If more than one protein entry was present, the different versions were analyzed by the HHpred platform (http://toolkit.tuebingen.mpg.de/hhpred/), and the one with an intact protease domain and (if present) PDZ domain(s) was considered here. Sequences used in this study are supplied as Supplementary material (Additional file 1).
(c) According to the HHpred platform. Abbreviations: aa, amino acids; n.a., not available; NT, elongated N-terminus; PD, potentially active protease domain; PD(1/2), truncated protease domain, probably proteolytically inactive; PDia, inactive protease domain (i.e. at least one residue of the catalytic triad is mutated, or the protease domain is incomplete); PDZ, PDZ domain.
(d) Fragment extended based on the EST data (asmbl_4603.p5physco4 from Phytozome 5.0, TC42496 in DFCI, http://compbio.dfci.harvard.edu/cgi-bin/tgi/tc_report.pl?tc=TC42496&species=moss).

Table 5 The family of Deg/HtrA proteases in Chlamydomonas reinhardtii

| Gene model (a) | Protein name, Phytozome / UniProtKB (b) | UniProtKB acc. no. (c) | aa | Domain arrangement (d) | Orthologs in other plants (this study) | Proposed protein name |
| Cre02.g088400 | Deg1 / Deg1A | A8I8X2 | 530 | PD-PDZ | At3g27925, Os05g0568900, Pp1s160_80V2.1, Pp1s198_95V2.1, POPTR_0001s34960 | CrDeg1.1 |
| Cre14.g630550 | Deg13 / - | - | 555 | PD-PDZ | At3g27925, Os05g0568900, Pp1s160_80V2.1, Pp1s198_95V2.1, POPTR_0001s34960 | CrDeg1.2 |
| Cre12.g498500 | Deg11 / - | - | 462 | PD-PDZ | At3g27925, Os05g0568900, Pp1s160_80V2.1, Pp1s198_95V2.1, POPTR_0001s34960 | CrDeg1.3 |
| Cre02.g092000 | Deg2 / Deg2 | A8I9B8 | 656 | PD-PDZ-PDZ | Deg2 Group | CrDeg2 |
| Cre02.g110600 | Deg5 / Deg5 | A8I3D5 | 356 | PD | At4g18370, Os12g0616600, Pp1s63_93V2.1, POPTR_0011s02330 | CrDeg5 |
| Cre03.g180650 | Deg7 / Deg7 | A8JH35 | 1108 | PD-PDZ-PDZ-PDia-PDZ-PDZ | At3g03380, Os02g0712000, Pp1s237_5V2.1, Pp1s21_312V2.1, POPTR_0017s03050, POPTR_0004s08740, POPTR_0004s08720 | CrDeg7 |
| Cre01.g028350 | Deg8 / Deg8 | A8HQB3 | 436 | PD-PDZ | At5g39830, Os04g0459900, Pp1s31_48V2.1, POPTR_0004s13440 | CrDeg8 |
| Cre19.g752200 (e) | - / - | A8JBP6 | 1353 | PD-betaglycan-hydrolase | At5g40200, Os02g0742500, Os06g0234100, Pp1s176_79V2.1, Pp1s1_200V2.1, POPTR_0015s08440, POPTR_0004s13440, At2g47940, Os05g0147500, Pp1s8_145V2.1, POPTR_0014s12970, POPTR_0020s00220 | CrDeg9.1 |
| Cre14.g617600 | Deg9 / Deg9 | A8HNV3 | 619 | PD-PDZ-PDZ | At5g36950, Os05g0417100, Pp1s55_7V5.1, POPTR_0008s07940 | CrDeg9.2 |
| Cre01.g013300 | Deg10 / - | - | 739 | PD-PDZ-PDZ | At5g36950, Os05g0417100, Pp1s55_7V5.1, POPTR_0008s07940 | CrDeg10 |
| Cre12.g548200 | - / - | A8IYE3 (fragment) | 1249 | NT-PD | At1g28320, Os05g0497700, Pp1s196_28V2.1, POPTR_0004s04650, POPTR_0011s05510 | CrDeg15 |
| Cre07.g332050 | - / - | A8IGX3 (fragment) | 284 | PD | n.a. | - not a Deg? |
| Cre13.g579900 | - / - | - | 415 | PDia-PDZ-PDZ | n.a. | |
| Cre03.g203730 | - / CrDegO | A8IXF5 | 789 | PDia-PDZ | n.a. | |
| Cre38.g785300 | - | A8JG98 | 319 | PDia | n.a. | |

(a) According to the Phytozome v7.0 database (http://www.phytozome.net/).
(b) First name according to the Phytozome v7.0 database, second name according to UniProtKB (http://www.uniprot.org/).
(c) If more than one protein entry was present, the different versions were analyzed by the HHpred platform (http://toolkit.tuebingen.mpg.de/hhpred/), and the one with an intact protease domain and (if present) PDZ domain(s) was considered here. Sequences used in this study are supplied as Supplementary material (Additional file 1).
(d) According to the HHpred platform. Abbreviations: aa, amino acids; n.a., not available; NT, elongated N-terminus; PD, potentially active protease domain; PD(1/2), truncated protease domain, probably proteolytically inactive; PDia, inactive protease domain (i.e. at least one residue of the catalytic triad is mutated, or the protease domain is incomplete); PDZ, PDZ domain.
(e) Model is probably not correct: it is not supported by EST data and contains repetitive stretches of single amino acids.

[Figure 1: tree image not reproduced. Groups are labeled I (with subgroups IA, IB and the Deg5 group), II, III and IV; scale bar 0.5.]

Figure 1 Maximum likelihood phylogenetic tree of Deg/HtrA proteases in selected plant species. The following species were investigated: Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, Physcomitrella patens, Chlamydomonas reinhardtii, and the cyanobacterium Synechocystis sp. PCC6803. The phylogenetic tree is labeled with the new names suggested by this study. Filled circles indicate a bootstrap support (100 replicates) of > 90%, empty circles indicate a bootstrap support of > 70%. Additionally, the domain arrangement representative for proteases from each group is indicated. Deg/HtrA proteases from clade I contain one protease domain (oval shapes) and one PDZ domain (diamonds), with the exception of Deg5 proteases, which possess a protease domain only. Proteases from clade II contain an additional PDZ domain, clade III gathers proteases with one active (oval shape) and one inactive (discontinuous oval shape) protease domain and four PDZ domains, whereas enzymes from clade IV contain a single protease domain, which is shifted toward the C-terminus.

A “core set” of Deg/HtrA proteases in plants

All organisms examined here contain between 15 and 17 deg/htrA-encoding genes, whereas the number of potentially active enzymes is slightly lower. Although the total number of Deg/HtrA proteases is similar in all plants analyzed in this study, the distribution of the proteases within the phylogenetic tree (Figure 1) differs for each species.

In the genome of P. trichocarpa, several genes for Deg/HtrA protease isoforms exist (e.g. PtDeg2.1 and PtDeg2.2, PtDeg7.1-7.3, PtDeg9.1 and PtDeg9.2, and PtDeg15.1 and PtDeg15.2; Figure 1 and Table 2), and this is probably the result of a whole genome duplication [34]. A similar large-scale duplication event [35] could explain the presence of duplicated Deg/HtrA protease genes in the genome of P. patens (for PpDeg2, PpDeg9, and PpDeg7, Table 4). In contrast, AtDeg3, AtDeg4, AtDeg11, AtDeg12, and AtDeg13 in A. thaliana seem to be duplicated versions of AtDeg10, since all of them belong to clade II and cluster exclusively with Deg10 proteases from all species investigated here (Figure 1). AtDeg3 (At1g65630) and AtDeg4 (At1g65640), as well as AtDeg11 (At3g16540) and AtDeg12 (At3g16550), are encoded by genes arranged in tandem arrays, indicating individual gene duplication events.

From this collection of Deg/HtrA protease-encoding genes, we extracted the hypothetical minimum number of Deg/HtrA proteases present in plants. This “core set” represents the conserved Deg/HtrA protease types found in every organism examined here, in the lowest possible copy number. For example, the genome of P. trichocarpa contains three Ptdeg7 genes, whereas A. thaliana and O. sativa contain only one; therefore, the “core set” contains one Deg7 protease. For plants, this conserved “core set” consists of eight proteases (Table 6): Deg1, Deg5, and Deg8, detected in the thylakoid lumen [9-17]; Deg2 and Deg7 in the chloroplast stroma [18,21]; Deg9 in the nucleolus [36]; Deg15 in the peroxisome [8]; and Deg10, which is predicted to have a mitochondrial localization [14]. C. reinhardtii, for example, possesses only “core set” proteases as Deg/HtrA enzymes, although some are present in duplicates.
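The extraction of the “core set” described above amounts to keeping each protease type that occurs in every species and recording its lowest copy number. The following Python sketch illustrates this under stated assumptions: the counts are transcribed from Table 6 of this study, a "+" is counted as one copy and a "-" as zero, and the names `COPIES` and `core_set` are hypothetical.

```python
# Copy numbers per species, transcribed from Table 6 in the column order
# At, Pt, Os, Pp, Cr. Assumption: "+" counts as 1 copy, "-" as 0 copies.
SPECIES = ("At", "Pt", "Os", "Pp", "Cr")
COPIES = {
    "Deg1": (1, 1, 1, 4, 3),
    "Deg2": (1, 2, 1, 1, 1),
    "Deg3": (1, 0, 0, 0, 0),
    "Deg4": (1, 0, 0, 0, 0),
    "Deg5": (1, 1, 1, 1, 1),
    "Deg6": (1, 0, 0, 0, 0),
    "Deg7": (1, 3, 1, 2, 1),
    "Deg8": (1, 1, 1, 1, 1),
    "Deg9": (1, 2, 2, 2, 2),
    "Deg10": (1, 1, 1, 1, 1),
    "Deg11": (1, 0, 0, 0, 0),
    "Deg12": (1, 0, 0, 0, 0),
    "Deg13": (1, 0, 0, 0, 0),
    "Deg14": (1, 1, 1, 1, 0),
    "Deg15": (1, 2, 1, 1, 1),
    "Deg16": (1, 0, 0, 0, 0),
    "Deg17": (0, 3, 0, 0, 0),
}


def core_set(copies):
    """Return {protease: minimum copy number} for types present in all species."""
    shared = {
        name: min(counts)
        for name, counts in copies.items()
        if all(count > 0 for count in counts)
    }
    # Sort by the numeric part of the name (Deg1, Deg2, ..., Deg15).
    return dict(sorted(shared.items(), key=lambda item: int(item[0][3:])))


core = core_set(COPIES)
print(len(core), "core proteases:", ", ".join(core))
```

With these counts, Deg14 drops out because it is absent from C. reinhardtii, and Deg17 drops out because it is found only in P. trichocarpa.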
This “core set” seems to provide all the proteolytic potential of Deg/HtrA proteases that is necessary for a hypothetical plant cell.

Table 6 Conservation of Deg/HtrA family members among photosynthetic organisms

| Protease name | At | Pt | Os | Pp | Cr |
| Deg1* | + | + | + | 1.1, 1.2, 1.3, 1.4 | 1.1, 1.2, 1.3 |
| Deg2* | + | 2.1, 2.2 | + | + | + |
| Deg3 | + | - | - | - | - |
| Deg4 | + | - | - | - | - |
| Deg5* | + | + | + | + | + |
| Deg6 | + | - | - | - | - |
| Deg7* | + | 7.1, 7.2, 7.3 | + | 7.1, 7.2 | + |
| Deg8* | + | + | + | + | + |
| Deg9* | + | 9.1, 9.2 | 9.1, 9.2 | 9.1, 9.2 | 9.1, 9.2 |
| Deg10* | + | + | + | + | + |
| Deg11 | + | - | - | - | - |
| Deg12 | + | - | - | - | - |
| Deg13 | + | - | - | - | - |
| Deg14 | + | + | + | + | - |
| Deg15* | + | 15.1, 15.2 | + | + | + |
| Deg16 | + | - | - | - | - |
| Deg17 | - | 17.1, 17.2, 17.3 | - | - | - |

The presence of a protease in a particular organism is indicated by +, its absence by -. If more than one isoform is present, the names are given. Proteases of the “core set” are marked with an asterisk. At, Arabidopsis thaliana; Cr, Chlamydomonas reinhardtii; Os, Oryza sativa; Pp, Physcomitrella patens; Pt, Populus trichocarpa.

Conclusion

In this study, we present the first detailed analysis of the Deg/HtrA protease family in green plants, including genomes from vascular plants, a moss, and a green alga. Based on phylogenetic analysis of the protease domains and analysis of the domain arrangement in the full-length protease, we propose a standardized nomenclature for Deg/HtrA proteases in plants. Although biochemical data are only available for selected proteases from A. thaliana, our data suggest (within the limits of a sequence-only analysis) that proteases with the same name might indeed execute comparable physiological functions. Compared to animals and prokaryotes, the number of Deg/HtrA proteases encoded in plant genomes is much higher, which is partially due to genome or gene duplications. However, the exact reasons are probably different for every organism. A “core set” of eight protease genes was identified for plants, of which at least one copy is present in every genome examined here.
This seems to be the minimum number of Deg/HtrA proteases necessary for plants. We are confident that the work presented here will be a valuable tool and guideline for future research on plant Deg/HtrA proteases and will ease communication between research groups working with different photosynthetic organisms.

Methods

Database research

We performed BLAST searches with a peptide query against translated nucleotide collections (tBLASTn) [37] in the National Center for Biotechnology Information database (NCBI, http://blast.ncbi.nlm.nih.gov/Blast.cgi), the Phytozome 7.0 database at the DOE Joint Genome Institute (http://www.phytozome.net/), and the EST-based gene indices of the TIGR database [38] (http://compbio.dfci.harvard.edu/tgi/), and with a peptide query against the protein database of the UniProt Knowledgebase (http://www.uniprot.org/). AtDeg1-AtDeg16 (see Table 1 for accessions), E. coli DegP (UniProt: E0IYM0) and DegS (UniProt: E0J2L5), and human HtrA2 (UniProt: O43464) were used as query sequences.

Analysis of sequences

The secondary structure of the aa sequences (or the translation products of the DNA sequences) retrieved by the BLAST searches was predicted using the HHpred platform, which uses a library of published crystal structures to detect domains within a given polypeptide [29]. Additionally, aa sequences were aligned with the well-studied aa sequences of AtDeg1-AtDeg16 proteins using M-Coffee [39] to identify parts of the sequences derived from intron sequences in the gene model. If the presence of introns was suspected, EST data (if present) were analyzed to improve the gene model. See Tables 1, 2, 3, 4, 5 for information about specific gene models. If the model was corrected, the improved model was again analyzed by the HHpred platform.
If no Deg/HtrA protease domain was detected, and this was not due to the presence of intron sequences in the gene model, the sequence was rejected for this study.

Alignment of protease domains and phylogenetic analysis

The aa sequences of active protease domains, as detected by the HHpred platform, were aligned using DiALIGN [40], MAFFT [41], and Muscle [42]. From these initial alignments, a consensus alignment was created by resolving discrepancies manually (Additional file 3). Gaps in this alignment were removed manually, and the resulting sequences were subjected to phylogenetic analysis with PhyML 3.0 [43] using the ML method (default settings, except that 100 bootstraps in a nonparametric bootstrap analysis were used instead of the approximate likelihood ratio test). To confirm the overall topology of the obtained phylogenetic tree, the data were also analyzed by the programs Protpars (parsimony method) and Neighbor (neighbor-joining method) from the PHYLIP package [44].

Additional files

Additional file 1: Amino acid sequences of all proteins used in this study. Active site residues of the catalytic triad are highlighted in red. Protease domains as identified using the HHpred platform are highlighted in cyan, PDZ domains in yellow and green.

Additional file 2: Maximum Likelihood tree of all Deg/HtrA proteases from this study containing intact catalytic triads. ML phylogenetic tree of all putative Deg/HtrA proteases with intact protease domains from A. thaliana, O. sativa, P. trichocarpa, P. patens, C. reinhardtii, and the cyanobacterium Synechocystis sp. PCC6803 from the original BLAST searches, using the original gene model names according to Tables 1, 2, 3, 4, 5, column 1.
Filled circles indicate a bootstrap support (100 replicates) of > 90%, empty circles indicate a bootstrap support of > 70%.

Additional file 3: Original input data for the phylogenetic analysis. Original aa alignment data file that was subjected to the phylogenetic analysis process.

Additional file 4: Maximum likelihood phylogenetic tree of Deg/HtrA proteases in selected plant species. The following species were investigated: Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, Physcomitrella patens, Chlamydomonas reinhardtii, and the cyanobacterium Synechocystis sp. PCC6803. The phylogenetic tree is labeled with the original gene model numbers according to Tables 1, 2, 3, 4, 5, column 1. The proteases form 4 distinct groups, labeled I-IV. Filled circles indicate a bootstrap support (100 replicates) of > 90%, empty circles indicate a bootstrap support of > 70%.

Authors’ contributions

HS and PFH designed and carried out the database search and analysis. HS drafted the manuscript. IA supervised the project, and all authors edited and approved the final manuscript.

Acknowledgements

We thank Sadok Legroune and Jaime Garcia-Moreno for their help and advice in an early stage of this project. This work was supported by grants from the Deutsche Forschungsgemeinschaft (AD92/8-2 and AD92/8-3), the Konstanz University (to I.A.) and a fellowship from the Alexander von Humboldt Foundation (to H.S.).

Author details

1 Department of Plant Physiology and Biochemistry, University of Konstanz, Universitätsstr. 10, 78457 Konstanz, Germany.
2 School of Agriculture and Food Sciences, University of Queensland, St. Lucia, QLD 4072, Australia.
3 Centre for Blood Research, University of British Columbia, 2350 Health Sciences Mall, Vancouver, BC V6T 1Z3, Canada.

Received: 1 November 2011 Accepted: 21 March 2012 Published: 20 April 2012

References

1. Wickner S, Maurizi MR, Gottesman S: Posttranslational quality control: folding, refolding, and degrading proteins. Science 1999, 286:1888–1893.
2. Gottesman S: Proteolysis in bacterial regulatory circuits. Annu Rev Cell Dev Biol 2003, 19:565–587.
3. Strauch KL, Beckwith J: An Escherichia coli mutation preventing degradation of abnormal periplasmic proteins. Proc Natl Acad Sci U S A 1988, 85:1576–1580.
4. Lipinska B, Sharma S, Georgopoulos C: Sequence analysis and regulation of the htrA gene of Escherichia coli: a sigma 32-independent mechanism of heat-inducible transcription. Nucleic Acids Res 1988, 16:10053–10067.
5. Rawlings ND, Morton FR, Kok CY, Kong J, Barrett AJ: MEROPS: the peptidase database. Nucleic Acids Res 2008, 36:D320–325.
6. Clausen T, Southan C, Ehrmann M: The HtrA family of proteases: implications for protein composition and cell fate. Mol Cell 2002, 10:443–455.
7. Helm M, Luck C, Prestele J, Hierl G, Huesgen PF, Fröhlich T, Arnold GJ, Adamska I, Gorg A, Lottspeich F, et al: Dual specificities of the glyoxysomal/peroxisomal processing protease Deg15 in higher plants. Proc Natl Acad Sci U S A 2007, 104:11501–11506.
8. Schuhmann H, Huesgen PF, Gietl C, Adamska I: The DEG15 serine protease cleaves peroxisomal targeting signal 2-containing proteins in Arabidopsis thaliana. Plant Physiol 2008, 148:1847–1856.
9. Sun X, Peng L, Guo J, Chi W, Ma J, Lu C, Zhang L: Formation of DEG5 and DEG8 complexes and their involvement in the degradation of photodamaged photosystem II reaction center D1 protein in Arabidopsis. Plant Cell 2007, 19:1347–1361.
10. Kurochkin IV, Mizuno Y, Konagaya A, Sakaki Y, Schonbach C, Okazaki Y: Novel peroxisomal protease Tysnd1 processes PTS1- and PTS2-containing enzymes involved in beta-oxidation of fatty acids. EMBO J 2007, 26:835–845.
11. Spiess C, Beil A, Ehrmann M: A temperature-dependent switch from chaperone to protease in a widely conserved heat shock protein. Cell 1999, 97:339–347.
12. Walsh NP, Alba BM, Bose B, Gross CA, Sauer RT: OMP peptide signals initiate the envelope-stress response by activating DegS protease via relief of inhibition mediated by its PDZ domain. Cell 2003, 113:61–71.
13. Vande Walle L, Lamkanfi M, Vandenabeele P: The mitochondrial serine protease HtrA2/Omi: an overview. Cell Death Differ 2008, 15:453–460.
14. Huesgen PF, Schuhmann H, Adamska I: The family of Deg proteases in cyanobacteria and chloroplasts of higher plants. Physiol Plant 2005, 123:413–420.
15. Tripathi LP, Sowdhamini R: Cross genome comparisons of serine proteases in Arabidopsis and rice. BMC Genomics 2006, 7:200.
16. Garcia-Lorenzo M, Sjödin A, Jansson S, Funk C: Protease gene families in Populus and Arabidopsis. BMC Plant Biol 2006, 6:30.
17. Itzhaki H, Naveh L, Lindahl M, Cook M, Adam Z: Identification and characterization of DegP, a serine protease associated with the luminal side of the thylakoid membrane. J Biol Chem 1998, 273:7094–7098.
18. Haussuhl K, Andersson B, Adamska I: A chloroplast DegP2 protease performs the primary cleavage of the photodamaged D1 protein in plant photosystem II. EMBO J 2001, 20:713–722.
19. Peltier JB, Emanuelsson O, Kalume DE, Ytterberg J, Friso G, Rudella A, Liberles DA, Soderberg L, Roepstorff P, von Heijne G, et al: Central functions of the lumenal and peripheral thylakoid proteome of Arabidopsis determined by experimentation and genome-wide prediction. Plant Cell 2002, 14:211–236.
20. Schubert M, Petersson UA, Haas BJ, Funk C, Schröder WP, Kieselbach T: Proteome map of the chloroplast lumen of Arabidopsis thaliana. J Biol Chem 2002, 277:8354–8365.
21. Sun X, Fu T, Chen N, Guo J, Ma J, Zou M, Lu C, Zhang L: The stromal chloroplast Deg7 protease participates in the repair of photosystem II after photoinhibition in Arabidopsis. Plant Physiol 2010, 152:1263–1273.
22. Friso G, Giacomelli L, Ytterberg AJ, Peltier JB, Rudella A, Sun Q, Wijk KJ: In-depth analysis of the thylakoid membrane proteome of Arabidopsis thaliana chloroplasts: new proteins, new functions, and a plastid proteome database. Plant Cell 2004, 16:478–499.
23. Pendle AF, Clark GP, Boon R, Lewandowska D, Lam YW, Andersen J, Mann M, Lamond AI, Brown JW, Shaw PJ: Proteomic analysis of the Arabidopsis nucleolus suggests novel nucleolar functions. Mol Biol Cell 2005, 16:260–269.
24. Schuhmann H, Adamska I: Deg proteases and their role in protein quality control in different subcellular compartments of the plant cell. Physiol Plant 2011, in press.
25. Sun X, Ouyang M, Guo J, Ma J, Lu C, Adam Z, Zhang L: The thylakoid protease Deg1 is involved in photosystem-II assembly in Arabidopsis thaliana. Plant J 2010, 62:240–249.
26. Tanaka T, Antonio BA, Kikuchi S, Matsumoto T, Nagamura Y, Numa H, Sakai H, Wu J, Itoh T, Sasaki T, et al: The Rice Annotation Project Database (RAP-DB): 2008 update. Nucleic Acids Res 2008, 36:D1028–1033.
27. Ouyang S, Zhu W, Hamilton J, Lin H, Campbell M, Childs K, Thibaud-Nissen F, Malek RL, Lee Y, Zheng L, et al: The TIGR Rice Genome Annotation Resource: improvements and new features. Nucleic Acids Res 2007, 35:D883–887.
28. Adam Z, Adamska I, Nakabayashi K, Ostersetzer O, Haussuhl K, Manuell A, Zheng B, Vallon O, Rodermel SR, Shinozaki K, et al: Chloroplast and mitochondrial proteases in Arabidopsis. A proposed nomenclature. Plant Physiol 2001, 125:1912–1918.
29. Soding J, Biegert A, Lupas AN: The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res 2005, 33:W244–248.
30. Schuhmann H, Mogg U, Adamska I: A new principle of oligomerization of plant DEG7 protease based on interactions of degenerated protease domains. Biochem J 2011, 435:167–174.
31. Schaller A: A cut above the rest: the regulatory function of plant proteases. Planta 2004, 220:183–197.
32. Huesgen PF, Miranda H, Lam XT, Perthold M, Schuhmann H, Adamska I, Funk C: Recombinant Deg/HtrA proteases from Synechocystis sp. PCC6803 differ in substrate specificity, biochemical characteristics and mechanism. Biochem J 2011, 435:733–742.
33. Palmer JD: The symbiotic birth and spread of plastids: how many times and whodunit? J Phycol 2003, 39:4–11.
34. Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A, et al: The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 2006, 313:1596–1604.
35. Rensing SA, Ick J, Fawcett JA, Lang D, Zimmer A, Van de Peer Y, Reski R: An ancient genome duplication contributed to the abundance of metabolic genes in the moss Physcomitrella patens. BMC Evol Biol 2007, 7:130.
36. Brown JW, Shaw PJ, Shaw P, Marshall DF: Arabidopsis nucleolar protein database (AtNoPDB). Nucleic Acids Res 2005, 33:D633–636.
37. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25:3389–3402.
38. Lee Y, Tsai J, Sunkara S, Karamycheva S, Pertea G, Sultana R, Antonescu V, Chan A, Cheung F, Quackenbush J: The TIGR Gene Indices: clustering and assembling EST and known genes and integration with eukaryotic genomes. Nucleic Acids Res 2005, 33:D71–74.
39. Moretti S, Armougom F, Wallace IM, Higgins DG, Jongeneel CV, Notredame C: The M-Coffee web server: a meta-method for computing multiple sequence alignments by combining alternative alignment methods. Nucleic Acids Res 2007, 35:W645–648.
40. Subramanian AR, Kaufmann M, Morgenstern B: DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment. Algorithms Mol Biol 2008, 3:6.
41. Katoh K, Toh H: Recent developments in the MAFFT multiple sequence alignment program. Brief Bioinform 2008, 9:286–298.
42. Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004, 32:1792–1797.
43. Guindon S, Gascuel O: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 2003, 52:696–704.
44. Felsenstein J: PHYLIP (Phylogeny Inference Package) version 3.6. Distributed by the author. Department of Genome Sciences, University of Washington, Seattle; 2005.

doi:10.1186/1471-2229-12-52
Cite this article as: Schuhmann et al.: The family of Deg/HtrA proteases in plants. BMC Plant Biology 2012 12:52.

