CHARACTERIZATION OF THE CRICKET PARALYSIS VIRUS 3C PROTEASE AND ITS SUBSTRATE SPECIFICITY

by

Ruhi Nichalle Brito

B.Sc., Trent University, 2016

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Biochemistry and Molecular Biology)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

December 2018

© Ruhi Nichalle Brito, 2018

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, a thesis/dissertation entitled:

Characterization of the Cricket Paralysis Virus 3C Protease and its Substrate Specificity

submitted by Ruhi Nichalle Brito in partial fulfillment of the requirements for the degree of Master of Science in Biochemistry and Molecular Biology

Examining Committee:
Dr. Eric Jan, Department of Biochemistry and Molecular Biology (Supervisor)
Dr. Helene Sanfacon, Department of Botany (Supervisory Committee Member)
Dr. Dieter Bromme, Department of Biochemistry and Molecular Biology (Supervisory Committee Member)
Dr. Thibault Mayor, Department of Biochemistry and Molecular Biology (Additional Examiner)

Additional Supervisory Committee Members:
Dr. Chris Overall, Department of Biochemistry and Molecular Biology (Supervisory Committee Member)

Abstract

Many positive-sense single-stranded RNA (+ssRNA) viruses encode an open reading frame that is translated as a polyprotein. This viral polyprotein is subsequently cleaved by its virally encoded protease or, in some instances, with the aid of host proteases. It is well established that +ssRNA viruses, such as poliovirus, encode proteases that can target and cleave host protein substrates in order to facilitate viral infection. Members of the Dicistroviridae family are +ssRNA viruses that primarily infect arthropods such as honey bees, shrimp, and crickets, and can have an impact on agriculture and the economy.
Dicistroviruses encode a cysteine protease, 3C, that is responsible for the cleavage of its own polyprotein. To date, little is known about dicistrovirus protease structure, catalytic efficiency, cleavage site specificity and substrate specificity. Cricket paralysis virus (CrPV) is one of the best-characterized dicistroviruses: its translation mechanism and a few of its encoded proteins, such as 1A, have been studied, making it a good model. Given that the 3C proteases of other +ssRNA viruses, such as poliovirus, cleave host substrates during infection, it is plausible that the CrPV 3C protease also cleaves target host proteins during infection. In order to better understand the fundamental processes that are regulated during infection, CrPV was chosen as a model. This thesis addresses two aims: 1) to purify the CrPV 3C protease and verify its activity, and 2) to determine the cleavage site specificity of the CrPV 3C protease. This will give a better understanding of the catalytic efficiency and target substrate specificity of the purified protease.

Lay Summary

Viruses use resources available to them from the host they infect. This is because a virus is small and does not contain all the components it needs to survive by itself. One way for the virus to trick the host into helping it is by stopping regular functions in the host, which in turn limits the resources available to the host. The virus stops these functions using an enzyme known as a protease. This protease cuts host proteins, making them unusable to the host. Unfortunately, we do not know what these host proteins are or how some of these virus enzymes act. To understand this, we isolated a type of viral protease and then determined what it could possibly cut.

Preface

All experiments were conducted by me. The fluorescent peptides were designed with the help of Dr. Eric Jan and made by Biomatik.
The peptide libraries for Aim 2 were constructed, and the mass spectrometry data analyzed, with the help of Dr. Nestor Solis. The phylogenetic tree of +ssRNA virus cysteine proteases was made with the help of Dr. Marli Vlok. All experiments were designed by my supervisor Dr. Eric Jan and myself. I completed the Biological Safety Training Course [Certificate ID: 2016-qajCN], Chemical Safety course [Certificate ID: 2018-Xa7BQ], and Radionuclide Safety and Methodology course [Certificate ID: 2017-Rc7NX] provided by Risk Management Services as required for this research.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
List of Symbols
List of Abbreviations
Acknowledgements
Dedication
CHAPTER 1: INTRODUCTION
  1.1 General overview of RNA viruses
    1.1.1 Positive-sense single stranded RNA viruses
    1.1.2 Viral life cycle
  1.2 Host substrates cleaved during +ssRNA viral infection
    1.2.1 Host translation shutoff
    1.2.2 Host transcription shutoff
    1.2.3 Immune response and stress granules
  1.3 Proteases
    1.3.1 Protease families
    1.3.2 Cysteine proteases
    1.3.3 Protease Kinetics
  1.4 Dicistrovirus
    1.4.1 Classification and genome organization
    1.4.2 Cricket Paralysis Virus
    1.4.3 Dicistrovirus 3C protease and their cleavage specificity
  1.5 Approaches to identify candidate substrates
    1.5.1 Classical and new approaches of identification
  1.6 Thesis approach
CHAPTER 2: MATERIALS AND METHODS
  2.1 Generation of plasmid GST-tagged CrPV 3C
  2.2 Optimization of expression of CrPV 3C
  2.3 GST CrPV 3C Purification and cleavage conditions
  2.4 Determination of GST CrPV 3C stability
  2.5 Determination of protease activity by Fluorescence quenching assay
  2.6 In-vitro translation reaction
  2.7 Proteome Identification of Cleavage Site
CHAPTER 3: OPTIMIZATION, PURIFICATION AND KINETICS OF 3C PROTEASE
  3.1 Background
  3.2 Results
    3.2.1 Expression of CrPV 3C
    3.2.2 Purification of CrPV 3C and cleavage of GST tag
    3.2.3 Buffer conditions for CrPV 3C protease activity
    3.2.4 CrPV 3C In-vitro translation of polyprotein
  3.3 Discussion
CHAPTER 4: DETERMINATION OF CLEAVAGE SITE USING PICS
  4.1 Background
  4.2 Results
    4.2.1 Cleavage site specificity of CrPV 3C
  4.3 Discussion
CHAPTER 5: CONCLUSION
  5.1 Discussion
    5.1.1 Purification of tagged and untagged CrPV 3C
    5.1.2 Characterization of GST CrPV 3C kinetics
    5.1.3 CrPV 3C protease specificity
  5.2 Summary and Future directions
Bibliography
Appendices

List of Tables

Table 1.1 List of different cysteine protease superfamilies.
Table 4.1 List of possible cleavage sites
Table A.1 Raw data used to make alignment of unrooted tree

List of Figures

Figure 1.1 Family tree of +ssRNA virus.
Figure 1.2 Poliovirus genome organization and polyprotein processing.
Figure 1.3 Viral entry and replication.
Figure 1.4 Mechanism of action of cysteine proteases.
Figure 1.5 Dicistrovirus capsid structure and general phylogeny.
Figure 1.6 Sequence alignment of dicistrovirus 3C-like protease.
Figure 1.7 Workflow of TAILS.
Figure 1.8 Workflow of PICS.
Figure 3.1 Vector map of fusion protein.
Figure 3.2 Expression of recombinant fusion protein.
Figure 3.3 Purification of recombinant GST CrPV 3C and GST CrPV 3C (Cys211Ala).
Figure 3.4 Purification of untagged CrPV 3C.
Figure 3.5 Stability of GST CrPV 3C.
Figure 3.6 Buffer optimization of GST CrPV 3C.
Figure 3.7 Determination of minimum amount of GST CrPV 3C.
Figure 3.8 Michaelis-Menten kinetics of GST CrPV 3C.
Figure 3.9 In-vitro synthesis of CrPV-2 and CrPV-ORF1-STOP.
Figure 4.1 GluC cleavage site specificity using PICS of a trypsin-digested E. coli library.
Figure 4.2 GST CrPV 3C cleavage site specificity using PICS in a trypsin-digested E. coli library.
Figure 5.1 In-Silico prediction of CrPV 3C structural fold.
Figure 5.2 Sequence alignment of CrPV 3C with other cysteine proteases.
Figure 5.3 Unrooted family tree of CrPV 3C with other cysteine proteases.
Figure 5.4 Binding of substrate inhibitor in binding cleft of coxsackie virus 3C protease.
Figure A.1 Purification of preparation 2.
Figure A.2 Purification of preparation 3.
Figure A.3 GST CrPV 3C cleavage site specificity using PICS in a GluC-digested E. coli library.
List of Symbols

β beta
% percent
° degree

List of Abbreviations

(-) Negative sense
(+) Positive sense
4E-BP1 Eukaryotic translation initiation factor 4E-binding protein 1
COFRADIC Combined fractional diagonal chromatography
CoV Coronavirus
CrPV Cricket paralysis virus
DCV Drosophila C virus
DNA-PK DNA-dependent protein kinase
E. coli Escherichia coli
eIF4A Eukaryotic initiation factor 4A
eIF4E Eukaryotic initiation factor 4E
eIF4G Eukaryotic translation initiation factor 4G
G3BP1 Ras GTPase-activating protein-binding protein 1
GST Glutathione S-transferase
HIV-1 Human immunodeficiency virus 1
HPLC High performance liquid chromatography
HRV Human rhinovirus
IGR Intergenic region
IPTG Isopropyl β-D-1-thiogalactopyranoside
IRES Internal ribosome entry site
kb Kilobases
MAVS Mitochondrial antiviral signaling protein
MCA 7-Methoxycoumarin-4-acetic Acid N-succinimidyl Ester
MS Mass spectrometry
MW Molecular weight
ORF Open reading frame
PABP Poly(A)-binding protein
PICS Proteomic identification of cleavage sites
PSIV Plautia stali intestine virus
PV Poliovirus
RdRp RNA dependent RNA polymerase
RFU Relative fluorescence units
RNA Ribonucleic acid
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
TAILS Terminal amine isotopic labeling of substrates
TBP TATA-binding protein
TDP2 Tyrosyl-DNA phosphodiesterase 2
VP Viral protein
VPg Viral protein genome-linked
VPg-pUpU Uridylylated viral protein genome linked

Acknowledgements

First and foremost, I would like to say thank you to my supervisor, Dr. Eric Jan. Working in your lab has been a wonderful experience. Your love of science is truly encouraging and never ceases to amaze me. Thank you for being a mentor to me, and encouraging me to grow as a scientist. I am truly grateful to have had such a unique opportunity. I would also like to thank my committee members Dr. Dieter Bromme, Dr. Chris Overall, and Dr. Helene Sanfacon. Your insightful questions and suggestions are greatly appreciated.
I would especially like to thank Dr. Nestor Solis from the Overall lab, and Dr. Pierre-Marie Andrault from the Bromme lab. To Dr. Solis, you have been nothing but patient and helpful, especially in the PICS experiments. Thank you for showing me how to do PICS, as well as running my samples. To Dr. Andrault, you have been such a great joy to work with. Thank you for answering my multitude of questions. I would also like to extend my thanks to the rest of the Overall lab and Bromme lab. Both labs played a major role in the completion of this thesis, and I want to thank you for sharing resources when they were not available to me. Additionally, I would like to thank the labs located in the LSI for letting me use their equipment, especially the Duong and Yip labs. To the past and present members of the Jan lab, especially Dr. Craig Kerr, Dr. Marli Vlok, Dr. Keren Nevo Di Nur and Jibin Sadasivan, thank you for helping me through the hurdles of graduate school and for giving insightful suggestions when I came across a roadblock. To the past and present undergrads in the Jan lab, Helen Tran, Milagros Sempere, Liang Xu, Dora Xiong, Cindy Chen and Kathrin Meretes: thank you for the various food adventures and company. I wish you all the best of luck in your future endeavors.

Finally, to my Mom, Sister, Boyfriend, and friends: words cannot begin to explain how lucky I am to have such an amazing support group. Thank you for keeping me on track, motivating me when I felt lost, and comforting me during my sleepless nights. To my boyfriend, Allan, I am truly grateful for your love and patience.

Dedication

For my Family.

Chapter 1: Introduction

1.1 General overview of RNA viruses

1.1.1 Positive-sense single stranded RNA viruses

Viruses are obligate parasites that infect host cells, and can be classified by the genetic material that is packaged in the virus and by whether they are enveloped or not (Baltimore, 1971).
Positive-sense single-stranded RNA (+ssRNA) viruses, as defined by the Baltimore classification system, encompass a broad group of viruses that infect a wide range of host species including plants, animals, and humans. Their genomes range in size and structural complexity, from ~2.3 kilobases (kb) (e.g. the narnavirus genome) to 32 kb (e.g. the coronavirus (CoV) genome) (Denison, 2008; Swevers, Vanden Broeck, & Smagghe, 2013). +ssRNA viruses share several features: 1) the genome functions as a template for replication, 2) the +ssRNA genome is read as messenger RNA for viral protein synthesis, and 3) +ssRNA viruses exist as a quasispecies during infection, meaning that the viral population carries mutations generated during replication of the RNA (Andino & Domingo, 2015). Their virally encoded RNA-dependent RNA polymerase lacks proofreading ability during replication, resulting in viral progeny that carry numerous mutations, which are selected under pressures imposed by the host defense mechanisms (Venkataraman, Prasad, & Selvarajan, 2018). Viral genomes encode a limited number of proteins, and viruses must therefore rely on host proteins and machinery for viral replication and translation (Ahlquist, Noueiry, Lee, Kushner, & Dye, 2003). The type of virus-host protein interactions can differ depending on the given host and virus. My thesis focuses on one type of essential viral protein, the viral protease, which is responsible for: 1) the cleavage of the viral polyprotein into the individual mature proteins (Hillman & Cai, 2013), and 2) in some cases, the cleavage of host protein substrates during viral infection (Jagdeo et al., 2018). +ssRNA viruses can encode a single or multiple open reading frames (ORFs). In many cases an ORF is translated as a polyprotein. The polyprotein generally encodes viral nonstructural and structural proteins.
The nonstructural proteins carry out essential roles in the replication process, and include but are not limited to an RNA-dependent RNA polymerase (RdRp), a protease, and in most cases a viral protein genome-linked (VPg). The structural proteins generally consist of capsid and envelope proteins that are responsible for packaging the viral genome, protecting the genetic material from being digested by host proteins, and transporting the genome to cells (Roos, Ivanovska, Evilevitch, & Wuite, 2007). For the purpose of this thesis, I will focus on the order Picornavirales, which includes the families Picornaviridae and Dicistroviridae. To best illustrate the polyprotein processing events, I will focus on poliovirus (PV) from the picornavirus family, as it is well characterized and is thought to be similar to members of the dicistrovirus family (Figure 1.1). The +ssRNA genome of PV encodes a single ORF that is translated as a polyprotein upon viral entry. Upon translation of the ORF, the polyprotein is processed into three precursors: P1, encoding the structural proteins, and P2 and P3, encoding the nonstructural proteins (Figure 1.2). The P1 region is initially cleaved by the 2Apro proteinase, and subsequent cleavages of the viral polyprotein are carried out by 3C/3CDpro (Figure 1.2) (Castello, Alvarez, & Carrasco, 2011; De Jesus, 2007). The cleavage of these viral precursor polyproteins is essential for the PV life cycle (Patil & Gupta, 2017).

Figure 1.1 Family tree of +ssRNA virus. Unrooted family tree of +ssRNA viruses based on the amino acid sequence of the RdRp domain of the viral nonstructural protein. The tree includes viruses from the families Secoviridae, Iflaviridae, Caliciviridae, Picornaviridae, Marnaviridae and Dicistroviridae, as well as unassigned insect RNA viruses. Reproduced with permission from (Chen et al., 2012).

Figure 1.2 Poliovirus genome organization and polyprotein processing.
The genome of poliovirus, a picornavirus, has a VPg protein covalently linked to the 5’ end and a 3’ poly(A) tail. The genome contains a 5’ internal ribosome entry site (IRES), followed by an open reading frame (ORF) encoding the structural and nonstructural proteins. Reproduced and altered with permission from (Mutsvunguma et al., 2011).

1.1.2 Viral life cycle

Most +ssRNA viruses have a broadly similar life cycle. Because the dicistrovirus life cycle has not been characterized in detail and is thought to resemble that of picornaviruses, I will briefly review the life cycle of a well-known +ssRNA virus, PV. Infection starts with the PV virion binding to the cell surface receptor CD155 (Brandenburg et al., 2007). The virus then enters the cell via endocytosis, where VP1 of the viral capsid inserts into the cell membrane, forming a pore and releasing the viral genomic RNA into the cytoplasm, where it can immediately be read by the ribosome and translated via its 5’ internal ribosome entry site (IRES) (Lévêque & Semler, 2015; J. Louten, 2016). The IRES recruits ribosomes to the viral genome using a cap-independent mechanism (Lévêque & Semler, 2015). PV can halt cap-dependent translation during viral infection by cleaving translation initiation factors, such as eukaryotic translation initiation factor 4G (eIF4G) and poly(A)-binding protein (PABP) (Kuyumcu-Martinez, Van Eden, Younan, & Lloyd, 2004). The host DNA repair enzyme 5’ tyrosyl-DNA phosphodiesterase 2 (TDP2) removes the covalently linked VPg from the 5’ end of PV RNA. Subsequently, the IRES recruits the 40S ribosome via initiation factors eIF4B, eIF4A, eIF4G and eIF3 and IRES trans-acting factors (ITAFs) to direct translation of the PV genome (Maciejewski et al., 2016; Murray & Barton, 2003; Plank & Kieft, 2012).
Translation of the PV genome results in a polyprotein that is subsequently cleaved by two virally encoded proteases, 2A and 3C/3CD, into mature viral proteins that then go on to perform their own functions within the infected cell (Figure 1.3).

Figure 1.3 Viral entry and replication. Schematic of viral entry and replication of +ssRNA viruses, specifically poliovirus. 1-3) The capsid attaches to the cell surface receptor and, following endocytosis, the viral RNA is released into the cytoplasm. 4-5) The RNA is then translated and its polyprotein cleaved by the virally encoded proteinase. The virus replicates its RNA in membrane complexes and finally 6) releases the newly synthesized viral RNA in virions by cell lysis. Reproduced with permission from (J. Louten, 2016).

Following translation, the genomic RNA acts as a template for negative-sense RNA (-ssRNA) synthesis in a membrane-associated replication complex consisting of viral and cellular proteins, producing a double-stranded RNA intermediate (Barton, O'Donnell, & Flanegan, 2001). Replication complexes are derived from the rearrangement of membranes of the endoplasmic reticulum and the Golgi apparatus (Belov et al., 2012). The virally encoded 2B, 2C, 2BC, 3A, and 3AB proteins are known to play essential roles in viral replication, where 2B, 2C, and 2BC disrupt the Golgi apparatus to facilitate the formation of replication complexes (Teterina, Gorbalenya, Egger, Bienz, & Ehrenfeld, 1997). The VPg (3B) functions as a primer for the synthesis of both the negative- and positive-strand RNA (Vogt & Andino, 2010). Replication is initiated by the synthesis of the negative-strand RNA from the 3’ poly(A) tail of the viral genome (Vogt & Andino, 2010). The +ssRNA is bound to the membrane by viral protein 3AB.
The RdRp adds two uridine monophosphate (UMP) molecules to the hydroxyl group of the tyrosine residue at position 3 of VPg. The uridylylated VPg (VPg-pUpU) then acts as a primer for the RdRp to copy the RNA into -ssRNA (Plotch & Palant, 1995; Sun, Guo, & Lou, 2014; Vogt & Andino, 2010). The cis-acting replication element (CRE), a highly conserved RNA stem-loop structural element located within the polyprotein-coding region, serves as a template for the uridylylation of VPg, from which +ssRNA can be synthesized (Goodfellow, Kerrigan, & Evans, 2003; Murray, Steil, Roberts, & Barton, 2004). The newly synthesized +ssRNA interacts with the capsid proteins to undergo encapsidation, generating progeny virions that subsequently infect other cells upon release from the host cell by lysis.

1.2 Host substrates cleaved during +ssRNA viral infection

Studies on enterovirus proteases have illustrated how viruses can strategically alter cellular processes to facilitate virus infection. In this section, I will highlight some of the major cellular processes that are targeted by the PV proteases.

1.2.1 Host translation shutoff

Given the limited number of proteins that viruses encode, viruses exploit host proteins to aid in virus infection by regulating fundamental cellular processes. Examples include preventing the formation of stress granules, circumventing immune responses, and shutting off transcription, to name a few (Jagdeo et al., 2018). The best-characterized regulation of host protein substrates is the inhibition of host translation during infection. PV encodes two proteases, 2A and 3C. Eukaryotic initiation factor 4G (eIF4G) is an essential scaffold protein that recruits the 43S complex to the 5’ end of mRNA for cap-dependent translation (Byrd, Zamora, & Lloyd, 2005; Imataka & Sonenberg, 1997).
During PV infection, the encoded 2A protease cleaves eIF4GI and eIF4GII, which causes a shutoff of cap-dependent translation, as the cleaved eIF4G cannot recruit eukaryotic initiation factor 4A (eIF4A) to unwind RNA around the AUG initiation codon (Avanzino, Fuchs, & Fraser, 2017; Kuyumcu-Martinez et al., 2004). However, the C-terminal eIF4G cleavage product can still bind to the PV IRES, shifting translation from host protein synthesis to viral protein synthesis (Byrd et al., 2005). PV infection also reduces the availability of eIF4E through the dephosphorylation of 4E-BP1, which binds to eIF4E in its dephosphorylated state (Avanzino et al., 2017; Byrd et al., 2005). One common substrate cleaved by both 3C and 2A is the poly(A)-binding protein (PABP). 3C cleaves PABP when it is polysome-associated, while 2A cleaves PABP when it is not associated with polysomes; the cleavage sites for 2A and 3C on PABP do not overlap (Bonderoff, LaRey, & Lloyd, 2008). PABP is a major mRNA-interacting protein that stimulates translation initiation, and its cleavage results in inhibition of translation (Kuyumcu-Martinez et al., 2004). The cleavage of host proteins such as eIF4G and PABP shuts off key regulatory processes in the host, shifting protein synthesis from host proteins to viral proteins.

1.2.2 Host transcription shutoff

Several key transcription factors are cleaved during PV infection, which indirectly affects translation. The TATA-binding protein (TBP), part of transcription factor II D, binds to DNA sequences containing a TATA box, and cellular transcription is catalyzed by DNA-dependent RNA polymerase II. PV 3C protease cleaves TBP, thereby shutting off RNA polymerase II transcription during infection (Kundu, Raychaudhuri, Tsai, & Dasgupta, 2005). PV 3C also targets RNA polymerases I and III. The primary role of TBP is to carry out low-level transcription of many genes (Yalamanchili, Datta, & Dasgupta, 1997).
Not all DNA sequences contain a TATA box, and, unsurprisingly, PV 3C protease cleaves other substrates that associate with RNA pol II. Transcription factor IIIC, a DNA-binding factor involved in forming the preinitiation complex on a subset of genes, is also cleaved during PV infection (Shen, Igo, Yalamanchili, Berk, & Dasgupta, 1996). Cleavage of substrates like these results in the shutoff of regulatory processes such as the innate immune response, allowing the virus to evade the immune system (Dotzauer & Kraemer, 2012).

1.2.3 Immune response and stress granules

During viral infection, the immune response is activated upon recognition of the virus; however, viruses have evolved ways to downregulate the immune response. The PV 2A protease targets DNA-dependent protein kinase (DNA-PK) to modulate the host innate and adaptive antiviral response (Graham et al., 2004). DNA-PK is a nuclear serine/threonine protein kinase that is activated by DNA double-stranded breaks (Amatya et al., 2012; Smith & Jackson, 1999). DNA-PK plays a role in DNA repair, lymphocyte repertoire formation, and the expression of various proinflammatory cytokines involved in the innate immune response (Dynan & Yoo, 1998; Graham et al., 2004). Another immune response protein, mitochondrial antiviral signaling protein (MAVS), is cleaved during PV infection (Feng et al., 2014). When the host detects an RNA virus, signaling through the adaptor protein MAVS induces expression of inflammatory cytokines (Luecke & Paludan, 2015). Cleavage of these innate immune response proteins circumvents the host immune response in order to promote virus infection. Finally, PV 3C protease cleaves Ras GTPase-activating protein-binding protein 1 (G3BP1), which is a major component of stress granules, thus preventing stress granule formation late in PV infection (Reineke & Lloyd, 2015; White, Cardenas, Marissen, & Lloyd, 2007).
The exact role of stress granules is not known, but recent studies have shown that they play a role in immune response activation (Ng et al., 2013). Many other substrates have been identified, including nuclear pore complex 98 and heterogeneous nuclear ribonucleoprotein M, to list a few, highlighting how a viral protease can modulate several cellular pathways for virus infection (Jagdeo et al., 2018).

1.3 Proteases

1.3.1 Protease families

Proteases are grouped into different families based on the residue that acts as a nucleophile for catalytic activity (López-Otín & Bond, 2008). All families of proteases likely emerged during the early stages of protein evolution, which was necessary during the time when catabolism of proteins and generation of amino acids emerged in primitive organisms (López-Otín & Bond, 2008). Proteases or peptidases are enzymes that degrade or cleave proteins through the hydrolysis of peptide bonds; some proteolytic enzymes act on peptides as substrates while others act on whole proteins (A J Barrett & McDonald, 1986; Schauperl et al., 2015). Proteases that act on intact proteins are termed proteinases or endopeptidases, while proteases that act on the N- or C-terminal ends of proteins are known as exopeptidases (A J Barrett & McDonald, 1986). Specificity is usually profiled by assigning non-prime and prime amino acid preferences at the site of cleavage, annotated as P1↓P1', where the down arrow indicates the site of cleavage (Schechter & Berger, 1967). Depending on the protease, substrate specificity may be contingent on recognition of neighboring amino acids fitting into the subpockets (i.e. P1-P4) and on accessibility to the substrate; for example, trypsin has a specificity for Lys or Arg at the P1 position (Schechter and Berger 1967; Diamond 2007; Wright 2018).
In some cases, the specificity of a protease is undefined because it is highly promiscuous and can potentially cleave random substrates, e.g. HIV protease (Chaudhury & Gray, 2009). Proteases are defined by their nucleophile or catalytic type, which helps define the pH at which the protease will operate, in addition to the inhibitors needed to inactivate it (Brix, 2014). Based on the nucleophile for catalytic activity, there are 6 well-characterized protease families: serine, threonine, aspartic, glutamic, metalloproteases, and cysteine proteases (Rawlings 2013). The MEROPS database is manually curated and contains information on these proteases with their inhibitors and substrates (Rawlings et al., 2018). Homologous proteases and inhibitors are grouped into protein species, which are then grouped into families and then clans (Rawlings et al., 2018).

Serine proteases possess a nucleophilic serine in the active site. The catalytic triad of these proteases consists of Asp, His, and Ser residues, or in some cases a dyad with either a Lys or His paired with a Ser (Di Cera, 2009; Hedstrom, 2002; Perona & Craik, 2018). Serine proteases can be divided into 4 clans depending on their specificity: chymotrypsin, subtilisin, carboxypeptidase Y, and Clp protease (Hedstrom, 2002). Threonine proteases, defined by their nucleophilic threonine (Thr), possess overlapping chemical properties and a similar catalytic triad to that of serine proteases; however, because Thr is bulkier than Ser due to the methyl group, the nucleophile is the hydroxyl group on the Thr residue at the N-terminus of the β-subunit of the protease (Dodson & Wlodawer, 1998; Hegde, 2010). Asparagine peptide lyases differ greatly from Ser and Thr proteases. Aspartic and glutamic proteases rely on acidic amino acids in their active site, but their mechanisms of action differ from each other.
Unlike serine proteases, aspartic proteases do not form covalent intermediates, and can contain up to 2 aspartic acid residues that cleave substrates upon water binding, where the water acts as a nucleophile (Brix 2014; Rawlings, Barrett, and Bateman 2011). Conversely, the mechanism of action of glutamic proteases relies on a catalytic dyad of a nucleophilic glutamic acid and a glutamine (Brix, 2014). Metalloproteases are the only proteases that require metal ions to cleave substrates. They usually require zinc or, to a lesser extent, cobalt, copper, nickel, or iron; the metal ion is coordinated by histidine, glutamate, lysine, arginine, or aspartate residues, with a water molecule acting as the nucleophile (Van Wart and Birkedal-Hansen 1990; Rawlings, Barrett, and Bateman 2011). Finally, cysteine proteases contain a catalytic Cys that works in a catalytic triad to cleave substrates, which will be explained in further detail in section 1.3.2.

1.3.2 Cysteine proteases

The primary focus of this thesis is on the virally encoded 3C cysteine protease from dicistrovirus. As such, it is important to introduce the mechanism of action and substrate specificity of cysteine proteases in general. There are 14 characterized cysteine protease superfamilies, such as CA, CD, and CE, as well as 2 unclassified (refer to Table 1.1) (Sajid & McKerrow, 2002). Cysteine proteases cleave substrates using a catalytic triad or dyad, which contains the nucleophilic cysteine and a basic histidine; a third residue of asparagine, glutamate, or aspartate may also be present to stabilize the tetrahedral intermediate. The histidine in the catalytic triad acts as a proton acceptor for the thiol on the cysteine, enhancing its nucleophilicity (Rzychon, Chmiel, & Stec-Niemczyk, 2004; Verma, Dixit, & Pandey, 2016).
The cysteine then attacks the carbon of a reactive peptide bond, forming an intermediate tetrahedral thioester, which is stabilized by an acidic residue on the protease and subsequently hydrolyzed to form a carboxylic acid moiety (Figure 1.4) (Rzychon et al., 2004; Sajid & McKerrow, 2002; Verma et al., 2016).

Table 1.1 List of different cysteine protease superfamilies. A table of the different cysteine protease superfamilies and an example of a protease from that family.

Superfamily | Example | Reference
CA | Papain | (Atkinson, Babbitt, & Sajid, 2009; A J Barrett & Rawlings, 2001)
CD | Caspase-1 | (Atkinson et al., 2009)
CE | Adenain | (Atkinson et al., 2009)
CF | Pyroglutamyl-peptidase I | (Atkinson et al., 2009; A J Barrett & Rawlings, 2001)
CH | Hedgehog protein | (Atkinson et al., 2009)
CL | Sortase A | (Atkinson et al., 2009)
CM | Hepatitis C virus peptidase 2 | (Atkinson et al., 2009)
CN | Sindbis virus-type nsP2 peptidase | (Atkinson et al., 2009)
CO | Dipeptidyl-peptidase VI | (Atkinson et al., 2009)
CP | DeSI-1 peptidase | (Nakada-Tsukui, Tsuboi, Furukawa, Yamada, & Nozaki, 2012)
PA | TEV protease | (A J Barrett & Rawlings, 2001)
PB | Amidophosphoribosyltransferase precursor | (A J Barrett & Rawlings, 2001)
PC | Gamma-glutamyl hydrolase | (Alan J Barrett, Rawlings, Salvesen, & Fred Woessner, 2013)
Unclassified | | (Sajid & McKerrow, 2002)

Figure 1.4 Mechanism of action of cysteine proteases. (A) General mechanism of action for cysteine proteases, where the nucleophilic cysteine attacks the carbon of the peptide bond in the peptide backbone. (B) Poliovirus 3C protease crystal structure with the catalytic cysteine indicated in yellow and histidine in blue (PDB code: 1L1N). Reproduced with permission from (Erez, Fass, & Bibi, 2009).

The origins of the cysteine proteases can be traced to an ancestor of both bacteria and archaea.
These cysteine proteases can be divided into clans based on the triad or dyad and the protein fold; each clan name starts with a letter indicating the catalytic type (A J Barrett & Rawlings, 2001). Clan CA primarily refers to papain-like proteases and is further divided into C1 and C2 families, while other pathogenic proteases belong to the CB, CC (viral proteases), and CD (legumain-like proteases) clans, but proteases may belong to specific families based on their homology, structure, and various other characteristics (Sajid & McKerrow, 2002). The substrate specificities of these cysteine proteases generally differ from one clan to the next, with the best characterized cysteine protease being papain. Papain generally has a broad range of specificity; however, it shows a preference for substrates containing a hydrophobic residue at the P2 position (De Jersey, 1970). Another well-characterized protease, PV 3C, preferentially cleaves sites with Q at P1 and G at P1' (Blom, Hansen, Blaas, & Brunak, 1996). Specificity for substrates is based on the size of the binding cleft that accommodates the substrate; the amino acid side chains are accommodated in subpockets of the protease, allowing some amino acids to fit but not all (Neil D Rawlings, 2016; Schauperl et al., 2015).

1.3.3 Protease Kinetics

The hydrolysis of peptide bonds by proteases involves kinetic steps consisting of the enzyme binding to the substrate in a reversible interaction to form an enzyme-substrate complex, and then an irreversible reaction to release the enzyme and products. The rate-determining step for the three classes of proteases that form an acyl-enzyme intermediate (serine, cysteine, threonine) is either acyl-enzyme formation or acyl-enzyme hydrolysis (Choe et al., 2006). The measurable rate of enzyme reaction is highly dependent on the concentration of substrate.
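This dependence on substrate concentration is commonly summarized by the Michaelis-Menten rate law, the model used later in this thesis to fit initial rates (section 2.5). The expression below is the standard textbook form, given only as an illustrative sketch; the symbols $v_0$ (initial rate), $V_{\max}$, $K_M$, $k_{cat}$, and $[E]_0$ (total enzyme concentration) are notation assumed here and do not appear in the original text.

```latex
v_0 = \frac{V_{\max}\,[S]}{K_M + [S]}, \qquad V_{\max} = k_{cat}\,[E]_0

% Limiting cases of the same expression:
[S] \ll K_M : \quad v_0 \approx \frac{k_{cat}}{K_M}\,[E]_0\,[S]
\qquad\qquad
[S] \gg K_M : \quad v_0 \to V_{\max}
```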
When the substrate concentration is low, the measurable rate of reaction is slow due to the decreased likelihood of forming enzyme-substrate complexes. However, as the substrate concentration increases, the enzyme becomes saturated with substrate. Adding additional substrate will then not significantly affect the rate of reaction, as the formation of product depends on the catalytic efficiency of the enzyme. Classically, cleavage of substrates by proteases was determined by monitoring the cleavage of peptides by high performance liquid chromatography (HPLC) analysis of an in-vitro reaction containing the peptide and enzyme; however, these experiments often miss the crucial initial rate of reaction (Coradin, Karch, & Garcia, 2017; Louis, Wondrak, Kimmel, Wingfield, & Nashed, 1999). Fluorogenic substrates, on the other hand, allow for continuous measurement of substrate cleavage over time using a fluorometer. As a result, fluorescence quenching assays are commonly used within the field to determine initial rates of reaction (Turunen, Rowan, & Blank, 2014). This assay is typically based on a double-labeled fluorogenic substrate: a peptide containing the cleavage site of the enzyme, with a fluorophore and a quencher. In the intact peptide, the fluorophore and quencher are in close proximity, so fluorescence is quenched upon excitation of the fluorophore (Karvinen et al., 2004). Cleavage of the substrate by a protease separates the fluorophore from the quencher, and the resulting fluorescence is measured in relative fluorescence units (RFU) over time.

1.4 Dicistrovirus

1.4.1 Classification and genome organization

Initially thought to be “picorna-like”, Dicistroviridae are monopartite, linear, +ssRNA viruses that range from 8-10 kilobases (Valles et al., 2017). Their genome contains two main non-overlapping ORFs, and this unique di-cistronic arrangement gives the family its name. The viral RNA genome also contains a 5' VPg and a 3' poly(A) tract (Bonning, 2009).
The upstream ORF encodes the nonstructural polyproteins while the downstream ORF encodes the structural polyproteins. There are three genera of Dicistroviridae: Aparavirus, Cripavirus, and Triatovirus, all of which are defined by the phylogenetic analysis of their characteristic intergenic region (IGR) IRES and structural proteins (Figure 1.5) (Valles et al., 2017). Transmission of most dicistroviruses is by the fecal-oral route or vertical transmission, with replication of the virus primarily occurring within the gut and eventually shedding virus particles into the gut lumen, where the virus accumulates in the feces. Additionally, the virus may also replicate in the nervous tissue, epidermal cells, fat body, and gonads (Hertz & Thompson, 2011; Kuyumcu-Martinez et al., 2004; Valles et al., 2017). Virus infection may be asymptomatic or may lead to intestinal illness, paralysis, and eventually death. Drosophila C Virus (DCV), but not cricket paralysis virus (CrPV), infects smooth muscles. Following consumption of contaminated food, DCV may enter the cell, replicate, and synthesize capsid protein, causing intestinal obstruction, whereas CrPV causes paralysis of the hind legs and eventually leads to death as a result of dehydration and starvation (Lautié-Harivel, 1992; Reinganum, O'Loughlin, & Hogan, 1970).

Figure 1.5 Dicistrovirus capsid structure and general phylogeny. (A) Illustration of the packing of dicistrovirus surface proteins VP1, VP2, and VP3. These proteins are arranged in a trimer that interlocks with other trimers to form the capsid. VP4 is located on the inner surface of the capsid. (B) X-ray crystal structure of the CrPV virion. (C) Negative stain electron micrograph of Triatoma virus. (D) Phylogenetic tree of dicistrovirus genera based on the phylogenetic distance of structural proteins. Reproduced with permission from (Bonning, 2009) and (Valles et al., 2017).
Dicistroviruses are thought to enter the cell by clathrin-mediated endocytosis (Cherry et al., 2006). Upon entry, the virus uncoats and releases its genome into the cytoplasm. Release of the genome results in immediate translation via the recruitment of ribosomes to the 5' IRES and production of a polyprotein that is subsequently cleaved by the virally encoded 3C proteinase (Nakashima & Ishibashi, 2010; Nakashima & Nakamura, 2008). The viral genome undergoes replication by the formation of a replication complex (Khong et al., 2016). This complex is composed of Golgi apparatus and cytosolic vesicles mediated by the coat protein complex I and fatty acid biosynthesis enzymes associated with the virus-induced vesicles for RNA replication (Cherry & Silverman, 2006). Using the genome as a template, a complementary replicative intermediate -ssRNA is synthesized, and new genomic +ssRNA is subsequently synthesized from the -ssRNA template. Details of these events are still poorly understood. Translation of ORF 2 occurs by the IGR IRES, which is a compact RNA structure that binds directly to the 40S ribosomal subunit and assembles 80S ribosomes without the aid of translation initiation factors (Wilson, Pestova, Hellen, & Sarnow, 2000; Wilson, Powell, Hoover, & Sarnow, 2000). In CrPV, it is known that pseudoknot I or domain III of the IGR IRES mimics the anticodon loop of a tRNA structure in order to initiate translation in the A site of the ribosome (Au & Jan, 2012). ORF 2 translation results in the production of the structural polyprotein, which is subsequently cleaved by 3C to form the capsid proteins. The capsids then associate with the newly synthesized +ssRNA to form progeny virions that are released by cell lysis. Some dicistroviruses, such as DCV, are not lytic and thus persistently infect the host, as the virus is in a state of equilibrium with the immune system, specifically the RNAi pathway (Bonning, 2009; Swevers, Liu, & Smagghe, 2018).
Dicistroviruses are of interest as they are known to infect arthropods, some of which have an impact on agriculture (Bonning & Miller, 2009). CrPV is one of the most studied dicistroviruses and has served as a powerful model for understanding virus-host interactions in insect cells (Miller & Ball, 2012).

1.4.2 Cricket Paralysis Virus

CrPV, belonging to the genus Cripavirus, was first isolated in 1970 from the Australian field crickets Teleogryllus oceanicus and Teleogryllus commodus. The crystal structure of the virus was solved in 1999, revealing a spheroidal non-enveloped particle that adopts a similar capsid conformation to that of picornaviruses with the exception of the VP4 and VP2 domains (Tate et al., 1999; Wilson, Powell, et al., 2000). Like other dicistroviruses, CrPV contains two ORFs encoding the nonstructural and structural polyproteins that are processed by its virally encoded 3C proteinase. Its genome also encodes a conserved DXEXNPGP 2A peptide that “autocleaves” the 1A-2B precursor protein, which is only seen in a subset of dicistroviruses (Asgari & Johnson, 2010; Nakashima & Ishibashi, 2010). Additionally, CrPV encodes a 1A protein, which acts to counteract the host RNA interference silencing mechanism and the formation of stress granules (Khong et al., 2017; Nayak et al., 2010, 2018). Lastly, dicistroviruses encode four VPgs, which are generally known to prime the RNA during replication; however, neither the function of each VPg in CrPV nor how it is processed during viral infection is known (Nakashima & Shibuya, 2006). Other CrPV viral proteins, such as 2B, 2C, 3C, and RdRp, are thought to be similar in function to those of picornaviruses; however, they have not been fully characterized.

1.4.3 Dicistrovirus 3C proteases and their cleavage specificity

Like many +ssRNA viruses, dicistroviruses encode at least one viral proteinase, known as the 3C-like proteinase.
The primary role of the dicistrovirus 3C proteinase is to process most of the viral polyproteins, based on phylogenetic analysis of the polyprotein 3C cleavage sites (Nakashima & Nakamura, 2008). However, the dicistrovirus 3C-like protease has not been characterized extensively and its substrate specificity is not known other than the identified polyprotein cleavage sites. The 3C proteases from dicistroviruses are closely related to picornavirus proteases, which are thought to be chymotrypsin-like due to their similarities in predicted structure (Bazan & Fletterick, 1988). Like picornaviruses, dicistrovirus 3C proteinases have a conserved catalytic triad consisting of a Cys, a His, and a third amino acid, aspartic acid, which is not common in picornavirus 3C proteinases (Figure 1.6) (Nakashima & Ishibashi, 2010; Nakashima & Nakamura, 2008). It is predicted that the CrPV 3C-like protease cleaves at Q/D, Q/G, Q/C, Q/A, or Q/V (in no particular order), based on the cleavage sites of the dicistrovirus polyprotein. To date, very little is known about the 3C-like dicistrovirus protease, which will be explored in this thesis.

Figure 1.6 Sequence alignment of dicistrovirus 3C-like protease. (A) The viral genome of cricket paralysis virus, with a VPg protein covalently linked to the 5' end and a 3' poly(A) tail. The genome is organized with a 5' internal ribosome entry site (IRES), followed by the first open reading frame encoding the nonstructural proteins, an intergenic region IRES (IGR IRES), and the second open reading frame encoding the structural proteins. The virally encoded 3C proteinase (cleavage sites marked with red down arrows) processes most of the polyproteins from both ORF 1 and 2, while the 2A peptide (blue down arrow) cleaves the 1A protein upon translation, and VP4/VP3 are cleaved by a conserved Asp (yellow down arrow) on VP1. (B) Sequence alignment of the dicistrovirus 3C-like protease, with the catalytic cysteine indicated with an asterisk and the histidine indicated by a red box.
Reproduced with permission from (Nakashima & Nakamura, 2008).

1.5 Approaches to identify candidate substrates

1.5.1 Classical and new approaches of identification

Viral proteases play an essential role in virus infection by targeting host substrates, so it is important to identify these substrates. The challenge is to identify neo-N-termini in a complex protein lysate that also contains the original protein N-termini. Prior to the advancement of proteomics, two-dimensional gel electrophoresis coupled with proteomics, candidate approaches, and bioinformatics were the classical methods of substrate identification. Unfortunately, these techniques have their limitations (Chandramouli & Qian, 2009). Substrates identified by bioinformatics approaches are based on the analysis of the preferential consensus cleavage site. This does not take into account the availability of the substrate (i.e. compartmentalization) nor the accessibility of the cleavage site (i.e. protein fold), and additionally neglects substrates that do not contain a consensus sequence. 2-D gel electrophoresis, on the other hand, is time consuming (Person et al., 2006). As a result, these techniques are slow and often unreliable. Novel unbiased techniques have since been developed to identify host substrates. The most common proteomic approach to identifying substrates is by N-terminomics techniques. The first technique in its field, combined fractional diagonal chromatography (COFRADIC), is a proteomic approach enabling the global analysis of protease cleavage sites (Van Damme et al., 2009). In this method, the protease cleaves substrates; the cysteines on proteins are acetylated; and the proteins are digested, run through strong cation exchange chromatography, thereby removing any pyroglutamyl residues, and analyzed by mass spectrometry (MS) (Van Damme et al., 2009; Vizovišek, Vidmar, Fonović, & Turk, 2016).
For example, the COFRADIC protocol identified host substrates of HIV-1 protease by transfecting an expression vector in cells to express the protease (R. N. Wagner, Reed, & Chanda, 2015). This protocol revolutionized proteomics; however, there are still limitations. COFRADIC requires large amounts of starting material, and proteins containing a histidine at the neo-N-terminus or a non-C-terminal arginine are lost due to the cation exchange step (Demon et al., 2009). To address these limitations, other techniques such as terminal amine isotopic labeling of substrates (TAILS) were developed. Unlike COFRADIC, TAILS does not require large starting samples, nor are samples with a histidine at the neo-N-terminus or a non-C-terminal arginine lost. Its only limitation is that it cannot account for post-translational modification. TAILS uses an unbiased labelling technique that blocks the N-terminus (Kleifeld et al., 2010). Briefly, TAILS starts with cell lysate that contains cleaved substrates, which may have been cleaved in-vivo or in-vitro. The samples are then labeled/blocked at the N-terminus and digested. Any unblocked N-termini are then removed with a polyglyceraldehyde polymer (Figure 1.7) (Kleifeld et al., 2010).

Figure 1.7 Workflow of TAILS. Schematic of the TAILS workflow. Reproduced with permission from (Kleifeld et al., 2010).

A similar method, proteomic identification of protease cleavage sites (PICS), has been established. PICS requires the production of a peptide library prior to the addition of the protease. The peptide libraries are made by digesting cell lysates with either trypsin or GluC, then reducing and blocking cysteines by acetylation. The primary amines are blocked at the N-terminus, and finally the samples are run through the MS to identify cleaved substrates (Figure 1.8) (Schilling & Overall, 2008). Libraries are then incubated with the protease of interest, resulting in neo-N-terminal peptides.
Peptides that are not blocked can be biotinylated with sulfo-NHS-SS-biotin via a reactive moiety that crosslinks with biotin at primary amines. Biotinylated samples are pulled out using streptavidin beads and eluted by the reduction of the disulfide bond on the biotin linker. Samples are then run through the MS for quantification and detection of peptides, resulting in an iceLogo that details the cleavage site of the protease (Figure 1.8) (Schilling & Overall, 2008). The primary advantage of PICS is that it determines peptide specificity and quickly profiles substrate specificity. Unfortunately, some proteases require full-length protein substrates rather than peptides, so this technique may not be compatible with all proteases (Barrett & McDonald, 1986). Recently, host substrates cleaved by the Zika viral proteases were determined using a technique known as subtiligase-mediated N-terminomics (Hill et al., 2018). In this method, cells were grown in medium containing light or heavy lysine/arginine and subjected to Zika virus infection or buffer. Neo-N-termini were then labeled by subtiligase with a biotinylated peptide, isolated with streptavidin beads, and run on the MS (Hill et al., 2018). Finally, proteomics of protease substrate identification is shifting toward label-free methods that do not rely on labelling of the peptide substrates (Byrum et al., 2018).

Figure 1.8 Workflow of PICS. PICS workflow reproduced and adapted with permission from (Schilling & Overall, 2008).

1.6 Thesis approach

While studies have been conducted on viral proteases such as those of PV and HIV, to name a few, very little is known about dicistroviral 3C proteases. One study on dicistroviral 3C proteases has shown their role in the processing of the viral polyprotein (Nakashima & Ishibashi, 2010), but their substrate and cleavage site specificity have yet to be identified.
Additionally, the kinetic properties of CrPV 3C, which dictate its cleavage efficiency, have not been characterized. Given that other +ssRNA viruses encode a viral protease that cleaves host substrates during infection, I want to address which fundamental processes are regulated during viral infection. I hypothesize that CrPV 3C protease cleaves host substrates during cricket paralysis virus infection to aid in viral replication. Thus, the objective of my thesis is to purify the 3C protease, characterize its kinetics, and determine its cleavage site specificity. To accomplish this, the following aims are addressed:

Objective 1: Purify and verify activity of CrPV 3C protease

Aim 1a-1) Clone and express recombinant CrPV 3C protease

I first cloned the CrPV 3C protease into an Escherichia coli (E. coli) expression vector containing a solubility tag, and expressed it in a classical E. coli expression system. The objective of this aim is to find the optimal expression conditions that would result in soluble, folded, functional 3C protein.

Aim 1a-2) Purify and determine kinetic properties of CrPV 3C

Once the optimal expression conditions were determined, we set out to purify the protease and determine its activity. The activity was determined using 2 different methods: 1) a fluorogenic peptide containing a cleavage site from the polyprotein, and 2) an in-vitro reaction to synthesize a polyprotein of the second ORF from the infectious clone, which could be cleaved by the purified protease.

Objective 2: Determine cleavage site specificity of CrPV 3C protease

Aim 2) Determine cleavage site specificity by PICS

Finally, once the protease had been purified and determined to be active, we employed a proteomic approach, PICS, to determine the cleavage site specificity of the protease using two separate peptide libraries that were generated from E. coli lysates.
Chapter 2: Materials and Methods

2.1 Generation of plasmid GST-tagged CrPV 3C

The full-length open reading frame of the cricket paralysis virus 3C protease (nucleotides 3562-4355, accession number KP974707.1) was PCR amplified using touchdown PCR from the pCrPV-3 infectious clone (Kerr et al., 2015) and cloned into the pGEX-6P-1 vector using the restriction sites BamHI and SalI to generate the plasmid pGEX-CrPV3C. The primers containing the BamHI and SalI sites were 5'-CTAGGGATCCTGCAGCGACCCAGCAGCTCAT-3' and 5'-CTAGGTCGACCTACACTGTAATGTTATTTACTGGG-3', respectively. The catalytically inactive mutant 3C protease Cys211Ala was generated by site-directed mutagenesis using the primer 5'-CGCCTACTCAAACAGGAGATGCGGGATCTATAGTAGGTCTTTACA-3' and its reverse complement 5'-TGTAAAGACCTACTATAGATCCCGCATCTCCTGTTTGAGTAGGCG-3'. All clones were sequence verified.

2.2 Optimization of expression of CrPV 3C

pGEX-CrPV3C was transformed into BL21, C41, and C43 E. coli strains and grown overnight at 37°C in an incubator on an ampicillin plate. The next day, a colony was picked and an overnight liquid culture was grown in LB broth. The culture was then subcultured at a 1:100 dilution. The bacterial culture was grown to OD600 0.9-1.0 in LB broth, induced with 1 mM IPTG at 25°C, 30°C, or 37°C, grown for 2 or 4 hours post induction, and lysed by French press or sonication in 1x PBS and 20% glycerol. GST CrPV 3C expression and solubility were monitored by loading whole cells or lysates onto 12% SDS-PAGE followed by Coomassie blue staining.

2.3 GST CrPV 3C Purification and cleavage conditions

GST CrPV 3C and GST CrPV 3C (Cys211Ala) were expressed from the pGEX-6P-1-based plasmids and purified using glutathione beads. Plasmids were transformed into C41 cells, and cultures were subcultured at a 1:100 dilution, grown as 1 L of culture in a 4 L flask to OD600 0.9-1.0, and then induced with 1 mM IPTG for 4 hours at 25°C. Cells were harvested by centrifugation at 6500 g for 7 min at 4°C and stored at -80°C until ready for lysis.
Cells were thawed on ice, resuspended in cold PBS with 20% glycerol, and then lysed by microfluidizer at a pressure of 15k with a total of three passages. Lysates were centrifuged at 45000 g for 45-50 min at 4°C. The supernatant was loaded onto a glutathione S-transferase (GST) column equilibrated with lysis buffer. The column was washed with 5 column volumes of cold PBS with 20% glycerol and then eluted with cold 50 mM Tris pH 7.5, 20% glycerol, and 10 mM reduced glutathione. Elution fractions were monitored for purified protein by SDS-PAGE and Coomassie blue staining. Fractions containing purified protein were dialyzed in 3 L of 20 mM HEPES pH 7.5, 100 mM NaCl, and 20% glycerol in a 3 kDa molecular weight cut-off dialysis membrane for 2 hours, which was repeated two more times, and then finally dialyzed in 5 L of buffer overnight with stirring. The purity of the dialyzed protein was then tested, and its concentration was determined by SDS-PAGE, comparing to increasing concentrations of BSA, and by Bradford assay. All purification steps were conducted at 4°C, unless otherwise stated. Samples were aliquoted, flash frozen, and stored at -80°C.

To remove the GST tag, purified GST CrPV 3C was incubated with GST-HRV 3C protease in 150 mM NaCl and 50 mM Tris pH 7.5 at 4°C overnight (Yuan et al., 2007). The reaction was then incubated with GST beads to retrieve the GST tag and GST-HRV 3C. The flow-through was collected, then dialyzed twice in a 3 kDa molecular weight cut-off dialysis membrane with 20 mM HEPES pH 7.5, 100 mM NaCl, and 20% glycerol for at least 2 hours with stirring. All purification steps were conducted at 4°C, unless otherwise stated. Samples were aliquoted, flash frozen, and stored at -80°C.
2.4 Determination of GST CrPV 3C stability

Purified GST CrPV 3C (1 mg/mL) in buffer containing 20 mM HEPES pH 7.5, 100 mM NaCl, 20% glycerol, and SYPRO Orange was incubated with increasing temperature, 1°C per minute from 25°C to 65°C, using a real-time thermocycler to detect fluorescence (AB applied biosystems, Brown lab). Stability of the protease was also tested by incubating 3C protease in dialysis buffer at 30°C and taking an aliquot every five minutes. Aliquots were spun for 10 minutes at 15 g and the supernatant was loaded on a 12% SDS-PAGE followed by Coomassie blue staining.

2.5 Determination of protease activity by Fluorescence quenching assay

Two 10-amino acid peptide sequences consisting of wild-type RIVAQVMGED and mutant ARIVAEPMGED were chemically synthesized by Biomatik with the N-terminal fluorophore 7-methoxycoumarin and the C-terminal quencher 2,4-dinitrophenol. The peptides had purities of 97.63% and 95.71%, respectively, and were dissolved in 100% DMSO prior to use. Experiments were carried out at an excitation of 335 nm and emission of 395 nm in a fluorometer (Perkin Elmer Spectrometer LS50 B, Dieter Bromme lab) and measured over 500 seconds at room temperature. The total reaction volume was 1 mL in a plastic cuvette. Relative fluorescence units (RFU) were standardized for all kinetic experiments with trypsin-treated wild-type fluorogenic peptide substrate, incubated overnight at 4°C with 1:100 of trypsin to 20 µM of peptide. Emission of the trypsin-cleaved peptides was read the next day. Purified 3C protease, tagged or untagged (0.1 µM), in buffer containing 20 mM HEPES pH 7.5 and 100 mM NaCl was incubated with 5 µM wild-type fluorogenic peptide substrate. Optimal buffer conditions were then determined. HEPES ranging in pH from 5.5 to 8 was used to determine the optimal pH, and additives of 1 mM DTT, 5 mM EDTA, 0.01% Brij35, 0.01% TritonX, and 0.01% Tween were tested. The minimum concentration of enzyme was determined by varying the concentration of GST CrPV 3C (0, 0.05, 0.1, 0.2, or 0.5 µM).
Kinetic activity of GST CrPV 3C was determined by incubating increasing concentrations of wild-type peptide substrate with purified GST-3C (0.05 µM) in buffer consisting of 20 mM HEPES pH 7.5, 100 mM NaCl, 0.01% Brij 35, and 1 mM DTT. The initial rate of the enzyme reaction was determined on the Perkin Elmer Spectrometer LS50 B from the slope of the progress curve over its linear range. Initial rates were plotted against varying enzyme or substrate concentration in Prism 6. Michaelis-Menten plots were generated with the “Enzyme kinetics- Michaelis-Menten” analysis, and a general bar graph was generated for buffer optimization.

2.6 In-vitro translation reaction

The pCrPV-3 and pCrPV-3ORF1-STOP infectious clones (Kerr et al., 2015) were linearized overnight with Ecl136II. In-vitro transcription reactions using the linearized DNA as template were performed as described (Wang & Jan, 2014). Reactions were DNase I treated, the in-vitro transcribed RNA was purified using RNA cleanup columns (RNeasy kit, Qiagen), and the integrity of the RNA was determined by visualization on an agarose gel with SafeView dye. In-vitro translation reactions were performed in a 10 µL reaction consisting of 2-3 µg of RNA, 6.5 µL of Sf21 cell extract and 0.3 µL of [35S]-Met/Cys at 30℃ for 2 hours. To monitor cleavage, 0.05 µg of WT GST CrPV 3C or mutant GST CrPV 3C was added to the completed in-vitro translation reactions and incubated at 30℃ for 1 hour. Reactions were then loaded on a 12% SDS-PAGE gel, which was dried and analyzed by phosphorimager (Amersham Typhoon).

2.7 Proteomic Identification of Cleavage Sites

PICS was performed on E. coli K12 cell lysate, following the published protocol (Schilling, Huesgen, Barré, auf dem Keller, & Overall, 2011). Briefly, cells were lysed in 1% SDS and 200 mM HEPES pH 7.5 and tip probe sonicated.
Lysates were incubated with 10 mM DTT for 60 min at 25℃. Samples were then blocked with iodoacetamide in the dark for 1 hour at 25℃ and precipitated with 3 mL ice-cold water, 4 mL ice-cold methanol, and 1 mL chloroform. The pellet was washed with cold methanol, left to dry briefly, resuspended in 100 mM NaOH, and brought to 200 mM HEPES pH 7.5. Samples were then spun for 10 min at 4℃ at 20,000 g. Protein concentration was determined by absorbance at 280 nm (Nanodrop), and samples were digested at 1:100 (w/w) with 1 mg/mL trypsin or GluC overnight at 37℃. The next morning, 1 mM PMSF was added to stop digestion, and peptides were labeled with final concentrations of 30 mM light formaldehyde and 30 mM sodium cyanoborohydride for 4 hours at 25℃. The reaction was then stopped with a final concentration of 100 mM Tris pH 8. Samples were then acidified with 10% formic acid, stage tipped, and run through MS/MS. Samples were then resuspended in 20 mM HEPES and 100 mM NaCl and divided into 200 µg aliquots to generate the peptide library. 6 µg of GST CrPV 3C WT or mutant recombinant protein was added to each respective library and digested overnight at 30℃. Samples were then biotinylated at N-terminal ends with 0.05 mM sulfo-NHS-SS-biotin for 2 hours at 22℃ and incubated with 300 µL of streptavidin sepharose for 30 min at 22℃. Samples were washed multiple times with 50 mM HEPES, 150 mM NaCl pH 7.5 and eluted with 1 mM DTT, 50 mM HEPES, 150 mM NaCl pH 7.5. Samples were then acidified and stage tipped. The protocol was followed as described in the publication (Schilling, auf dem Keller, & Overall, 2011).

Data were analyzed on Mascot using the E. coli database “eco_scaffold”. Parameters were as follows (all other parameters were left at their standard settings): the enzyme chosen was either “semi V8-DE” or “semi Tryp”, “carbamidomethyl, dimethyl” were chosen, and variable modifications consisted of dimethyl (N-term), oxidation, and thioacyl (N-term). Results from Mascot were then analyzed on Scaffold.
Samples were sorted by thioacyl and biotinylation and exported into an Excel sheet. Exported data were sorted for thioacyl and duplicates were removed, giving the total number of peptides and the number of biotinylated peptides, denoted as thioacyl-containing and biotinylated-containing, respectively. Peptides were then submitted to the Overall lab WebPICS webserver for analysis to generate the IceLogo. Within WebPICS, clip-pics was chosen for the analysis. The library chosen was E. coli with its corresponding enzyme used for digestion. A normalized heatmap and IceLogo were then generated.

Chapter 3: Optimization, purification and kinetics of 3C protease

3.1 Background

Viral proteases target and cleave substrates in order to modulate cellular processes to promote infection (Chase, Daijogo, & Semler, 2014; Jagdeo et al., 2018). As a first step to identify these host targets, I set out to purify the CrPV 3C protease and characterize its activity. I purified the full-length tagged and untagged CrPV 3C protease, as well as the mutant CrPV Cys211Ala. I monitored CrPV 3C activity via a fluorogenic peptide cleavage assay and through an in-vitro cleavage assay using in-vitro synthesized CrPV polyprotein.

3.2 Results

3.2.1 Expression of CrPV 3C

The first objective was to express and purify a recombinant CrPV 3C protease. Because recombinant 3C proteases from other RNA viruses such as PV 3C have been purified using an E. coli expression system (Nicklin, Harris, Pallai, & Wimmer, 1988), I reasoned that the CrPV 3C protease could also be expressed in a well-established E. coli expression system. The CrPV 3C protease was PCR amplified from the CrPV-3 infectious clone (Kerr et al., 2015) and cloned into a pGEX6p1 vector. The resulting vector, pGEX6p1-CrPV 3C, contains a GST tag on the N-terminus, followed by a human rhinovirus (HRV) 3C cut site containing the sequence LEVFQ/GP, and then the CrPV 3C (Figure 3.1).
I also generated a catalytically inactive CrPV 3C by mutating cysteine 211, which is the nucleophile of the catalytic triad, to alanine by site-directed mutagenesis. Alanine was chosen due to its small size and because it contains a chemically inert methyl side chain. The goal was to purify a recombinant GST-tagged CrPV 3C protease and remove the GST tag after purification to obtain an untagged CrPV 3C.

Figure 3.1 Vector map of fusion protein. The vector map of recombinant expression plasmid pGEX-6P-1 GST CrPV 3C. This plasmid contains the Tac promoter (Ptac), GST tag, HRV 3C cut site (indicated by the red arrow), and the CrPV 3C protease.

pGEX6p1-CrPV 3C was transformed into BL21, C41, and C43 E. coli cells. These bacterial strains were chosen in order to determine the optimal expression system to obtain soluble protein. BL21(DE3) is commonly used for recombinant protein expression and expresses the classical T7 RNA polymerase gene from bacteriophage, resulting in robust expression of proteins driven by the T7 promoter (Dumon-Seignovert, Cariot, & Vuillard, 2004). In comparison, the C41 cells, derived from BL21, contain a mutation in the lacUV5 promoter that controls the T7 RNA polymerase gene, which slows the expression of T7 RNA polymerase. This allows toxic proteins to be produced at a lower rate, preventing cell death from toxic protein accumulation (Dumon-Seignovert et al., 2004; Kwon, Kim, Lee, & Kim, 2015; S. Wagner et al., 2008). C41 primarily contains mutations in the genes proY, melB, ycgO, and yhhA, which are not present in C43, while C43 has mutations in genes such as dcuS, fur, lacI, and lon (Kwon et al., 2015). pGEX6p1-CrPV 3C was transformed into all three strains and induced with isopropyl β-D-1-thiogalactopyranoside (IPTG) to trigger transcription of the lac operon. Protein expression was induced at 25, 30 or 37ºC for 2 or 4 hours (Figure 3.2).
Cells were harvested and lysed by microfluidizer and the lysates were run on a 12% SDS-PAGE gel followed by Coomassie staining. The molecular mass of the GST CrPV 3C fusion protein is 59 kDa. GST CrPV 3C was observed to be expressed at all time points after induction, with expression highest at 4 hours in all conditions.

Figure 3.2 Expression of recombinant fusion protein. (A) Coomassie stained gels of E. coli whole cell lysates. Recombinant expression plasmid pGEX-6P-1 GST CrPV 3C protease was transformed into BL21, C41, and C43 E. coli and induced at OD = 0.6 with 1 mM IPTG for 2 or 4 hours at 25ºC, 30ºC, and 37ºC. “0” indicates uninduced protein. Whole cells were harvested, run on 12% SDS-PAGE and visualized by Coomassie blue staining. (B) Fraction of recombinant GST CrPV 3C protease in supernatant (S) and pellet (P) after French press lysis of E. coli cells. Shown are Coomassie stained gels of lysates of the indicated E. coli strains grown at 25ºC, 30ºC, and 37ºC at 4 hours after induction with IPTG. Cells were harvested, lysed by French press, and centrifuged at 16 RCF to remove aggregates, and lysates were run on 12% SDS-PAGE and visualized by Coomassie blue staining. (C) Fraction of recombinant GST CrPV 3C protease or GST-DCV (Drosophila C virus) 3C in supernatant (S) and pellet (P) after sonication of E. coli cells. Shown are Coomassie stained gels of lysates of BL21 grown at 25ºC, 30ºC, and 37ºC at 4 hours after induction with IPTG. Cells were harvested, lysed by sonication, and centrifuged at 16 RCF to remove aggregates, and lysates were analyzed by 12% SDS-PAGE and visualized by Coomassie blue staining.

To determine the optimal lysis method to obtain soluble GST CrPV 3C protein, we used two approaches: sonication and French press. The sonication method resulted in approximately 90% of GST CrPV 3C protein in the supernatant fraction as opposed to the pellet, suggesting that GST CrPV 3C is soluble using this lysis approach (Figure 3.2C, lane 1).
However, sonication is known to cause certain proteins to aggregate and may cause protein unfolding (Stathopulos et al., 2004). By contrast, lysis by the French press method led to approximately 50% of protein in the supernatant fraction (Figure 3.2B, C41 lane 1). When comparing the two lysis methods, protease expressed in BL21 cells lysed by French press was completely insoluble, but when lysed by sonication ~90% of the protein was in the soluble fraction (Figure 3.2B, BL21 lane 1). C41 E. coli cells with induction at 25ºC were chosen as the optimal expression system as they provided the most soluble GST CrPV 3C protein by the French press method (Figure 3.2B, C41 lanes 1 and 2). Given the discrepancy between the two lysis methods, French press was chosen as the primary lysis method, as solubility of the protein should not change depending on the method of lysis. Purification of the 3C protease from Drosophila C virus (DCV), another dicistrovirus closely related to CrPV, was also attempted; however, the protease was always present in the insoluble fraction (Figure 3.2C).

3.2.2 Purification of CrPV 3C and cleavage of GST tag

Both the wild-type and catalytically inactive GST CrPV 3C (Cys211Ala) proteases were purified using glutathione beads. The eluted fractions contained the wild-type 59 kDa GST CrPV 3C protease fusion protein and an additional protein at ~25 kDa. The ~25 kDa protein is likely the GST tag (26 kDa), released by cleavage at the HRV 3C protease site (Figure 3.3A), as CrPV 3C appears to have partial cleavage activity towards the HRV 3C cut site. The cut site for HRV 3C is between Q/G, and this P1-P1’ cleavage site is also present in the CrPV polyprotein. In support of this, purification of the catalytically inactive GST CrPV 3C only resulted in the 59 kDa protein (Figure 3.3B). Purity of the proteins was determined by SDS-PAGE analysis and protein concentration was determined by Bradford assay (Figure 3.3C).

Figure 3.3 Purification of recombinant GST CrPV 3C and GST CrPV 3C (Cys211Ala). (A) Wild-type (WT) GST CrPV 3C and (B) GST CrPV 3C (Cys211Ala) were purified by glutathione affinity chromatography from C41 E. coli after induction with 1 mM IPTG at 25ºC for 4 hours. Supernatant (S), pellet (P), flowthrough (FT), wash, and elution fractions from GST-tag purification were analyzed by SDS-PAGE and visualized by Coomassie blue staining. GST CrPV 3C was eluted with 50 mM Tris pH 7.5 and 10 mM reduced glutathione. Elution fractions 1-6 were pooled and dialyzed against 100 mM NaCl, 20 mM HEPES pH 7.5, and 20% glycerol. (C) Purified protein was analyzed for purity and concentration by 12% SDS-PAGE against varying concentrations of BSA.

To determine whether the GST tag may impede 3C protease activity, GST CrPV 3C was incubated with recombinant GST HRV 3C in order to induce cleavage at the HRV 3C site. The cleaved GST and GST HRV 3C were removed by incubation with glutathione beads, resulting in untagged CrPV 3C in the flow-through (Figure 3.4A). Removal of the GST tag did lead to loss of purified untagged CrPV 3C, which may be due to the untagged protein being unstable (Figure 3.4B), where instability in this case is defined as the protein precipitating upon cleavage of the GST tag. Despite the GST pull-down, small quantities of cleaved GST were still present in the fractions containing the purified untagged CrPV 3C (Figure 3.4C). In summary, both wild-type GST CrPV 3C and the GST CrPV 3C (Cys211Ala) mutant were purified using glutathione beads.
HRV 3C cleavage of the fusion protein resulted in removal of the GST tag and near isolation of the untagged CrPV 3C wild-type and mutant protease. Traces of GST are still present in the purified untagged CrPV 3C.

Figure 3.4 Purification of untagged CrPV 3C. (A) Coomassie stained SDS-PAGE gel of recombinant wild-type GST CrPV 3C and mutant GST CrPV 3C (Cys211Ala) after incubation with 100 µg of HRV 3C overnight at 4ºC with gentle rocking. An HRV 3C protease site is located between the GST and CrPV 3C. Arrows indicate proteins post cleavage. (B) Coomassie stained gel of cleaved proteins before and after dialysis against 20 mM HEPES pH 7.5, 100 mM NaCl, and 20% glycerol. (C) Purified wild-type and mutant GST CrPV 3C (Cys211Ala) protease were analyzed on 12% SDS-PAGE and concentration was determined by comparison to increasing concentrations of BSA. GST tag and HRV 3C were removed using GST beads and the 3C protease was collected in the flowthrough.

3.2.3 Buffer conditions for CrPV 3C protease activity

It is important to establish that the protease remains stable at physiological temperature over time, as well as to determine whether site-directed mutagenesis changes the overall fold of the protease. I first tested the thermal stability of the purified GST CrPV 3C and GST CrPV 3C (Cys211Ala) proteins. This was achieved by incubating the purified GST CrPV 3C or GST CrPV 3C (Cys211Ala) in a buffer containing sypro orange at increasing temperatures and detecting fluorescence using a thermocycler (Y. Liu et al., 2014).
Proteins fold into thermodynamically favorable conformations that minimize the exposure of hydrophobic side chains to water. Sypro orange binds nonspecifically to hydrophobic regions, leading to fluorescence, whereas in the presence of water the fluorescence of sypro orange is quenched (Ciulli, 2013). Thus, as the protease is heated, the protein starts to unfold, exposing these hydrophobic regions and allowing sypro orange to bind with the exclusion of water, resulting in increased fluorescence (Figure 3.5). For both the wild-type and Cys211Ala mutant proteins, fluorescence starts to increase at ~40ºC and peaks at 55ºC, suggesting that the GST CrPV 3C protease starts to unfold at 40ºC and is completely unfolded at 55ºC. Importantly, the point mutation Cys211Ala did not significantly change the thermal profile, as the protein unfolds in a similar pattern, indicating that this mutation does not change the overall structure of the protease. To determine whether the protease is stable over time, 3C protease was incubated at 30ºC and an aliquot was taken every 5 minutes for 30 minutes. This was done to determine whether the protein precipitated over time, which would appear as a gradual decrease in band intensity. It also shows whether GST CrPV 3C undergoes autocleavage. I chose an incubation temperature of 30ºC because the host for CrPV infection, Drosophila melanogaster, prefers a temperature range of ~25-28ºC (Dillon, Wang, Garrity, & Huey, 2009). Aliquots were then run on an SDS-PAGE gel and stained with Coomassie dye. As shown in Figure 3.5B, the protease is relatively stable for 30 min at 30ºC. All subsequent kinetic reactions were done at room temperature, as the fluorometer used is unable to manipulate or maintain temperature.

Figure 3.5 Stability of GST CrPV 3C.
(A) Wild-type and mutant GST CrPV 3C were incubated with sypro orange in 20 mM HEPES pH 7.5, 100 mM NaCl, and 20% glycerol, and relative fluorescence was assessed while increasing the temperature in 1ºC per min increments. (B) Protein stability of GST CrPV 3C over time was assessed by incubating the protease at 30ºC for 30 min in 20 mM HEPES pH 7.5, 100 mM NaCl. Aliquots were removed at 5 min increments and analyzed on a 12% SDS-PAGE gel.

To monitor 3C protease activity, we used two approaches. For the first, fluorogenic peptide substrates were synthesized consisting of the CrPV ORF 2 VP3/VP1 3C cleavage site, amino acids ARIVAQ/VMGEDL, with a conjugated N-terminal fluorophore MCA and a C-terminal quencher DNP. VP3/VP1 is thought to be cleaved first during the processing of the ORF 2 polyprotein (Reavy & Moore, 1983); therefore, that cleavage site was utilized in the determination of CrPV 3C catalytic efficiency. Moreover, cleavage of VP3/VP1 is also conserved for the 3C proteases of DCV, ALPV (Aphid lethal paralysis virus), and TrV (Triatoma virus), other viral proteases in the dicistrovirus family that are similar to the CrPV 3C protease (Nakashima & Nakamura, 2008). We also synthesized a cleavage-resistant peptide, ARIVAE/PMGEDL, that should not be cleaved by CrPV 3C. The cleavage-resistant peptide was designed based on the substrate specificity of the PV 3C protease. Based on modeling with the software program Phyre2, CrPV 3C is predicted to have a similar fold and mechanism of action to PV 3C (Figure 5.1A). Changing Q/V to E/P renders the site resistant to cleavage by PV 3C, thus we altered these same amino acids within the cleavage-resistant peptide (Jagdeo et al., 2018). Incubating 0.1 µM of purified GST CrPV 3C or CrPV 3C protease with 5 µM of the wild-type fluorogenic substrate resulted in increasing fluorescence over time, and the initial velocity increased as the enzyme concentration was increased (Figure 3.7).
Higher concentrations of enzyme with substrate result in an eventual plateau as all substrate has been converted to product (Figure 3.7, shown in purple). Cleavage was specific, as incubation of the catalytically inactive Cys211Ala 3C with the wild-type peptide substrate or, conversely, of the wild-type 3C with the cleavage-resistant peptide substrate resulted in minimal or no fluorescence (Figure 3.6A). To further confirm that activity was specific, the HRV 3C protease was incubated with the wild-type peptide substrate, which resulted in negligible fluorescence. In summary, CrPV 3C protease was purified and shown to be active.

Figure 3.6 Buffer optimization of GST CrPV 3C. Optimization of GST CrPV 3C cleavage activity. (A) GST CrPV 3C, untagged 3C CrPV, HRV 3C or mutant GST 3C (Cys211Ala) at 0.1 µM concentration was incubated with 5 µM of WT fluorogenic substrate or 20 µM of cleavage-resistant (CR) peptide in 20 mM HEPES pH 7.5, 100 mM NaCl buffer, and fluorescence at excitation 335 nm and emission 395 nm was detected using a Perkin Elmer Spectrometer LS50. (B) GST CrPV 3C protease at a concentration of 0.1 µM and 5 µM of WT fluorogenic substrate were incubated in 100 mM NaCl and 20 mM HEPES with pH ranging from 5.5-8.0. (C) The indicated additives were incubated with 5 µM WT fluorogenic substrate and 0.1 µM of GST CrPV 3C in 20 mM HEPES pH 7.5, 100 mM NaCl (Buffer) with either 1 mM DTT, 5 mM EDTA, 0.01% Brij 35, 0.01% Triton X, or 0.01% Tween. Shown are averages from at least three technical replicates (N=1). Error bars represent standard deviation.

Figure 3.7 Determination of minimum amount of GST CrPV 3C. Optimization of minimum amount of GST CrPV 3C required to observe cleavage of fluorogenic peptide.
(A) 5 µM of wild-type fluorogenic substrate in 20 mM HEPES pH 7.5 and 100 mM NaCl was incubated with 0, 0.05, 0.1, 0.2 or 0.5 µM of GST CrPV 3C, N=1. (B) Linear relation of enzyme concentration and initial rate.

To determine the kinetic parameters of the 3C protease, I measured the initial rates of the 3C protease cleavage activity by incubating increasing amounts of fluorogenic substrate with 3C protease. GST-tagged CrPV 3C resulted in an initial rate V0 of 0.023 µM/s while the untagged 3C resulted in a V0 of only 0.014 µM/s. This result suggests that the GST tag does not interfere with the 3C protease and that the inclusion of the GST tag may be enhancing the stability and/or specificity of CrPV 3C (Figure 3.6A). This could be tested by measuring the thermal stability of the untagged CrPV 3C. Using the tagged GST CrPV 3C, I next optimized buffer conditions such as pH, detergents and reducing agents for maximal cleavage activity. Varying the pH, I found that a pH between 7 and 7.5 is optimal for 3C protease activity (Figure 3.6B). pH 7.4 was chosen as that is the physiological pH of Drosophila (Massie, Williams, & Colacicco, 1981), although it should be noted that pH 7 appears to give a better initial rate. Moreover, we tested different detergent conditions, as detergents have been shown to affect protease recoverability (Ezgimen, Mueller, Teramoto, & Padmanabhan, 2009). Adding 0.01% Brij 35, 0.01% Triton X or 0.01% Tween resulted in a V0 of 0.019, 0.012, and 0.019 µM/s, respectively (Figure 3.6C). Finally, as 3C is a cysteine protease, we also tested the addition of the reducing agent DTT (1 mM) in order to ensure that the catalytic cysteine is reduced (Wilkesman, 2017). Addition of DTT resulted in an initial rate V0 of 0.02 µM/s, an approximately 1.2-fold increase in 3C activity compared to no DTT addition (Figure 3.6C). I also tested the addition of EDTA to see if it enhanced stability of the protease.
Adding EDTA to the reaction resulted in an initial rate V0 of 0.022 µM/s (Figure 3.6C). In summary, the optimal conditions for 3C cleavage activity are 20 mM HEPES pH 7.5, 100 mM NaCl, 0.01% Brij 35, and 1 mM DTT, and I used these conditions for all subsequent experiments. I next varied the concentration of enzyme in order to determine the optimal concentration to use in a reaction. It was determined that 0.05 µM of enzyme is sufficient to obtain a slope that did not result in complete conversion of substrate to product (Figure 3.7). Finally, the substrate concentration was varied to ascertain the range required to reach Vmax. I incubated the fluorogenic peptide (1-5, 10, 15, and 20 µM) with 0.05 µM of GST CrPV 3C and measured fluorescence over time. To standardize the relative fluorescence units, WT fluorogenic peptide was incubated with trypsin overnight, which would cleave after the arginine within the WT peptide. A standard curve was then made from the average emission at each concentration (Figure 3.8B). The initial velocity increases with increasing substrate concentration until a point is reached at which the initial velocity no longer depends on the substrate concentration. I used purified GST CrPV 3C from three preparations in order to assess reproducibility. In summary, protein preparations 1, 2, and 3 resulted in a Km of 2.2-7.3 µM, a kcat of 0.32-1.2 s⁻¹, and a kcat/Km of 1.4-2.4 × 10⁵ M⁻¹ s⁻¹. Specifically, the GST CrPV 3C had a Km of 2.6 µM and a kcat of 0.65 s⁻¹ for preparation 2, while preparation 1 had a Km of 2.2 µM and a kcat of 0.32 s⁻¹. This gave a GST CrPV 3C catalytic efficiency, kcat/Km, of 1.4 × 10⁵ and 2.4 × 10⁵ M⁻¹ s⁻¹ for preparations 1 and 2, respectively. Preparation 3 resulted in a Km of 7.3 µM and a kcat of 1.2 s⁻¹. The kcat/Km for preparation 3 was 1.7 × 10⁵ M⁻¹ s⁻¹, which is comparable to preparation 1.
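
As a worked check of the arithmetic, the sketch below recomputes kcat/Km from the Km and kcat values quoted above and evaluates the Michaelis-Menten rate equation for one condition. The function names are illustrative. Because the inputs are the rounded values from the text, the results can differ slightly from the reported efficiencies.

```python
def catalytic_efficiency(kcat_per_s, km_uM):
    """Return kcat/Km in M^-1 s^-1, given kcat in s^-1 and Km in µM."""
    return kcat_per_s / (km_uM * 1e-6)


def initial_rate(kcat_per_s, km_uM, enzyme_uM, substrate_uM):
    """Michaelis-Menten initial rate in µM/s: v0 = kcat*[E]*[S] / (Km + [S])."""
    return kcat_per_s * enzyme_uM * substrate_uM / (km_uM + substrate_uM)


# (kcat in s^-1, Km in µM) as quoted in the text for each preparation.
REPORTED = {
    "preparation 1": (0.32, 2.2),
    "preparation 2": (0.65, 2.6),
    "preparation 3": (1.2, 7.3),
}

efficiencies = {
    name: catalytic_efficiency(kcat, km) for name, (kcat, km) in REPORTED.items()
}
for name, value in efficiencies.items():
    print(f"{name}: kcat/Km = {value:.2e} M^-1 s^-1")

# Predicted initial rate for preparation 2 with 0.05 µM enzyme and 5 µM substrate.
v0_prep2 = initial_rate(0.65, 2.6, enzyme_uM=0.05, substrate_uM=5.0)
print(f"preparation 2, 5 µM substrate: v0 = {v0_prep2:.4f} µM/s")
```

With these rounded inputs the efficiencies come out near 1.5, 2.5, and 1.6 × 10⁵ M⁻¹ s⁻¹, within about 0.1 × 10⁵ of the reported values.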
A titration curve with a tight-binding irreversible inhibitor that fits into the active site of the protease is needed to determine the actual amount of active protease.

Figure 3.8 Michaelis-Menten kinetics of GST CrPV 3C. Determination of enzyme kinetics of GST CrPV 3C. (A) Increasing concentrations of wild-type fluorogenic substrate were incubated with 0.05 µM of GST CrPV 3C in 20 mM HEPES pH 7.4, 100 mM NaCl, 0.01% Brij 35, and 1 mM DTT. Shown are averages from at least two technical replicates over 3 days using three separate preparations (N=3, for each protein purification). (B) Standard curve of relative fluorescence units against MCA concentrations at 1-5 µM and 10 µM, used to calibrate data. Shown are averages from at least three technical replicates for each preparation (N=1).

3.2.4 CrPV 3C In-vitro translation of polyprotein

The second approach for testing the activity of the GST CrPV 3C protease was to use an in-vitro translation approach and test whether the 3C protease can cleave the CrPV polyprotein. Briefly, using an infectious clone developed by the Jan lab (Kerr et al., 2015), we expressed a CrPV ORF2 polyprotein in an in-vitro Sf21 insect translation lysate. Specifically, we used a mutant CrPV infectious clone containing a stop codon insertion in ORF1, thus preventing expression of the 3C protease (see Figure 3.9 for schematic). We also incorporated radioactive [35S]-Met/Cys for detection of the CrPV ORF2 polyprotein (Figure 3.9, lane 2). Incubation of purified GST CrPV 3C protease with the extract containing the [35S]-Met/Cys-labeled ORF2 polyprotein resulted in smaller MW protein bands and loss of polyprotein, suggesting cleavage of the polyprotein by GST CrPV 3C (Kerr et al., 2015) (Figure 3.9, lane 3). In lane 1, both ORFs are translated and processed, resulting in the production of nonstructural and structural proteins. The ORF 1 stop prevents the production of nonstructural proteins, but ORF2 can still be translated (Figure 3.9, lane 2).
Specifically, we observed four bands at ~60, 37, 32, and 29 kDa, which, when compared to the known MW of CrPV proteins, suggests that the bands correspond to the precursor VP1-4, VP3+VP4 (VP0), VP2/VP3, and the mature VP2 and VP1 (Figure 3.9, lane 3). In support of this, the MW of these bands is similar to that observed in extracts containing the wild-type infectious clones, which undergo 3C protease-mediated polyprotein processing, producing the precursor and mature proteins (Kerr et al., 2015) (Figure 3.9). To test specificity, the catalytically inactive mutant 3C protease was incubated with the polyprotein, which did not result in cleavage (Figure 3.9, lane 4). In summary, purified GST CrPV 3C is able to cleave the CrPV ORF2 polyprotein, whereas the catalytically inactive mutant is not.

Figure 3.9 In-vitro synthesis of CrPV-2 and CrPV-ORF1-STOP. A schematic of the in-vitro translation reaction of the CrPV-2 or CrPV-ORF1-STOP infectious clone in Sf21 cell extract and [35S]-Met/Cys (shown in red). The in-vitro translation reaction is incubated for 1 hour at 30ºC to produce a polyprotein that is cleaved by CrPV 3C, lane 1, and ORF 1 polyprotein. Purified GST CrPV 3C or GST CrPV 3C (Cys211Ala) was added to the completed in-vitro translation reaction for 1 hour at 30ºC, resulting in in-vitro polyprotein cleavage of CrPV-ORF1-STOP with GST CrPV 3C, lane 3, or no cleavage of CrPV-ORF1-STOP with GST CrPV 3C (Cys211Ala), lane 4.

3.3 Discussion

Previous studies on virally encoded proteases have shown that the protease plays a role in aiding viral infection by cleaving host substrates (Bonderoff et al., 2008; Jagdeo et al., 2018). In order to identify substrates of CrPV 3C, an in-vitro approach is one option, as it allows for the determination of direct candidate substrates.
Although little is known about this protease, as it has never been fully characterized before, the cleavage site specificity can be inferred from the cleavage sites in the CrPV polyprotein. Using an E. coli expression system, CrPV 3C protease was purified with a GST tag on the N-terminus in C41 cells. The primary aim was to purify soluble, active protein. After 4 hours of expression at 25ºC, the GST CrPV 3C protease was present in the supernatant. To further determine whether the GST tag affects the activity of the protease, GST was removed using the HRV 3C cut site present between GST and CrPV 3C (Figure 3.1). Removal of the GST tag resulted in nearly pure CrPV 3C; however, there are trace amounts of GST tag in the purified protein. Additionally, tag removal often resulted in loss of protein yield due to precipitation of protein, indicating the protease may be unstable without the tag (Figure 3.3C, 3.4C). On-bead tag removal was attempted (data not shown); however, this resulted in complete precipitation of protein. Both tagged and untagged versions of purified protease were shown to be active (Figure 3.6A), indicating that the GST tag does not impede the activity of the protease and suggesting that the GST-tagged version may have better activity. Upon further analysis of the kinetics of this viral protease, there is great variability from preparation to preparation, which may be a result of two things: 1) the amount of active protein in each preparation is different, or 2) the concentration of protein used from each preparation varies, resulting in a variable range (refer to Appendix A.1 and A.2 for other purifications).

Chapter 4: Determination of cleavage site using PICS

4.1 Background

Viral proteases play essential roles in the cleavage of not only the viral polyprotein but also host substrates during viral infection (Jagdeo et al., 2018; Pacini et al., 2000).
The cleavage site specificity of well-characterized proteases such as MMP-2 (matrix metalloproteinase-2) has been determined using novel proteomic techniques such as PICS (Schilling & Overall, 2008). The identification of the cleavage site specificity may provide insight into how these viral proteases recognize substrates. The cleavage site specificity of viral proteases for the polyprotein and target substrates is usually similar (Wei, Meller, & Jiang, 2013). In some instances, however, viruses employ host proteases to aid in the cleavage of their polyprotein, so only polyprotein cleavage sites cleaved by the viral protease should be considered. Previous studies determined cleavage site specificities using individual peptides; however, this is time consuming. In this chapter, I address Aim 2 by determining cleavage site specificity using PICS. We employed an in-vitro approach to identify the cleavage site specificity of the CrPV 3C protease using PICS, with a trypsin-digested E. coli library to identify the P1 and P1’ cleavage site.

4.2 Results

4.2.1 Cleavage site specificity of CrPV 3C

The final objective in the characterization of CrPV 3C protease was to characterize the CrPV 3C cleavage site specificity in an in-vitro system. PICS is an in-vitro approach that allows for the direct determination of cleavage site specificities (Schilling, auf dem Keller, et al., 2011). The PICS workflow is shown in Figure 1.8. PICS utilizes a peptide library that is proteome-derived, allowing for screening (Schilling, auf dem Keller, et al., 2011). Screening is achieved by profiling the prime and non-prime specificity of proteases by identifying tens to hundreds of individual cleavage products (Schilling, auf dem Keller, et al., 2011). Peptide libraries were generated by digestion with trypsin, which cleaves after lysine or arginine, or GluC, which cleaves after glutamic or aspartic acid (Schilling, auf dem Keller, et al., 2011).
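
The library-generation step can be illustrated with a minimal in-silico digestion sketch. It applies only the simple rules stated above (trypsin cleaves after K or R, GluC after E or D). The example sequence is hypothetical, and real digestion also involves missed cleavages and sequence-context effects that are ignored here.

```python
def digest(protein, cleave_after):
    """Split a protein sequence after every residue found in `cleave_after`."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        if residue in cleave_after:
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Keep the C-terminal remainder that does not end in a cleavage residue.
        peptides.append(protein[start:])
    return peptides


TRYPSIN = frozenset("KR")  # cleaves after lysine or arginine
GLUC = frozenset("ED")     # cleaves after glutamic or aspartic acid

protein = "MARIVAQVMGEDLKTPEWR"  # hypothetical sequence for illustration only

tryptic_peptides = digest(protein, TRYPSIN)
gluc_peptides = digest(protein, GLUC)
print("trypsin:", tryptic_peptides)
print("GluC:   ", gluc_peptides)
```

In this toy case every tryptic peptide ends in K or R, whereas the GluC peptides end in E or D (apart from the C-terminal remainder), so each library fixes a different residue at its peptide C-termini.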
The use of two different proteases increases the peptide coverage, allowing profiling of residues that are confined to the C-terminal end of peptides in a library generated by a particular endoprotease (Schilling, auf dem Keller, et al., 2011). We chose E. coli (K12) cells because their proteome has been well characterized (Han & Lee, 2006). Samples are then labeled at the N-terminal ends with light formaldehyde. The resulting labeled peptides represent the peptide library. Purified protease is added to the labeled peptide library, generating neo-N-termini, which are then biotinylated with sulfo-NHS-SS-biotin. Peptides whose N-termini were labeled with light formaldehyde are protected and will not be biotinylated, whereas the neo-N-termini of the prime-side products generated by CrPV 3C are not protected and will be biotinylated. Samples are then incubated with streptavidin to isolate the cleaved peptides, which are eluted and analyzed by MS to identify the cleavage site specificity. The corresponding non-prime-side sequences are then derived bioinformatically (Schilling, Huesgen, et al., 2011; Schilling & Overall, 2008).

I reasoned that the E. coli peptide library would be best for determining the cleavage site specificity. At the peptide level, the protease cannot differentiate between peptides generated from E. coli and those from Drosophila. E. coli peptide libraries have been used successfully to determine the cleavage site specificity of proteases such as chlamydial protease-like activity factor (CPAF) and a human endogenous retrovirus protease (Biniossek et al., 2016; Schilling, Huesgen, et al., 2011). In some instances, however, proteases require specific modifications on peptides (e.g. glycosylation) in order for cleavage to occur, and these may not be present in the E. coli peptides (Schilling, Huesgen, et al., 2011; Schilling & Overall, 2008).

PICS was first performed using the well-characterized protease GluC, incubated with the trypsin-digested E. coli library, in order to ensure that the assay works correctly and reproduces the known cleavage site specificity. From the peptides identified using PICS, the amino acids surrounding the cleavage sites can be determined, and the data are then analyzed with iceLogo in order to identify the enriched amino acids at the P1-P4 and P1’-P4’ positions. The GluC protease with the trypsin-digested E. coli peptide library generated 4,961 total identified peptides, of which 3,952 were biotinylated, indicating a 79.7% enrichment efficiency. The normalized heat map was compiled from 843 cleaved peptides (Figure 4.1A). Peptides were identified in the program Scaffold, and biotinylated peptides were selected by their thioacyl modification (Schilling, auf dem Keller, et al., 2011). Biotinylated peptides were then analyzed using the webserver WebPICS, which generated 843 cleaved peptides (http://clipserve.clip.ubc.ca/pics/index.html; Schilling & Overall, 2008). WebPICS uses bioinformatic analysis to build multiple sequence alignments that are analyzed as sequence logos, to exclude small peptides that correspond to mature protein termini or internal peptides, and to collapse repeated sequences into a minimum consensus (Schilling & Overall, 2008). For each cleavage site, the preceding 10 amino acids are identified. The iceLogo for GluC specificity was generated using ICEPICS and shows the cleavage site specificity of GluC from P4 to P4’. As reported, GluC cleaves after glutamic acid and, to a lesser extent, aspartic acid (Figure 4.1) (Schilling & Overall, 2008). I conclude that the control experiment worked, validating that the PICS protocol is operational. I could therefore perform PICS using the CrPV 3C protease to determine its cleavage site specificity.
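The enrichment efficiency quoted for the GluC control is the percentage of identified peptides that carry the biotin-derived modification. As a minimal sketch of that arithmetic (the function name `enrichment_efficiency` is illustrative and is not part of Scaffold or WebPICS), the value can be reproduced from the two counts given in the text:

```python
def enrichment_efficiency(biotinylated: int, total: int) -> float:
    """Return the percentage of identified peptides that were biotinylated."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= biotinylated <= total:
        raise ValueError("biotinylated must be between 0 and total")
    return 100.0 * biotinylated / total


# GluC control with the trypsin-digested E. coli library (counts from the text).
gluc_control = enrichment_efficiency(biotinylated=3952, total=4961)
print(f"GluC control enrichment efficiency: {gluc_control:.1f}%")
```

Applying the same function to the counts reported for GST CrPV 3C with the trypsin library (92 biotinylated of 3,135 identified peptides) gives approximately 2.9%, which makes the much lower enrichment in the CrPV 3C experiment explicit.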
Given that the control worked, I next incubated the purified GST CrPV 3C protease with either the trypsin or the GluC library (refer to appendix Figure A.3). For the trypsin library, 3,135 total peptides were identified, of which 92 were biotinylated, indicating an enrichment efficiency of 2.9%. Biotinylated peptides were then analyzed using the webserver WebPICS to identify 24 cleaved peptides, from which the preceding 10 amino acids were identified. A normalized heat map of these 24 cleavage sites was generated with ICEPICS (Figure 4.2A). The iceLogo showed that CrPV 3C preferentially cleaves between glutamine at P1 and alanine, threonine, or asparagine at P1’ (Figure 4.2B, Table 4.1).

Figure 4.1 GluC cleavage site specificity using PICS of a trypsin-digested E. coli library. (A) Normalized heat map of the GluC cleavage site specificity from P6-P6’ and (B) its generated iceLogo from P3-P3’. Cleavage site specificity was determined using PICS with a trypsin-digested E. coli library (trypsin cleaves after R or K at the P1 position). The specificity of GluC is indicated, with E and D the preferred residues at P1.

Figure 4.2 GST CrPV 3C cleavage site specificity using PICS in a trypsin-digested E. coli library. (A) Normalized heat map of the CrPV 3C cleavage site specificity from P6-P6’ and (B) its generated iceLogo from P3-P3’. Cleavage site specificity was determined using PICS with a trypsin-digested E. coli library (trypsin cleaves after R or K at the P1 position). The specificity of CrPV 3C is indicated, with Q the preferred amino acid at P1 and A, T, or N at P1’.

Table 4.1 List of possible cleavage sites. List of all possible cleavage sites mediated by CrPV 3C in both libraries (refer to appendix Figure A.3 for GluC), and the known cleavage sites within the CrPV polyprotein. “-” refers to no preference, “/” refers to preference for either amino acid.
| Cleavage site | P6 | P5 | P4 | P3 | P2 | P1 | P1’ | P2’ | P3’ | P4’ | P5’ | P6’ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PICS trypsin library | - | W | M/V | V | A/F | Q | A/T/N | Y | P | E | P | C/E |
| PICS GluC library | - | - | - | K/P/R | A/T/Y | G/R | A/G/K/H | - | K/V | P | N | P |
| Polyprotein 2B/2C | K | I | P | G | K | Q | D | W | D | N | Y | I |
| Polyprotein 2C/3A | S | T | T | V | A | Q | G | G | S | E | T | S |
| Polyprotein VPg/3C | K | E | A | E | T | Q | G | C | S | D | P | A |
| Polyprotein 3C/RdRp | N | N | I | T | V | Q | C | C | F | E | P | P |
| Polyprotein VP2/VP4 | A | R | I | Y | A | Q | A | A | K | E | L | K |
| Polyprotein VP3/VP1 | S | R | I | V | A | Q | V | M | G | E | D | Q |

In summary, using PICS to determine the cleavage site specificity of a well-characterized protease, GluC, on a trypsin-digested E. coli library resulted in the anticipated preference for E or D at the P1 position, thus validating the approach in my hands. CrPV 3C protease was also found to prefer glutamine at the P1 position and threonine, asparagine, or alanine at P1’ in the trypsin-digested E. coli library, which is similar to the cleavage site preference in the CrPV polyprotein (Table 4.1). Unfortunately, the GluC-digested E. coli library with CrPV 3C resulted in inconsistent data (Table 4.1, Appendix A.3). The preferred cleavage site did not overlap with that in the trypsin-digested E. coli library (Figure A.1 C). More replicates are required to draw a firm conclusion, and the GluC-digested E. coli library should be revisited.

4.3 Discussion

Previously, it was shown that the virally encoded CrPV 3C protease cleaves its own polyprotein during infection (Nakashima & Ishibashi, 2010); however, its candidate substrates and cleavage site specificity are not known (Figure 1.6A). The polyprotein cleavage sites show a strong preference for Q at the P1 position; however, these sites have been optimized for the virally encoded protease. The PICS data suggest that the CrPV 3C protease prefers Q at the P1 position and A, T, or N at the P1’ position in the trypsin library.
This appears to be in line with the CrPV polyprotein cleavage specificity (Table 4.1). With regard to PICS using the GluC library (refer to appendix Figure A.3), the 3C cleavage site specificity did not align with that observed with the trypsin library. Without more replicates, it is difficult to assess whether these represent the bona fide cleavage specificity of CrPV 3C.

Chapter 5: Conclusion

5.1 Discussion

5.1.1 Purification of tagged and untagged CrPV 3C

Many +ssRNA viruses encode one or several proteases that cleave their own viral polyprotein during infection, with the exception of a few viruses that require both host and viral proteases to cleave the polyprotein (e.g. HCV). Viral proteases have also been found to cleave host substrates in order to facilitate viral infection (Jagdeo et al., 2018; Lévêque & Semler, 2015). Dicistrovirus proteases have yet to be fully characterized. Previous studies on dicistrovirus proteases identified the cleavage sites in the viral polyprotein of Plautia stali intestine virus (PSIV) (Nakashima & Ishibashi, 2010; Nakashima & Nakamura, 2008). Extensive studies on the substrate specificity and kinetics of dicistrovirus proteases have not been pursued. In this thesis, I have purified the CrPV 3C protease and determined its catalytic efficiency and cleavage site specificity.

In order to determine the specificity, the protease must first be purified. In Chapter 3, I purified CrPV 3C protease from C41(DE3) cells induced with IPTG at 25°C for 4 hours. Lysis of induced cells resulted in a 1:1 ratio of soluble to insoluble protein, and purification using glutathione bead pulldowns resulted in active protein with an approximate yield of 10 mg. Cleavage of the GST tag was possible; however, there was significant loss of the untagged CrPV 3C, likely because the GST tag enhances stability (Young, Britton, & Robinson, 2012).
In previous studies, the C41(DE3) strain has been used when BL21(DE3) cells cannot over-produce a toxic protein, which may cause bacterial cell death when overexpressed (Dumon-Seignovert et al., 2004). Induction of the recombinant CrPV 3C protease in BL21(DE3) cells resulted in 100% of the protein in the pellet upon lysis (Figure 3.3C), likely because high levels of expression of the tagged 3C result in the formation of aggregated protein, otherwise known as inclusion bodies (Palmer & Wingfield, 2004). Inclusion bodies form under conditions such as high temperature during protein expression, where the desired protein is expressed at a high translational rate that exhausts the quality control system and leads to aggregation of partially folded and misfolded protein (Palmer & Wingfield, 2004; Singh, Upadhyay, Upadhyay, Singh, & Panda, 2015). The recombinant protein can be purified from inclusion bodies; however, the challenge is to solubilize the protein and refold it into its native and biologically active state (Palmer & Wingfield, 2004). C41 cells, on the other hand, contain a mutation in the lacUV5 promoter that directs transcription of the T7 RNA polymerase ORF, resulting in lower expression of toxic proteins (Dumon-Seignovert et al., 2004; Kwon et al., 2015; S. Wagner et al., 2008). It could be that this lowered expression allows the protein to remain in the soluble fraction. A glutathione S-transferase (GST) tag was chosen because it has been shown to protect the recombinant fusion protein from intracellular proteolysis and to stabilize the protein as a monomer or homodimer (Young et al., 2012). I found that the GST-tagged CrPV 3C protease partitioned in a 1:1 ratio between the soluble fraction and the pellet.
Cleavage of the GST tag was attempted in order to purify the CrPV 3C protease; however, this often resulted in significant loss of untagged protein, as indicated by the formation of a precipitate in the cleavage reaction, and the tag was therefore left on for downstream analysis (Figure 3.4). In Chapter 3, I found, upon optimization of the kinetic parameters, that the GST-tagged CrPV 3C had higher protease activity. Buffer conditions were optimized for maximal protease activity (20 mM HEPES pH 7.4, 100 mM NaCl, 0.01% Brij35, and 1 mM DTT). The GST tag was also thought to improve the solubility of the protease during the determination of its peptide cleavage kinetics; however, the addition of the tag may lead to the formation of dimers (Lim et al., 1994). In a previous study in which GST was crystallized, two monomers were found to form a dimer-paired asymmetric unit (Lim et al., 1994).

To date, there is no structure of a dicistrovirus protease. An in-silico prediction of the CrPV 3C protease structure was generated using Phyre2 (Figure 5.1A) and shows a fold similar to that of PV 3C (refer to Figure 1.4B for PV 3C). However, this in-silico prediction may not be an accurate model of the CrPV 3C protease. Based on phylogenetic analysis, the dicistrovirus 3C protease utilizes a catalytic triad consisting of a conserved nucleophilic cysteine, a histidine, which acts as a base that polarizes the nucleophile, and an aspartic acid, which acts as an acid that stabilizes the complex (Figure 5.2) (Nakashima & Ishibashi, 2010; Nakashima & Nakamura, 2008). I confirmed in Chapter 3 that mutating the cysteine to alanine resulted in a mutant protease that is catalytically inactive, confirming that the cysteine at position 211 is the nucleophile. It will be of interest to elucidate the structure of CrPV 3C protease in order to identify inhibitors for this protease, as has been done with the HIV protease.
Moreover, the structure may provide insight into the specificity for its substrates and possibly that of its closely related protease, DCV 3C (Figure 5.3) (Thaisrivongs et al., 1996). Knowing how different the CrPV 3C protease is from DCV 3C is of interest because DCV is known to cause persistent infections, meaning there is a possibility that these proteases cleave different substrates (Nayak et al., 2010). However, structural determination often requires large quantities of purified untagged protease, and given that the yield of the protease upon cleavage of the tag is an issue, the CrPV 3C protease may need to be truncated to improve the solubility of untagged CrPV 3C (Kim et al., 1996). Truncations of 10 amino acids at the 5’ end, at the 3’ end, or at both ends could be constructed, based on the in-silico fold prediction using Phyre2, which indicates disordered domains (Figure 5.1B). It may even be of interest to purify recombinant CrPV 3C protease attached to the RdRp. Studies on the PV 3C protease indicate that it may function while still attached to the RdRp (Chase et al., 2014). Given that it is not known whether there is a pro form of CrPV 3C, it is possible that it functions while attached to the RdRp; this could be tested with a 3C antibody that detects a 95 kDa band corresponding to 3C-RdRp by western blot.

Figure 5.1 In-silico prediction of the CrPV 3C structural fold. Using an in-silico prediction, (A) one possible structure of CrPV 3C without the GST tag and (B) a possible secondary structure fold, with disordered domains indicated with “?”. The structure was generated using Phyre2 software (Imperial College London).

Figure 5.2 Sequence alignment of CrPV 3C with other cysteine proteases.
Sequence alignment of the CrPV 3C protease against various cysteine proteases, such as Marnaviridae and Dicistroviridae proteases and those of unclassified RNA viruses; refer to appendix Table A-1 for the sequences used in the alignment. Highlighted regions indicate conserved regions.

Figure 5.3 Unrooted family tree of CrPV 3C with other cysteine proteases. An unrooted tree of the CrPV 3C protease against other +ssRNA viral cysteine proteases; refer to appendix Table A-1 for specific virus name, family, and accession number. Bootstrap values are indicated in green.

5.1.2 Characterization of GST CrPV 3C kinetics

In Chapter 3, I determined that the kcat of each preparation was significantly different. This may be because the amount of active enzyme could not be determined by titration, as there is no known inhibitor of the CrPV 3C cysteine protease, or because of inaccurate quantification of protein concentration. Protein preparations 1, 2, and 3 had a Km of 2.2-7.3 µM, a kcat of 0.65-1.2 $\mathrm{s^{-1}}$, and a kcat/Km of 1.4-1.6 $\times 10^{5}\ \mathrm{M^{-1}\,s^{-1}}$. The protease activity of the purified GST CrPV 3C was determined using a cleavage site from the CrPV polyprotein of ORF 2. To address the difference in activity, the amount of active protease in each preparation should be determined by titration with a cysteine protease inhibitor (Alan J Barrett & Kirschke, 1981). The inhibitor should be irreversible and tight binding, which ensures that the inhibitor does not dissociate from the active site and go on to inactivate another protease molecule. Classically, E-64 has been used in the titration of cysteine proteases such as cathepsins B, H, and L (Alan J Barrett & Kirschke, 1981). E-64 is a tight-binding, irreversible cysteine protease inhibitor (Alan J Barrett & Kirschke, 1981). However, incubation of the 3C protease with excess E-64 did not inhibit protease activity (data not shown). Alternatively, N-ethylmaleimide (NEM) could be used to determine the amount of active protein.
NEM reacts with thiol groups and thus should block the nucleophilic Cys of the 3C protease. However, the optimal conditions for determining 3C protease activity include DTT, which keeps the cysteines on the protein reduced but also competes with the active site cysteine for reaction with NEM. Thus, for typical NEM reactions, TCEP is used in place of DTT, and the protease activity reactions would need to be re-optimized with TCEP in the buffer. A few assumptions must also be made when using NEM: alkylation of cysteines on CrPV 3C and the GST tag other than the active site cysteine is assumed to be negligible and not to affect the ratio of available inhibitor to protease active sites, and replacing DTT with TCEP in the buffer is assumed not to significantly change the initial rate of reaction.

The catalytic efficiency with which GST CrPV 3C protease converts substrate to product can be compared to that of other viral proteases. A serine protease from HCV, NS3, has a considerably weaker affinity for its substrate (Km > 50 µM) than CrPV 3C. Its catalytic efficiency, however, is significantly better than that of CrPV 3C (Bianchi et al., 1996). Given that CrPV 3C is chymotrypsin-like in its fold, it can also be compared to a well-characterized serine protease, trypsin; the CrPV 3C protease has a lower catalytic efficiency than trypsin (Evnin, Vásquez, & Craik, 1990). It should be noted that such comparisons between CrPV 3C and other proteases may not accurately represent the catalytic efficiency of this protease.

In Chapter 3, I also found that the in-vitro synthesized CrPV polyprotein is cleaved upon addition of the wild-type CrPV 3C but not the mutant CrPV 3C (Cys211Ala), showing that the purified CrPV 3C is able to process the polyprotein as expected (Kerr et al., 2015).
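Catalytic efficiency in this section means the ratio kcat/Km. The sketch below only illustrates the unit handling, using the Km and kcat ranges reported for preparations 1, 2, and 3; the helper name `catalytic_efficiency` is hypothetical, and pairing the range endpoints is an assumption made purely to bound the ratio, not a statement about which kcat belongs to which preparation.

```python
def catalytic_efficiency(kcat_per_s: float, km_micromolar: float) -> float:
    """Return kcat/Km in M^-1 s^-1, given kcat in s^-1 and Km in micromolar."""
    if km_micromolar <= 0:
        raise ValueError("Km must be positive")
    km_molar = km_micromolar * 1e-6  # convert micromolar to molar
    return kcat_per_s / km_molar


# Ranges reported for preparations 1-3 in Section 5.1.2.
KCAT_RANGE = (0.65, 1.2)  # s^-1
KM_RANGE = (2.2, 7.3)     # micromolar

# Assumption: extremes are combined only to bound the possible ratio.
lowest = catalytic_efficiency(min(KCAT_RANGE), max(KM_RANGE))
highest = catalytic_efficiency(max(KCAT_RANGE), min(KM_RANGE))
print(f"kcat/Km bounds: {lowest:.2e} to {highest:.2e} M^-1 s^-1")
```

Under these assumptions both bounds are of order 10^5 M^-1 s^-1, which is the scale on which the comparisons with HCV NS3 and trypsin above should be read.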
One study suggested that processing of the 1A protein from the polyprotein is mediated by the 2A peptide, but it is not known whether there is an upstream 3C cleavage site that aids in the cleavage (Nakashima & Ishibashi, 2010). Additionally, it is not known whether the VPgs are processed by the virally encoded protease or by host cellular proteases. To test this, the catalytic cysteine can be substituted with alanine in an ORF 2 stop infectious clone. This would establish whether the cleavage of 1A is truly mediated by the 2A peptide, and not by a possible downstream 3C cleavage site, and whether the VPgs are cleaved by the 3C protease.

5.1.3 CrPV 3C protease specificity

Proteases cleave substrates with varying specificity; some are highly promiscuous, while others are specific for a particular substrate sequence (López-Otín & Bond, 2008). The specificity of a protease is determined by molecular interactions at the interface between the substrate and the binding cleft within the catalytic core of the protease, which accommodates the substrate (Marcotte et al., 2007; Mosimann, Cherney, Sia, Plotch, & James, 1997; Schauperl et al., 2015; Schechter & Berger, 1967). The ability of the protease to accommodate substrates in the cleft is determined by the size of the active site (Schechter & Berger, 1967). Amino acid side chains are accommodated within the subpockets of the protease. These subpockets are termed Sn-Sn’; amino acids with specific side chains can fit into a given pocket while others cannot (Neil D Rawlings, 2016; Schauperl et al., 2015). One good example of a protease with its binding cleft occupied by an inhibitor is coxsackie virus 3C protease with an ethyl amide inhibitor in its binding pocket (Becker et al., 2016) (Figure 5.4).
Proteases can also contain exosites, which are interaction surfaces outside the active site that can recruit substrates (Jabaiah, Getz, Witkowski, Hardy, & Daugherty, 2012).

In Chapter 4, I determined the cleavage site specificity of the CrPV 3C protease in a trypsin-digested E. coli library using PICS. The cleavage site specificity of CrPV 3C as determined by PICS is consistent with the polyprotein cleavage sites, with a Q in the P1 position and an A, T, or N at the P1’ position. Based on the iceLogo, A and T are equally preferred at P1’, with N preferred to a lesser extent. When comparing the substrate specificity to that of PV 3C protease, the P1 position is the same, but the P1’ position differs slightly from that of CrPV 3C. It should be emphasized that this may be due to the low enrichment of peptides recovered during the PICS protocol; more replicates are needed in order to solidify these conclusions.

Figure 5.4 Binding of substrate inhibitor in binding cleft of coxsackie virus 3C protease. Coxsackie virus 3C protease bound to a protease inhibitor, with the subpockets S1’, S1, S2, and S4 occupied by the inhibitor annotated. Reproduced with permission from (Becker et al., 2016) license (https://creativecommons.org/licenses/by/4.0/).

5.2 Summary and Future directions

Viruses utilize host substrates in order to facilitate the replication of their genomes (Chase et al., 2014; Jagdeo et al., 2018). Viruses encode proteases that cleave substrates to evade the host immune response, halt transcription and translation, and more (Bonderoff et al., 2008; Chase et al., 2014; Feng et al., 2014). For well-characterized viral proteases, these substrates have been identified; however, substrates cleaved by the CrPV 3C protease have yet to be identified.
Using data from PICS, a list of potential candidate substrates could be identified bioinformatically in Drosophila; however, this purely bioinformatic approach may not account for the accessibility of the substrate to the protease or its abundance in the cell. Using the purified protease, TAILS is the next logical step to identify substrates cleaved by the CrPV 3C protease in-vitro and in-vivo using a Drosophila library. TAILS has been used successfully to identify candidate substrates, but downstream analysis will be required to examine the physiological relevance of each substrate (Jagdeo et al., 2018; Kleifeld et al., 2010). The cleavage site specificity obtained from PICS could also be used to design fluorogenic peptides to measure cleavage activities, which could then be compared to those for the polyprotein cleavage sites.

Additionally, the construct contains an HRV 3C cut site, which is cleaved at Q/G, and this poses a few complications if the purified CrPV 3C protease (separated from the GST tag) is used in TAILS or PICS (Ullah et al., 2016). If HRV 3C is used, it is important that the final purified CrPV 3C does not contain any traces of HRV 3C, since CrPV 3C and HRV 3C are both viral cysteine proteases that recognize Q/G (Nakashima & Ishibashi, 2010; Ullah et al., 2016). Notably, HRV 3C has no cleavage activity on the fluorogenic peptide used to monitor CrPV 3C (Figure 3.6A). Cleavage of the GST tag may also result in a loss of active purified CrPV 3C; however, the extent of this loss is difficult to quantify in the supernatant. This could be avoided by purifying the CrPV 3C protease with a His tag or other tags, but that would require re-optimization of the purification and may not result in soluble protein.
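The bioinformatic candidate search mentioned above could begin with a simple motif scan. The sketch below is only an illustration: it assumes the trypsin-library preference (Q at P1 followed by A, T, or N at P1’), the helper name `find_candidate_sites` is hypothetical, and the test string is assembled from the VP2/VP4 row of Table 4.1 rather than taken from a Drosophila protein. A match says nothing about accessibility or abundance, so hits would still need experimental validation.

```python
import re

# Assumed motif from the trypsin-library PICS result: Q at P1, then A/T/N at P1'.
# The lookahead matches only the P1 residue, so m.start() is the P1 position.
CLEAVAGE_MOTIF = re.compile(r"Q(?=[ATN])")


def find_candidate_sites(sequence: str) -> list[int]:
    """Return 1-based positions of P1 residues that match the assumed motif."""
    return [m.start() + 1 for m in CLEAVAGE_MOTIF.finditer(sequence.upper())]


# Toy sequence: P6-P6' residues of the VP2/VP4 junction listed in Table 4.1.
vp2_vp4 = "ARIYAQAAKELK"
print(find_candidate_sites(vp2_vp4))
```

A fixed pattern like this ignores the weaker preferences at other positions; a position-specific scoring scheme built from the heat map would be a natural refinement, but it is beyond what the current number of cleavage sites can support.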
It may be of interest to purify other dicistrovirus 3C proteases, such as DCV as it is known to persistently infect its host, or IAPV as it a contributing factor in the decline of honeybees (Bonning, 2009; Chen et al., 2014; Swevers et al., 2013). Determining the variability of the substrate specificity and target host substrates may show how divergent these dicistroviruses are in their host substrates targets. Finally, it is of interest to find a tight protease inhibitor for the titration of this protease, and possibly for the use of drugs in agriculture. 84 Bibliography Ahlquist, P., Noueiry, A. O., Lee, W.-M., Kushner, D. B., & Dye, B. T. (2003). Host Factors in Positive-Strand RNA Virus Genome Replication. Journal of Virology, 77(15), 8181–8186. http://doi.org/10.1128/JVI.77.15.8181-8186.2003 Amatya, P. N., Kim, H.-B., Park, S.-J., Youn, C.-K., Hyun, J.-W., Chang, I.-Y., … You, H. J. (2012). A role of DNA-dependent protein kinase for the activation of AMP-activated protein kinase in response to glucose deprivation. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research, 1823(12), 2099–2108. http://doi.org/https://doi.org/10.1016/j.bbamcr.2012.08.022 Andino, R., & Domingo, E. (2015). Viral quasispecies. Virology, 479–480, 46–51. JOUR. http://doi.org/10.1016/j.virol.2015.03.022 Asgari, S., & Johnson, K. N. (2010). Insect virology. BOOK, Norfolk, UK: Caister Academic. Atkinson, H. J., Babbitt, P. C., & Sajid, M. (2009). The global cysteine peptidase landscape in parasites. Trends in Parasitology, 25(12), 573–581. http://doi.org/10.1016/j.pt.2009.09.006 Au, H. H. T., & Jan, E. (2012). Insights into Factorless Translational Initiation by the tRNA-Like Pseudoknot Domain of a Viral IRES. PLOS ONE, 7(12), e51477. Retrieved from https://doi.org/10.1371/journal.pone.0051477 Avanzino, B. C., Fuchs, G., & Fraser, C. S. (2017). Cellular cap-binding protein, eIF4E, promotes picornavirus genome restructuring and translation. 
Proceedings of the National Academy of Sciences of the United States of America, 114(36), 9611–9616. http://doi.org/10.1073/pnas.1704390114 Baltimore, D. (1971). Expression of animal virus genomes. Bacteriological Reviews, 35(3), 235–241. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/4329869 85 Barrett, A. J., & Kirschke, H. B. T.-M. in E. (1981). [41] Cathepsin B, cathepsin H, and cathepsin L. In Proteolytic Enzymes, Part C (Vol. 80, pp. 535–561). Academic Press. http://doi.org/https://doi.org/10.1016/S0076-6879(81)80043-2 Barrett, A. J., & McDonald, J. K. (1986). Nomenclature: protease, proteinase and peptidase. Biochemical Journal, 237(3), 935. Retrieved from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1147080/ Barrett, A. J., & Rawlings, N. D. (2001). Evolutionary lines of cysteine peptidases. Biological Chemistry, 382(5), 727—733. http://doi.org/10.1515/bc.2001.088 Barrett, A. J., Rawlings, N. D., Salvesen, G., & Fred Woessner, J. (2013). Introduction. In N. D. Rawlings & G. B. T.-H. of P. E. (Third E. Salvesen (Eds.) (pp. li–liv). Academic Press. http://doi.org/https://doi.org/10.1016/B978-0-12-382219-2.00838-3 Barton, D. J., O'Donnell, B. J., & Flanegan, J. B. (2001). 5′ cloverleaf in poliovirus RNA is a <em>cis</em>-acting replication element required for negative-strand synthesis. The EMBO Journal, 20(6), 1439 LP-1448. JOUR. Bazan, J. F., & Fletterick, R. J. (1988). Viral cysteine proteases are homologous to the trypsin-like family of serine proteases: structural and functional implications. Proceedings of the National Academy of Sciences of the United States of America, 85(21), 7872–7876. JOUR. Becker, D., Kaczmarska, Z., Arkona, C., Schulz, R., Tauber, C., Wolber, G., … Rademann, J. (2016). Irreversible inhibitors of the 3C protease of Coxsackie virus through templated assembly of protein-binding fragments. Nature Communications, 7, 12761. Retrieved from https://doi.org/10.1038/ncomms12761 Belov, G. A., Nair, V., Hansen, B. T., Hoyt, F. 
H., Fischer, E. R., & Ehrenfeld, E. (2012). Complex Dynamic Development of Poliovirus Membranous Replication Complexes. 86 Journal of Virology, 86(1), 302 LP-312. JOUR. Bianchi, E., Steinkühler, C., Taliani, M., Urbani, A., Francesco, R. De, & Pessi, A. (1996). Synthetic Depsipeptide Substrates for the Assay of Human Hepatitis C Virus Protease. Analytical Biochemistry, 237(2), 239–244. http://doi.org/https://doi.org/10.1006/abio.1996.0235 Biniossek, M. L., Niemer, M., Maksimchuk, K., Mayer, B., Fuchs, J., Huesgen, P. F., … Schilling, O. (2016). Identification of Protease Specificity by Combining Proteome-Derived Peptide Libraries and Quantitative Proteomics. Molecular & Cellular Proteomics : MCP, 15(7), 2515–2524. http://doi.org/10.1074/mcp.O115.056671 Blom, N., Hansen, J., Blaas, D., & Brunak, S. (1996). Cleavage site analysis in picornaviral polyproteins: discovering cellular targets by neural networks. Protein Science : A Publication of the Protein Society, 5(11), 2203–2216. Retrieved from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2143287/ Bonderoff, J. M., LaRey, J. L., & Lloyd, R. E. (2008). Cleavage of Poly(A)-Binding Protein by Poliovirus 3C Proteinase Inhibits Viral Internal Ribosome Entry Site-Mediated Translation . Journal of Virology, 82(19), 9389–9399. JOUR. http://doi.org/10.1128/JVI.00006-08 Bonning, B. C. (2009). The Dicistroviridae: An emerging family of invertebrate viruses. Virologica Sinica, 24(5), 415. JOUR. http://doi.org/10.1007/s12250-009-3044-1 Bonning, B. C., & Miller, W. A. (2009). Dicistroviruses. Annual Review of Entomology, 55(1), 129–150. http://doi.org/10.1146/annurev-ento-112408-085457 Brandenburg, B., Lee, L. Y., Lakadamyali, M., Rust, M. J., Zhuang, X., & Hogle, J. M. (2007). Imaging Poliovirus Entry in Live Cells. PLOS Biology, 5(7), e183. JOUR. Brix, K. (2014). Proteases: Structure and Function. Proteases: Structure and Function. BOOK, 87 Springer. Byrd, M. P., Zamora, M., & Lloyd, R. E. (2005). 
Translation of Eukaryotic Translation Initiation Factor 4GI (eIF4GI) Proceeds from Multiple mRNAs Containing a Novel Cap-dependent Internal Ribosome Entry Site (IRES) That Is Active during Poliovirus Infection. Journal of Biological Chemistry , 280(19), 18610–18622. http://doi.org/10.1074/jbc.M414014200 Byrum, S. D., Loughran, A. J., Beenken, K. E., Orr, L. M., Storey, A. J., Mackintosh, S. G., … Smeltzer, M. S. (2018). Label-Free Proteomic Approach to Characterize Protease-Dependent and -Independent Effects of sarA Inactivation on the Staphylococcus aureus Exoproteome. Journal of Proteome Research, 17(10), 3384–3395. http://doi.org/10.1021/acs.jproteome.8b00288 Castello, A., Alvarez, E., & Carrasco, L. (2011). The Multifaceted Poliovirus 2A Protease: Regulation of Gene Expression by Picornavirus Proteases. Journal of biomedicine & biotechnology (Vol. 2011). BOOK. http://doi.org/10.1155/2011/369648 Chandramouli, K., & Qian, P.-Y. (2009). Proteomics: Challenges, Techniques and Possibilities to Overcome Biological Sample Complexity. Human Genomics and Proteomics : HGP, 2009, 239204. JOUR. http://doi.org/10.4061/2009/239204 Chase, A. J., Daijogo, S., & Semler, B. L. (2014). Inhibition of Poliovirus-Induced Cleavage of Cellular Protein PCBP2 Reduces the Levels of Viral RNA Replication. Journal of Virology, 88(6), 3192 LP-3201. Retrieved from http://jvi.asm.org/content/88/6/3192.abstract Chaudhury, S., & Gray, J. J. (2009). Identification of structural mechanisms of HIV-1 protease specificity using computational peptide docking: implications for drug resistance. Structure (London, England : 1993), 17(12), 1636–1648. JOUR. http://doi.org/10.1016/j.str.2009.10.008 88 Chen, Y. P., Nakashima, N., Christian, P. D., Bakonyi, T., Bonning, B. C., Valles, S. M., & Lightner, D. V. (2012). Family--Iflaviridae. Chen, Y. P., Pettis, J. S., Corona, M., Chen, W. P., Li, C. J., Spivak, M., … Evans, J. D. (2014). 
Israeli Acute Paralysis Virus: Epidemiology, Pathogenesis and Implications for Honey Bee Health. PLOS Pathogens, 10(7), e1004261. Retrieved from https://doi.org/10.1371/journal.ppat.1004261
Cherry, S., Kunte, A., Wang, H., Coyne, C., Rawson, R. B., & Perrimon, N. (2006). COPI Activity Coupled with Fatty Acid Biosynthesis Is Required for Viral Replication. PLOS Pathogens, 2(10), e102.
Cherry, S., & Silverman, N. (2006). Host-pathogen interactions in Drosophila: new tricks from an old friend. Nature Immunology, 7(9), 911–917.
Choe, Y., Leonetti, F., Greenbaum, D. C., Lecaille, F., Bogyo, M., Brömme, D., … Craik, C. S. (2006). Substrate Profiling of Cysteine Proteases Using a Combinatorial Peptide Library Identifies Functionally Unique Specificities. Journal of Biological Chemistry, 281(18), 12824–12832. http://doi.org/10.1074/jbc.M513331200
Ciulli, A. (2013). Biophysical Screening for the Discovery of Small-Molecule Ligands. Methods in Molecular Biology (Clifton, N.J.), 1008, 357–388. http://doi.org/10.1007/978-1-62703-398-5_13
Coradin, M., Karch, K. R., & Garcia, B. A. (2017). Monitoring proteolytic processing events by quantitative mass spectrometry. Expert Review of Proteomics, 14(5), 409–418. http://doi.org/10.1080/14789450.2017.1316977
De Jersey, J. (1970). Specificity of papain. Biochemistry, 9(8), 1761–1767. http://doi.org/10.1021/bi00810a015
De Jesus, N. H. (2007). Epidemics to eradication: the modern history of poliomyelitis. Virology Journal, 4, 70. http://doi.org/10.1186/1743-422X-4-70
Demon, D., Van Damme, P., Berghe, T. Vanden, Deceuninck, A., Van Durme, J., Verspurten, J., … Vandenabeele, P. (2009). Proteome-wide Substrate Analysis Indicates Substrate Exclusion as a Mechanism to Generate Caspase-7 Versus Caspase-3 Specificity. Molecular & Cellular Proteomics: MCP, 8(12), 2700–2714. http://doi.org/10.1074/mcp.M900310-MCP200
Denison, M. R. (2008).
Seeking Membranes: Positive-Strand RNA Virus Replication Complexes. PLoS Biology, 6(10), e270. http://doi.org/10.1371/journal.pbio.0060270
Di Cera, E. (2009). Serine Proteases. IUBMB Life, 61(5), 510–515. http://doi.org/10.1002/iub.186
Diamond, S. L. (2007). Methods for mapping protease specificity. Current Opinion in Chemical Biology, 11(1), 46–51. http://doi.org/10.1016/j.cbpa.2006.11.021
Dillon, M. E., Wang, G., Garrity, P. A., & Huey, R. B. (2009). Review: Thermal preference in Drosophila. Journal of Thermal Biology, 34(3), 109–119. http://doi.org/10.1016/j.jtherbio.2008.11.007
Dodson, G., & Wlodawer, A. (1998). Catalytic triads and their relatives. Trends in Biochemical Sciences, 23(9), 347–352. http://doi.org/10.1016/S0968-0004(98)01254-7
Dotzauer, A., & Kraemer, L. (2012). Innate and adaptive immune responses against picornaviruses and their counteractions: An overview. World Journal of Virology, 1(3), 91–107. http://doi.org/10.5501/wjv.v1.i3.91
Dumon-Seignovert, L., Cariot, G., & Vuillard, L. (2004). The toxicity of recombinant proteins in Escherichia coli: a comparison of overexpression in BL21(DE3), C41(DE3), and C43(DE3). Protein Expression and Purification, 37(1), 203–206. http://doi.org/10.1016/j.pep.2004.04.025
Dynan, W. S., & Yoo, S. (1998). Interaction of Ku protein and DNA-dependent protein kinase catalytic subunit with nucleic acids. Nucleic Acids Research, 26(7), 1551–1559. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/9512523
Erez, E., Fass, D., & Bibi, E. (2009). How intramembrane proteases bury hydrolytic reactions in the membrane. Nature, 459, 371.
Evnin, L. B., Vásquez, J. R., & Craik, C. S. (1990). Substrate specificity of trypsin investigated by using a genetic selection. Proceedings of the National Academy of Sciences of the United States of America, 87(17), 6659–6663. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/2204062
Ezgimen, M.
D., Mueller, N. H., Teramoto, T., & Padmanabhan, R. (2009). Effects of Detergents on the West Nile virus Protease Activity. Bioorganic & Medicinal Chemistry, 17(9), 3278. http://doi.org/10.1016/j.bmc.2009.03.050
Feng, Q., Langereis, M. A., Lork, M., Nguyen, M., Hato, S. V., Lanke, K., … van Kuppeveld, F. J. M. (2014). Enterovirus 2Apro targets MDA5 and MAVS in infected cells. Journal of Virology, 88(6), 3369–3378. http://doi.org/10.1128/JVI.02712-13
Goodfellow, I. G., Kerrigan, D., & Evans, D. J. (2003). Structure and function analysis of the poliovirus cis-acting replication element (CRE). RNA, 9(1), 124–137. http://doi.org/10.1261/rna.2950603
Graham, K. L., Gustin, K. E., Rivera, C., Kuyumcu-Martinez, N. M., Choe, S. S., Lloyd, R. E., … Utz, P. J. (2004). Proteolytic cleavage of the catalytic subunit of DNA-dependent protein kinase during poliovirus infection. Journal of Virology, 78(12), 6313–6321. http://doi.org/10.1128/JVI.78.12.6313-6321.2004
Han, M.-J., & Lee, S. Y. (2006). The Escherichia coli Proteome: Past, Present, and Future Prospects. Microbiology and Molecular Biology Reviews, 70(2), 362–439. http://doi.org/10.1128/MMBR.00036-05
Hedstrom, L. (2002). Serine Protease Mechanism and Specificity. Chemical Reviews, 102(12), 4501–4524. http://doi.org/10.1021/cr000033x
Hegde, A. N. (2010). 5.21 - Ubiquitin-Dependent Protein Degradation. In H.-W. (Ben) Liu & L. Mander (Eds.), Comprehensive Natural Products II (pp. 699–752). Oxford: Elsevier. http://doi.org/10.1016/B978-008045382-8.00697-3
Hertz, M. I., & Thompson, S. R. (2011). Mechanism of translation initiation by Dicistroviridae IGR IRESs. Virology, 411(2), 355–361. http://doi.org/10.1016/j.virol.2011.01.005
Hill, M. E., Kumar, A., Wells, J. A., Hobman, T. C., Julien, O., & Hardy, J. A. (2018). The Unique Cofactor Region of Zika Virus NS2B–NS3 Protease Facilitates Cleavage of Key Host Proteins. ACS Chemical Biology, 13(9), 2398–2405.
http://doi.org/10.1021/acschembio.8b00508
Hillman, B. I., & Cai, G. (2013). Chapter Six - The Family Narnaviridae: Simplest of RNA Viruses. In S. A. Ghabrial (Ed.), Mycoviruses. Advances in Virus Research (Vol. 86, pp. 149–176). Academic Press. http://doi.org/10.1016/B978-0-12-394315-6.00006-4
Imataka, H., & Sonenberg, N. (1997). Human eukaryotic translation initiation factor 4G (eIF4G) possesses two separate and independent binding sites for eIF4A. Molecular and Cellular Biology, 17(12), 6940–6947. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/9372926
Jabaiah, A. M., Getz, J. A., Witkowski, W. A., Hardy, J. A., & Daugherty, P. S. (2012). Identification of protease exosite-interacting peptides that enhance substrate cleavage kinetics. Biological Chemistry, 393(9), 933–941. http://doi.org/10.1515/hsz-2012-0162
Jagdeo, J. M., Dufour, A., Klein, T., Solis, N., Kleifeld, O., Kizhakkedathu, J., … Jan, E. (2018). N-Terminomics TAILS Identifies Host Cell Substrates of Poliovirus and Coxsackievirus B3 3C Proteinases That Modulate Virus Infection. Journal of Virology, 92(8), e02211-17. http://doi.org/10.1128/JVI.02211-17
Karvinen, J., Laitala, V., Mäkinen, M.-L., Mulari, O., Tamminen, J., Hermonen, J., … Hemmilä, I. (2004). Fluorescence Quenching-Based Assays for Hydrolyzing Enzymes. Application of Time-Resolved Fluorometry in Assays for Caspase, Helicase, and Phosphatase. Analytical Chemistry, 76(5), 1429–1436. http://doi.org/10.1021/ac030234b
Kerr, C. H., Wang, Q. S., Keatings, K., Khong, A., Allan, D., Yip, C. K., … Jan, E. (2015). The 5′ Untranslated Region of a Novel Infectious Molecular Clone of the Dicistrovirus Cricket Paralysis Virus Modulates Infection. Journal of Virology, 89(11), 5919–5934. http://doi.org/10.1128/JVI.00463-15
Khong, A., Bonderoff, J. M., Spriggs, R. V., Tammpere, E., Kerr, C. H., Jackson, T. J., … Jan, E. (2016). Temporal Regulation of Distinct Internal Ribosome Entry Sites of the Dicistroviridae Cricket Paralysis Virus.
Viruses, 8(1), 25. http://doi.org/10.3390/v8010025
Khong, A., Kerr, C. H., Yeung, C. H. L., Keatings, K., Nayak, A., Allan, D. W., & Jan, E. (2017). Disruption of Stress Granule Formation by the Multifunctional Cricket Paralysis Virus 1A Protein. Journal of Virology, 91(5), e01779-16. http://doi.org/10.1128/JVI.01779-16
Kim, J. L., Morgenstern, K. A., Lin, C., Fox, T., Dwyer, M. D., Landro, J. A., … Thomson, J. A. (1996). Crystal Structure of the Hepatitis C Virus NS3 Protease Domain Complexed with a Synthetic NS4A Cofactor Peptide. Cell, 87(2), 343–355. http://doi.org/10.1016/S0092-8674(00)81351-3
Kleifeld, O., Doucet, A., auf dem Keller, U., Prudova, A., Schilling, O., Kainthan, R. K., … Overall, C. M. (2010). Isotopic labeling of terminal amines in complex samples identifies protein N-termini and protease cleavage products. Nature Biotechnology, 28, 281.
Kundu, P., Raychaudhuri, S., Tsai, W., & Dasgupta, A. (2005). Shutoff of RNA Polymerase II Transcription by Poliovirus Involves 3C Protease-Mediated Cleavage of the TATA-Binding Protein at an Alternative Site: Incomplete Shutoff of Transcription Interferes with Efficient Viral Replication. Journal of Virology, 79(15), 9702–9713. http://doi.org/10.1128/JVI.79.15.9702-9713.2005
Kuyumcu-Martinez, N. M., Van Eden, M. E., Younan, P., & Lloyd, R. E. (2004). Cleavage of Poly(A)-Binding Protein by Poliovirus 3C Protease Inhibits Host Cell Translation: a Novel Mechanism for Host Translation Shutoff. Molecular and Cellular Biology, 24(4), 1779–1790. http://doi.org/10.1128/MCB.24.4.1779-1790.2004
Kwon, S.-K., Kim, S. K., Lee, D.-H., & Kim, J. F. (2015). Comparative genomics and experimental evolution of Escherichia coli BL21(DE3) strains reveal the landscape of toxicity escape from membrane protein overproduction. Scientific Reports, 5, 16076. http://doi.org/10.1038/srep16076
Lautié-Harivel, N. (1992).
Drosophila C virus cycle during the development of two Drosophila melanogaster strains (Charolles and Champetières) after larval contamination by food. Biology of the Cell, 76, 151–157. http://doi.org/10.1016/0248-4900(92)90207-H
Lévêque, N., & Semler, B. L. (2015). A 21st Century Perspective of Poliovirus Replication. PLOS Pathogens, 11(6), e1004825.
Lim, K., Ho, J. X., Keeling, K., Gilliland, G. L., Ji, X., Rüker, F., & Carter, D. C. (1994). Three-dimensional structure of Schistosoma japonicum glutathione S-transferase fused with a six-amino acid conserved neutralizing epitope of gp41 from HIV. Protein Science: A Publication of the Protein Society, 3(12), 2233–2244. http://doi.org/10.1002/pro.5560031209
Lin, Y., & Welsh, W. J. (1996). Molecular modeling of substrate-enzyme reactions for the cysteine protease papain. Journal of Molecular Graphics, 14(2), 62–72. http://doi.org/10.1016/0263-7855(96)00028-8
Liu, Y., Wang, R., Sun, B., Mi, T., Zhang, J., Mu, Y., … Chen, S. R. W. (2014). Generation and Characterization of a Mouse Model Harboring the Exon-3 Deletion in the Cardiac Ryanodine Receptor. PLoS ONE, 9(4), e95615. http://doi.org/10.1371/journal.pone.0095615
López-Otín, C., & Bond, J. S. (2008). Proteases: Multifunctional Enzymes in Life and Disease. The Journal of Biological Chemistry, 283(45), 30433–30437. http://doi.org/10.1074/jbc.R800035200
Louis, J. M., Wondrak, E. M., Kimmel, A. R., Wingfield, P. T., & Nashed, N. T. (1999). Proteolytic Processing of HIV-1 Protease Precursor, Kinetics and Mechanism. Journal of Biological Chemistry, 274(33), 23437–23442. http://doi.org/10.1074/jbc.274.33.23437
Louten, J. (2016). Chapter 14 - Poliovirus. In J. Louten (Ed.), Essential Human Virology (pp. 257–271). Boston: Academic Press. http://doi.org/10.1016/B978-0-12-800947-5.00014-4
Luecke, S., & Paludan, S. R. (2015). Chapter Two - Innate Recognition of Alphaherpesvirus DNA. In K.
Maramorosch & T. C. Mettenleiter (Eds.), Advances in Virus Research (Vol. 92, pp. 63–100). Academic Press. http://doi.org/10.1016/bs.aivir.2014.11.003
Maciejewski, S., Nguyen, J. H. C., Gómez-Herreros, F., Cortés-Ledesma, F., Caldecott, K. W., & Semler, B. L. (2016). Divergent Requirement for a DNA Repair Enzyme during Enterovirus Infections. MBio, 7(1).
Marcotte, L. L., Wass, A. B., Gohara, D. W., Pathak, H. B., Arnold, J. J., Filman, D. J., … Hogle, J. M. (2007). Crystal structure of poliovirus 3CD protein: virally encoded protease and precursor to the RNA-dependent RNA polymerase. Journal of Virology, 81(7), 3583–3596. http://doi.org/10.1128/JVI.02306-06
Massie, H. R., Williams, T. R., & Colacicco, J. R. (1981). Changes in pH with age in Drosophila and the influence of buffers on longevity. Mechanisms of Ageing and Development, 16(3), 221–231. http://doi.org/10.1016/0047-6374(81)90098-1
Miller, L. K., & Ball, L. A. (2012). The insect viruses. Springer Science & Business Media.
Mosimann, S. C., Cherney, M. M., Sia, S., Plotch, S., & James, M. N. (1997). Refined X-ray crystallographic structure of the poliovirus 3C gene product. Journal of Molecular Biology, 273(5), 1032–1047. http://doi.org/10.1006/jmbi.1997.1306
Murray, K. E., & Barton, D. J. (2003). Poliovirus CRE-Dependent VPg Uridylylation Is Required for Positive-Strand RNA Synthesis but Not for Negative-Strand RNA Synthesis. Journal of Virology, 77(8), 4739–4750. http://doi.org/10.1128/JVI.77.8.4739-4750.2003
Murray, K. E., Steil, B. P., Roberts, A. W., & Barton, D. J. (2004). Replication of Poliovirus RNA with Complete Internal Ribosome Entry Site Deletions. Journal of Virology, 78(3), 1393–1402. http://doi.org/10.1128/JVI.78.3.1393-1402.2004
Mutsvunguma, L. Z., Moetlhoa, B., Edkins, A. L., Luke, G. A., Blatch, G. L., & Knox, C. (2011).
Theiler’s murine encephalomyelitis virus infection induces a redistribution of heat shock proteins 70 and 90 in BHK-21 cells, and is inhibited by novobiocin and geldanamycin. Cell Stress & Chaperones, 16(5), 505–515. http://doi.org/10.1007/s12192-011-0262-x
Nakada-Tsukui, K., Tsuboi, K., Furukawa, A., Yamada, Y., & Nozaki, T. (2012). A novel class of cysteine protease receptors that mediate lysosomal transport. Cellular Microbiology, 14(8), 1299–1317. http://doi.org/10.1111/j.1462-5822.2012.01800.x
Nakashima, N., & Ishibashi, J. (2010). Identification of the 3C-protease-mediated 2A/2B and 2B/2C cleavage sites in the nonstructural polyprotein precursor of a dicistrovirus lacking the NPGP motif. Archives of Virology, 155(9), 1477–1482. http://doi.org/10.1007/s00705-010-0723-z
Nakashima, N., & Nakamura, Y. (2008). Cleavage sites of the “P3 region” in the nonstructural polyprotein precursor of a dicistrovirus. Archives of Virology, 153(10), 1955–1960. http://doi.org/10.1007/s00705-008-0208-5
Nakashima, N., & Shibuya, N. (2006). Multiple coding sequences for the genome-linked virus protein (VPg) in dicistroviruses. Journal of Invertebrate Pathology, 92(2), 100–104. http://doi.org/10.1016/j.jip.2006.03.003
Nayak, A., Berry, B., Tassetto, M., Kunitomi, M., Acevedo, A., Deng, C., … Andino, R. (2010). Cricket Paralysis Virus (CrPV) antagonizes Argonaute 2 to modulate antiviral defense in Drosophila. Nature Structural & Molecular Biology, 17(5), 547–554. http://doi.org/10.1038/nsmb.1810
Nayak, A., Kim, D. Y., Trnka, M. J., Kerr, C. H., Lidsky, P. V., Stanley, D. J., … Andino, R. (2018). A Viral Protein Restricts Drosophila RNAi Immunity by Regulating Argonaute Activity and Stability. Cell Host & Microbe, 24(4), 542–557.e9. http://doi.org/10.1016/j.chom.2018.09.006
Ng, C. S., Jogi, M., Yoo, J.-S., Onomoto, K., Koike, S., Iwasaki, T., … Fujita, T. (2013).
Encephalomyocarditis virus disrupts stress granules, the critical platform for triggering antiviral innate immune responses. Journal of Virology, 87(17), 9511–9522. http://doi.org/10.1128/JVI.03248-12
Nicklin, M. J., Harris, K. S., Pallai, P. V., & Wimmer, E. (1988). Poliovirus proteinase 3C: large-scale expression, purification, and specific cleavage activity on natural and synthetic substrates in vitro. Journal of Virology, 62(12), 4586–4593. Retrieved from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC254243/
Pacini, L., Vitelli, A., Filocamo, G., Bartholomew, L., Brunetti, M., Tramontano, A., … Migliaccio, G. (2000). In vivo selection of protease cleavage sites by using chimeric Sindbis virus libraries. Journal of Virology, 74(22), 10563–10570. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/11044100
Palmer, I., & Wingfield, P. T. (2004). Preparation and extraction of insoluble (inclusion-body) proteins from Escherichia coli. Current Protocols in Protein Science, Chapter 6, Unit-6.3. http://doi.org/10.1002/0471140864.ps0603s38
Patil, V. M., & Gupta, S. P. (2017). Chapter 10 - Studies on Picornaviral Proteases and Their Inhibitors. In S. P. Gupta (Ed.), Viral Proteases and Their Inhibitors (pp. 263–315). Academic Press. http://doi.org/10.1016/B978-0-12-809712-0.00010-1
Perona, J. J., & Craik, C. S. (2018). Structural basis of substrate specificity in the serine proteases. Protein Science, 4(3), 337–360. http://doi.org/10.1002/pro.5560040301
Person, M. D., Shen, J., Traner, A., Hensley, S. C., Lo, H.-H., Abbruzzese, J. L., & Li, D. (2006). Protein Fragment Domains Identified Using 2D Gel Electrophoresis/MALDI-TOF. Journal of Biomolecular Techniques: JBT, 17(2), 145–156.
Plank, T.-D. M., & Kieft, J. S. (2012). The structures of nonprotein-coding RNAs that drive internal ribosome entry site function. Wiley Interdisciplinary Reviews. RNA, 3(2), 195–212. http://doi.org/10.1002/wrna.1105
Plotch, S. J., & Palant, O. (1995).
Poliovirus protein 3AB forms a complex with and stimulates the activity of the viral RNA polymerase, 3Dpol. Journal of Virology, 69(11), 7169–7179.
Rawlings, N. D. (2013). Protease Families, Evolution and Mechanism of Action. In K. Brix & W. Stöcker (Eds.), Proteases: Structure and Function (pp. 1–36). Vienna: Springer Vienna. http://doi.org/10.1007/978-3-7091-0885-7_1
Rawlings, N. D. (2016). Peptidase specificity from the substrate cleavage collection in the MEROPS database and a tool to measure cleavage site conservation. Biochimie, 122, 5–30. http://doi.org/10.1016/j.biochi.2015.10.003
Rawlings, N. D., Barrett, A. J., & Bateman, A. (2011). Asparagine Peptide Lyases: A Seventh Catalytic Type of Proteolytic Enzymes. Journal of Biological Chemistry, 286(44), 38321–38328. http://doi.org/10.1074/jbc.M111.260026
Rawlings, N. D., Barrett, A. J., Thomas, P. D., Huang, X., Bateman, A., & Finn, R. D. (2018). The MEROPS database of proteolytic enzymes, their substrates and inhibitors in 2017 and a comparison with peptidases in the PANTHER database. Nucleic Acids Research, 46(D1), D624–D632. Retrieved from http://dx.doi.org/10.1093/nar/gkx1134
Reavy, B., & Moore, N. F. (1983). Cell-free translation of Drosophila C virus RNA: Identification of a virus protease activity involved in capsid protein synthesis and further studies on in vitro processing of Cricket paralysis virus specified proteins. Archives of Virology, 76(2), 101–115. http://doi.org/10.1007/BF01311694
Reineke, L. C., & Lloyd, R. E. (2015). The Stress Granule Protein G3BP1 Recruits Protein Kinase R To Promote Multiple Innate Immune Antiviral Responses. Journal of Virology, 89(5), 2575–2589. Retrieved from http://jvi.asm.org/content/89/5/2575.abstract
Reinganum, C., O’Loughlin, G. T., & Hogan, T. W. (1970). A nonoccluded virus of the field crickets Teleogryllus oceanicus and T. commodus (Orthoptera: Gryllidae). Journal of Invertebrate Pathology, 16(2), 214–220.
http://doi.org/10.1016/0022-2011(70)90062-5
Roos, W. H., Ivanovska, I. L., Evilevitch, A., & Wuite, G. J. L. (2007). Viral capsids: Mechanical characteristics, genome packaging and delivery mechanisms. Cellular and Molecular Life Sciences, 64(12), 1484–1497. http://doi.org/10.1007/s00018-007-6451-1
Rzychon, M., Chmiel, D., & Stec-Niemczyk, J. (2004). Modes of inhibition of cysteine proteases. Acta Biochimica Polonica, 51(4), 861–873. http://doi.org/045104861
Sajid, M., & McKerrow, J. H. (2002). Cysteine proteases of parasitic organisms. Molecular and Biochemical Parasitology, 120(1), 1–21. http://doi.org/10.1016/S0166-6851(01)00438-8
Schauperl, M., Fuchs, J. E., Waldner, B. J., Huber, R. G., Kramer, C., & Liedl, K. R. (2015). Characterizing Protease Specificity: How Many Substrates Do We Need? PLOS ONE, 10(11), e0142658.
Schechter, I., & Berger, A. (1967). On the size of the active site in proteases. I. Papain. Biochemical and Biophysical Research Communications, 27(2), 157–162. http://doi.org/10.1016/S0006-291X(67)80055-X
Schilling, O., auf dem Keller, U., & Overall, C. M. (2011). Protease Specificity Profiling by Tandem Mass Spectrometry Using Proteome-Derived Peptide Libraries. In K. Gevaert & J. Vandekerckhove (Eds.), Gel-Free Proteomics: Methods and Protocols (pp. 257–272). Totowa, NJ: Humana Press. http://doi.org/10.1007/978-1-61779-148-2_17
Schilling, O., Huesgen, P. F., Barré, O., auf dem Keller, U., & Overall, C. M. (2011). Characterization of the prime and non-prime active site specificities of proteases by proteome-derived peptide libraries and tandem mass spectrometry. Nature Protocols, 6, 111. Retrieved from http://dx.doi.org/10.1038/nprot.2010.178
Schilling, O., & Overall, C. M. (2008). Proteome-derived, database-searchable peptide libraries for identifying protease cleavage sites. Nature Biotechnology, 26, 685.
Shen, Y., Igo, M., Yalamanchili, P., Berk, A.
J., & Dasgupta, A. (1996). DNA binding domain and subunit interactions of transcription factor IIIC revealed by dissection with poliovirus 3C protease. Molecular and Cellular Biology, 16(8), 4163–4171.
Singh, A., Upadhyay, V., Upadhyay, A. K., Singh, S. M., & Panda, A. K. (2015). Protein recovery from inclusion bodies of Escherichia coli using mild solubilization process. Microbial Cell Factories, 14, 41. http://doi.org/10.1186/s12934-015-0222-8
Smith, G. C. M., & Jackson, S. P. (1999). The DNA-dependent protein kinase. Genes & Development, 13(8), 916–934. Retrieved from http://genesdev.cshlp.org/content/13/8/916.short
Stathopulos, P. B., Scholz, G. A., Hwang, Y.-M., Rumfeldt, J. A. O., Lepock, J. R., & Meiering, E. M. (2004). Sonication of proteins causes formation of aggregates that resemble amyloid. Protein Science: A Publication of the Protein Society, 13(11), 3017–3027. http://doi.org/10.1110/ps.04831804
Sun, Y., Guo, Y., & Lou, Z. (2014). Formation and working mechanism of the picornavirus VPg uridylylation complex. Current Opinion in Virology, 9, 24–30. http://doi.org/10.1016/j.coviro.2014.09.003
Swevers, L., Liu, J., & Smagghe, G. (2018). Defense Mechanisms against Viral Infection in Drosophila: RNAi and Non-RNAi. Viruses, 10(5), 230. http://doi.org/10.3390/v10050230
Swevers, L., Vanden Broeck, J., & Smagghe, G. (2013). The possible impact of persistent virus infection on the function of the RNAi machinery in insects: a hypothesis. Frontiers in Physiology, 4, 319. http://doi.org/10.3389/fphys.2013.00319
Tate, J., Liljas, L., Scotti, P., Christian, P., Lin, T., & Johnson, J. E. (1999). The crystal structure of cricket paralysis virus: the first view of a new virus family. Nat Struct Mol Biol, 6(8), 765–774.
Teterina, N. L., Gorbalenya, A. E., Egger, D., Bienz, K., & Ehrenfeld, E. (1997). Poliovirus 2C protein determinants of membrane binding and rearrangements in mammalian cells. Journal of Virology, 71(12), 8962–8972.
Thaisrivongs, S., Skulnick, H. I., Turner, S. R., Strohbach, J. W., Tommasi, R. A., Johnson, P. D., … Watenpaugh, K. D. (1996). Structure-Based Design of HIV Protease Inhibitors: Sulfonamide-Containing 5,6-Dihydro-4-hydroxy-2-pyrones as Non-Peptidic Inhibitors. Journal of Medicinal Chemistry, 39(22), 4349–4353. http://doi.org/10.1021/jm960541s
Turunen, P., Rowan, A. E., & Blank, K. (2014). Single-enzyme kinetics with fluorogenic substrates: lessons learnt and future directions. FEBS Letters, 588(19), 3553–3563. http://doi.org/10.1016/j.febslet.2014.06.021
Ullah, R., Shah, M. A., Tufail, S., Ismat, F., Imran, M., Iqbal, M., … Rhaman, M. (2016). Activity of the Human Rhinovirus 3C Protease Studied in Various Buffers, Additives and Detergents Solutions for Recombinant Protein Production. PLoS ONE, 11(4), e0153436. http://doi.org/10.1371/journal.pone.0153436
Valles, S. M., Chen, Y., Firth, A. E., Guérin, D. M. A., Hashimoto, Y., Herrero, S., … ICTV Report Consortium. (2017). ICTV Virus Taxonomy Profile: Dicistroviridae. The Journal of General Virology, 98(3), 355–356. http://doi.org/10.1099/jgv.0.000756
Van Damme, P., Van Damme, J., Demol, H., Staes, A., Vandekerckhove, J., & Gevaert, K. (2009). A review of COFRADIC techniques targeting protein N-terminal acetylation. BMC Proceedings, 3(Suppl 6), S6. http://doi.org/10.1186/1753-6561-3-S6-S6
Van Wart, H. E., & Birkedal-Hansen, H. (1990). The cysteine switch: a principle of regulation of metalloproteinase activity with potential applicability to the entire matrix metalloproteinase gene family. Proceedings of the National Academy of Sciences, 87(14), 5578–5582.
Venkataraman, S., Prasad, B. V. L. S., & Selvarajan, R. (2018). RNA Dependent RNA Polymerases: Insights from Structure, Function and Evolution. Viruses, 10(2), 76. http://doi.org/10.3390/v10020076
Verma, S., Dixit, R., & Pandey, K. C. (2016).
Cysteine Proteases: Modes of Activation and Future Prospects as Pharmacological Targets. Frontiers in Pharmacology, 7, 107. http://doi.org/10.3389/fphar.2016.00107
Vizovišek, M., Vidmar, R., Fonović, M., & Turk, B. (2016). Current trends and challenges in proteomic identification of protease substrates. Biochimie, 122, 77–87. http://doi.org/10.1016/j.biochi.2015.10.017
Vogt, D. A., & Andino, R. (2010). An RNA Element at the 5′-End of the Poliovirus Genome Functions as a General Promoter for RNA Synthesis. PLOS Pathogens, 6(6), e1000936. Retrieved from https://doi.org/10.1371/journal.ppat.1000936
Wagner, R. N., Reed, J. C., & Chanda, S. K. (2015). HIV-1 protease cleaves the serine-threonine kinases RIPK1 and RIPK2. Retrovirology, 12(1), 74. http://doi.org/10.1186/s12977-015-0200-6
Wagner, S., Klepsch, M. M., Schlegel, S., Appel, A., Draheim, R., Tarry, M., … de Gier, J.-W. (2008). Tuning Escherichia coli for membrane protein overexpression. Proceedings of the National Academy of Sciences of the United States of America, 105(38), 14371–14376. http://doi.org/10.1073/pnas.0804090105
Wang, Q. S., & Jan, E. (2014). Switch from Cap- to Factorless IRES-Dependent 0 and +1 Frame Translation during Cellular Stress and Dicistrovirus Infection. PLoS ONE, 9(8), e103601. http://doi.org/10.1371/journal.pone.0103601
Wei, C., Meller, J., & Jiang, X. (2013). Substrate specificity of Tulane virus protease. Virology, 436(1), 24–32. http://doi.org/10.1016/j.virol.2012.10.010
White, J. P., Cardenas, A. M., Marissen, W. E., & Lloyd, R. E. (2007). Inhibition of Cytoplasmic mRNA Stress Granule Formation by a Viral Proteinase. Cell Host & Microbe, 2(5), 295–305. http://doi.org/10.1016/j.chom.2007.08.006
Wilkesman, J. (2017). Cysteine Protease Zymography: Brief Review. In J. Wilkesman & L. Kurz (Eds.), Zymography: Methods and Protocols (pp. 25–31). New York, NY: Springer New York. http://doi.org/10.1007/978-1-4939-7111-4_3
Wilson, J.
E., Pestova, T. V., Hellen, C. U. T., & Sarnow, P. (2000). Initiation of Protein Synthesis from the A Site of the Ribosome. Cell, 102(4), 511–520. http://doi.org/10.1016/S0092-8674(00)00055-6
Wilson, J. E., Powell, M. J., Hoover, S. E., & Sarnow, P. (2000). Naturally Occurring Dicistronic Cricket Paralysis Virus RNA Is Regulated by Two Internal Ribosome Entry Sites. Molecular and Cellular Biology, 20(14), 4990–4999. Retrieved from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC85949/
Wright, H. T. (2018). Secondary and Conformational Specificities of Trypsin and Chymotrypsin. European Journal of Biochemistry, 73(2), 567–578. http://doi.org/10.1111/j.1432-1033.1977.tb11352.x
Yalamanchili, P., Datta, U., & Dasgupta, A. (1997). Inhibition of host cell transcription by poliovirus: cleavage of transcription factor CREB by poliovirus-encoded protease 3Cpro. Journal of Virology, 71(2), 1220–1226. Retrieved from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC191176/
Young, C. L., Britton, Z. T., & Robinson, A. S. (2012). Recombinant protein expression and purification: A comprehensive review of affinity tags and microbial applications. Biotechnology Journal, 7(5), 620–634. http://doi.org/10.1002/biot.201100155
Yuan, Y., Barrett, D., Zhang, Y., Kahne, D., Sliz, P., & Walker, S. (2007). Crystal structure of a peptidoglycan glycosyltransferase suggests a model for processive glycan chain synthesis. Proceedings of the National Academy of Sciences, 104(13), 5348–5353. Retrieved from http://www.pnas.org/content/104/13/5348.abstract

Appendices

Figure A.1 Purification of preparation 2. (A) Wild-type (WT) GST CrPV 3C and (B) GST CrPV 3C (Cys211Ala) were purified by glutathione affinity chromatography from C41 E. coli after induction with 1 mM IPTG at 25 °C for 4 hours. Supernatant (S), pellet (P), flowthrough (FT), and elution fractions from GST-tag purification were analyzed by SDS-PAGE and visualized by Coomassie blue staining.
GST CrPV 3C was eluted with 50 mM Tris pH 7.5, 10 mM reduced glutathione. Elution fractions 1-6 were pooled and dialyzed against 100 mM NaCl, 20 mM HEPES pH 7.5, and 20% glycerol. (C) Purified protein was analyzed for purity and concentration on a 12% SDS-PAGE gel against varying concentrations of BSA and visualized by Coomassie blue staining.

Figure A.2 Purification of preparation 3. (A) Wild-type (WT) GST CrPV 3C and (B) GST CrPV 3C (Cys211Ala) were purified by glutathione affinity chromatography from C41 E. coli after induction with 1 mM IPTG at 25 °C for 4 hours. Supernatant (S), pellet (P), flowthrough (FT), and elution fractions from GST-tag purification were analyzed by SDS-PAGE and visualized by Coomassie blue staining. GST CrPV 3C was eluted with 50 mM Tris pH 7.5, 10 mM reduced glutathione. Elution fractions 1-7 were pooled and dialyzed against 100 mM NaCl, 20 mM HEPES pH 7.5, and 20% glycerol. (C) Purified protein was analyzed for purity and concentration on a 12% SDS-PAGE gel against varying concentrations of BSA and visualized by Coomassie blue staining.

Figure A.3 GST CrPV 3C cleavage site specificity using PICS in a GluC-digested E. coli library. (A) Normalized heat map of the CrPV 3C cleavage site specificity from P6-P6’ and (B) the corresponding iceLogo from P3-P3’. Cleavage site specificity was determined using PICS with an E. coli peptide library generated by digestion with GluC, which cleaves at D or E at the P1 position. The specificity of CrPV 3C is indicated, with G or R as the preferred amino acids at P1 and A, G, K or H at P1’.

Table A.1 Raw data used to make alignment and unrooted tree. Sequences used to generate the unrooted tree, listed with family, name, abbreviation, accession, and source.
Family | Name | Abbreviation | Accession | Source
Dicistroviridae | Cricket paralysis virus | CrPV | Q9IJX4 | Uniprot
Dicistroviridae | Drosophila C virus | DCV | NP_044945 | NCBI
Dicistroviridae | Plautia stali intestine virus | PSIV | NP_620555 | NCBI
Dicistroviridae | Himetobi P virus | HiPV | AGW80519 | NCBI
Dicistroviridae | Israeli acute paralysis virus | IAPV | YP_001040002 | NCBI
Dicistroviridae | Kashmir bee virus | KBV | NP_851403 | NCBI
Dicistroviridae | Acute bee paralysis virus | ABPV | NP_066241 | NCBI
Dicistroviridae | Anopheles C virus | AnCV | YP_009252204 | NCBI
Dicistroviridae | Empeyrat virus | EmRV | AMO03208 | NCBI
Dicistroviridae | Homalodisca coagulata virus 1 | HoCV1 | ANS71495 | NCBI
Dicistroviridae | Aphid lethal paralysis virus | ALPV | APG77968 | NCBI
Environmental samples | Marine RNA virus JP-A | JP-A | YP_001429581 | NCBI
Environmental samples | Marine RNA virus JP-B | JP-B | YP_001429583 | NCBI
Environmental samples | Marine RNA virus SF-1 | SF-1 | AFM44930 | NCBI
Environmental samples | Marine RNA virus SF-2 | SF-2 | AGZ83339 | NCBI
Environmental samples | Marine RNA virus SF-3 | SF-3 | AHA44480 | NCBI
Marnaviridae | Marine RNA virus BC-1 | BC-1 | AYD68773 | NCBI
Marnaviridae | Marine RNA virus BC-2 | BC-2 | AYD68775 | NCBI
Marnaviridae | Marine RNA virus BC-3 | BC-3 | AYD68777 | NCBI
Picornaviridae | Avisivirus A | TuASV | M4PJD6 | Uniprot
Picornaviridae | Tremovirus A (also named Avian encephalomyelitis virus) | AE | NP_705604 | NCBI
Picornaviridae | Senecavirus A | SVA | YP_002268402 | NCBI
Picornaviridae | Hunnivirus A | HuV-A2 | F4YYF3 | Uniprot
Picornaviridae | Teschovirus A | TV-A | NP_740358 | NCBI
Picornaviridae | Foot-and-mouth disease virus | FMDV-O | NP_740466.1 | NCBI
Picornaviridae | Oscivirus A1 | OsV-A1 | YP_003853308 | NCBI
Picornaviridae | Hepatovirus A (also named hepatitis A virus) | HAV | NP_740558 | NCBI
Picornaviridae | Pasivirus A1 | PaV-A1 | YP_006546268 | NCBI
Picornaviridae | Mosavirus A2 | MoV-A2 | YP_009026384 | NCBI
Picornaviridae | Cosavirus A | HCoSV | YP_002956106 | NCBI
Picornaviridae | Equine rhinitis B virus 1 | ERBV-1 | NP_740371 | NCBI
Picornaviridae | Seal picornavirus type 1 | SePV1 | YP_001497183 | NCBI
Picornaviridae | Kunsagivirus A | KuV-A | S4VD62 | Uniprot
Picornaviridae | Parechovirus A | HPeV | NP_740736 | NCBI
Picornaviridae | Canine picodicistrovirus | CaPd | YP_007947667 | NCBI
Picornaviridae | Duck hepatitis A virus 1 | DHAV-1 | YP_007969882 | NCBI
Picornaviridae | Porcine sapelovirus 1 | PSV-1 | NP_740488 | NCBI
Picornaviridae | Rosavirus A2 | RoV-A2 | A0A023T7J3 | Uniprot
Picornaviridae | Miniopterus schreibersii picornavirus 1 | MsPV-1 | YP_009361827 | NCBI
Picornaviridae | Passerivirus A1 | PasV-A1 | YP_003853297 | NCBI
Picornaviridae | Encephalomyocarditis virus | EMCV-1 | NP_740410 | NCBI
Picornaviridae | Coxsackievirus B3 | CVB3 | 2ZTX_A | NCBI
Picornaviridae | Enterovirus A71 | EV-A71 | ABG78190 | NCBI
Picornaviridae | Enterovirus C | EV-C | YP_007353734 | NCBI
Picornaviridae | Human rhinovirus B92 | HRV-B92 | ACU27233 | NCBI
Picornaviridae | Rhinovirus A | RV-A | NP_740400 | NCBI
Picornaviridae | Bat picornavirus | BPV | AIF74258 | NCBI
Picornaviridae | Rhinovirus C | HRV-C | YP_001552441 | NCBI
Picornaviridae | Enterovirus A71 | (EV)-A71 | ACL97382 | NCBI
Picornaviridae | Coxsackievirus A24 | CV-A24 | AGG78621 | NCBI
Picornaviridae | Rhinovirus B14 | RV-B14 | NP_740524 | NCBI
Picornaviridae | Enterovirus B | EV-B | NP_740546 | NCBI
Picornaviridae | Aichivirus B | BKV | NP_859027 | NCBI
Picornaviridae | African bat icavirus PREDICT-06105 | IcaV | YP_009121764 | NCBI
Picornaviridae | Tortoise rafivirus A | RafV-A | YP_009241362 | NCBI
Picornaviridae | Eel picornavirus 1 | EPV-1 | YP_008549609 | NCBI
Picornaviridae | Kobuvirus cattle/Kagoshima-1-22-KoV/2014/JPN | KCaKV-1-22 | YP_009167367 | NCBI
Picornaviridae | Tupaia
hepatovirus A TuHV-A YP_009220469 NCBI Picornaviridae Rabovirus A RaBoV YP_009118289 NCBI Picornaviridae Salivirus NG-J1 SaNGJV-1 YP_003038643 NCBI Picornaviridae Cosavirus D CoSV-D1 YP_002956128 NCBI Picornaviridae Human cosavirus B HCoSV-B YP_002956117 NCBI Picornaviridae Cosavirus E CoSV-E YP_002956086 NCBI Picornaviridae Tortoise picornavirus ToPV YP_009111405 NCBI Picornaviridae Caprine kobuvirus CapKV YP_009001379 NCBI Picornaviridae Chicken picornavirus 5 ChiPV-5 YP_009055045 NCBI Picornaviridae Chicken picornavirus 4 ChiPV-4 YP_009055034 NCBI Picornaviridae Chicken picornavirus 3 ChiPV-3 YP_009055023 NCBI Picornaviridae Chicken picornavirus 2 ChiPV-2 YP_009055012 NCBI 112 Family Name Abbreviation Accession Source Picornaviridae Oscivirus A2 OsV-A2 YP_003853319 NCBI Picornaviridae Enterovirus J EV-J YP_003359175 NCBI Picornaviridae Enterovirus A CV-A2 NP_740535 NCBI Picornaviridae Bovine hungarovirus 1 BHuV-1 YP_006846326 NCBI Picornaviridae Bat picornavirus 2 BPV-2 YP_004782568 NCBI Picornaviridae Bat picornavirus 1 BPV-1 YP_004782554 NCBI Picornaviridae Bat picornavirus 3 BPV-3 YP_004782540 NCBI Picornaviridae Pigeon picornavirus B PPV-B YP_004564618 NCBI Picornaviridae Bovine rhinitis B virus BRBV YP_001686947 NCBI Picornaviridae Avian sapelovirus ASV YP_164830 NCBI Picornaviridae Simian sapelovirus 1 SV2 NP_937978 NCBI Picornaviridae Sicinivirus A SiV-A YP_009021776 NCBI Picornaviridae Foot-and-mouth disease virus FMDV-C ABD67461 NCBI Picornaviridae Enterovirus G EV-G ARC95293 NCBI Picornaviridae Echovirus E30 E30 CAJ86643 NCBI Picornaviridae Enterovirus B77 EV-B77 CAD38168 NCBI Picornaviridae Echovirus E6 EE-6 CBL42978 NCBI Picornaviridae Echovirus E3 EE-3 CAH61520 NCBI Picornaviridae Porcine enterovirus 9 PEV-9 CAA74807 NCBI Picornaviridae Human poliovirus 3 PV-3 ALI31820 NCBI Picornaviridae Human poliovirus 2 PV-2 ALI31819 NCBI Picornaviridae Human poliovirus 1 PV-1 ALI31817 NCBI Picornaviridae Feline picornavirus FePV YP_004934029 NCBI Picornaviridae 
Coxsackievirus B4 CV-B4 AFR79234 NCBI Picornaviridae Porcine kobuvirus swine/S-1-HUN/2007/Hungary PKsHV YP_002456506 NCBI Secoviridae Cherry rasp leaf virus CRLV YP_081453 NCBI Secoviridae Broad bean wilt virus 1 BBWVI NP_951029 NCBI Secoviridae Cowpea mosaic virus CMV NP_734056 NCBI Secoviridae Red clover mottle virus RCMV NP_734029 NCBI Secoviridae Cowpea severe mosaic virus CPSMV NP_734061 NCBI Secoviridae Squash mosaic virus SqMV NP_734011 NCBI Secoviridae Rice tungro spherical virus RTSV NP_734462 NCBI Secoviridae Tomato torrado virus ToTV APP18148 NCBI 113 Family Name Abbreviation Accession Source Secoviridae Parsnip yellow fleck virus PYFV NP_734449 NCBI Secoviridae Satsuma dwarf virus SDV NP_734024 NCBI Secoviridae Tobacco ringspot virus TRSV AIA10370 NCBI Secoviridae Broad bean wilt virus 1 BBWV1 NP_951029 NCBI Unassigned Picornavirales Rhizosolenia setigera RNA virus 01 RsetRNAV01 YP_006732323 NCBI Unassigned Picornavirales Aurantiochytrium single-stranded RNA virus 01 AuRNAV Q33DY4 Uniprot Unassigned Picornavirales Darwin bee virus 1 DBV-1 AWK77841 NCBI Unassigned Picornavirales Beihai wrasse picornavirus BeWPV AVM87595 NCBI Unassigned Picornavirales Chaetoceros tenuissimus RNA virus 01 CtenRNAV01 YP_009505620 NCBI Unassigned Picornavirales Asterionellopsis glacialis RNA virus AglaRNAV YP_009047193 NCBI Unassigned Picornavirales Aurantiochytrium single-stranded RNA virus 01 AuRNAV01 YP_392465 NCBI Unclassified Dicistroviridae Caledonia beadlet anemone dicistro-like virus 2 CBADlV-2 ASM93984 NCBI Unclassified Dicistroviridae Millport beadlet anemone dicistro-like virus 1 MBADlV-1 ASM93982 NCBI Unclassified Dicistroviridae Apis dicistrovirus ADlV YP_009388499 NCBI Unclassified Dicistroviridae Goose dicistrovirus GDV YP_009221981 NCBI Unclassified Dicistroviridae Macrobrachium rosenbergii dicistrovirus 2 MrDV-2 AVP71827 NCBI Unclassified Dicistroviridae Barns Ness breadcrumb sponge dicistro-like virus 1 BbsDlV-1 ASM94061 NCBI Unclassified Dicistroviridae 
Big Sioux River virus BSRV YP_009389287 NCBI 114 Family Name Abbreviation Accession Source Unclassified Dicistroviridae Midge dicistrovirus MidDicV AOX47515 NCBI Unclassified Dicistroviridae Centovirus AC CtVAC YP_009315868 NCBI Unclassified Picornavirales Bundaberg bee virus 3 BBV-3 AWK77856 NCBI Unclassified Picornavirales Biomphalaria virus 2 BV2 YP_009342320 NCBI Unclassified Picornavirales Pittsburgh sewage-associated virus 1 PSAV1 AVA16916 NCBI Unclassified Picornavirales Beihai pentapodus picornavirus BePPV AVM87593 NCBI Unclassified RNA virus Beihai picorna-like virus 76 BePlV-76 APG77930 NCBI Unclassified RNA virus Hubei diptera virus 1 HbDV-1 YP_009336571 NCBI Unclassified RNA virus Hubei picorna-like virus 16 HplV-16 YP_009336583 NCBI Unclassified RNA virus Beihai picorna-like virus 75 BePlV-75 YP_009333386 NCBI Unclassified RNA virus Hubei picorna-like virus 14 HplV-14 YP_009337313 NCBI Unclassified RNA virus Hubei picorna-like virus 23 HplV-23 APG77443 NCBI Unclassified RNA virus Wuhan insect virus 33 WInV-33 YP_009345032 NCBI Unclassified RNA virus Hubei picorna-like virus 15 HplV-15 YP_009336540 NCBI Unclassified RNA virus Wuhan insect virus 11 WInV-11 APG76667 NCBI Unclassified RNA virus Beihai picorna-like virus 106 BePlV-106 YP_009333579 NCBI Unclassified RNA virus Beihai picorna-like virus 111 BePlV-111 YP_009333474 NCBI Unclassified RNA virus Beihai picorna-like virus 59 BePlV-59 YP_009345905.1 NCBI Unclassified RNA virus Beihai sphaeromadae virus 1 BeiSV-1 YP_009336998 NCBI Unclassified RNA virus Beihai picorna-like virus 99 BePlV-99 YP_009333580 NCBI Unclassified RNA virus Beihai picorna-like virus 107 BePlV-107 YP_009333563 NCBI 115 Family Name Abbreviation Accession Source Unclassified RNA virus Beihai picorna-like virus 125 BePlV-125 YP_009333553 NCBI Unclassified RNA virus Hubei earwig virus 3 HuEaV-3 APG76657 NCBI Unclassified RNA virus Hubei picorna-like virus 15 HplV-15 APG76662 NCBI Unclassified RNA virus Beihai picorna-like virus 80 
BePlV-80 APG76683 NCBI Unclassified RNA virus Beihai picorna-like virus 70 BePlV-70 APG76699 NCBI Unclassified RNA virus Wenzhou channeled applesnail virus 3 WcASV-3 APG76701 NCBI Unclassified RNA virus Beihai picorna-like virus 103 BePlV-103 APG76703 NCBI Unclassified RNA virus Wenzhou picorna-like virus 29 WPlV- 29 APG76704 NCBI Unclassified RNA virus Beihai picorna-like virus 88 BePlV-88 APG76709 NCBI Unclassified RNA virus Beihai shrimp virus 1 BeShV-1 APG76712 NCBI Unclassified RNA virus Beihai picorna-like virus 93 BePlV-93 APG76720 NCBI Unclassified RNA virus Beihai picorna-like virus 105 BePlV-105 APG76745 NCBI Unclassified RNA virus Beihai picorna-like virus 72 BePlV-72 APG76750 NCBI Unclassified RNA virus Beihai picorna-like virus 85 BePlV-85 APG76754 NCBI Unclassified RNA virus Beihai picorna-like virus 115 BePlV-115 APG76767 NCBI Unclassified RNA virus Beihai picorna-like virus 87 BePlV-87 APG76810 NCBI Unclassified RNA virus Beihai echinoderm virus 1 BeEV-1 APG76811 NCBI Unclassified RNA virus Beihai picorna-like virus 100 BePlV-100 APG76829 NCBI Unclassified RNA virus Beihai picorna-like virus 84 BePlV-84 APG76879 NCBI 116 Family Name Abbreviation Accession Source Unclassified RNA virus Beihai picorna-like virus 90 BePlV-90 APG76897 NCBI Unclassified RNA virus Beihai picorna-like virus 83 BePlV-83 APG76903 NCBI Unclassified RNA virus Beihai mantis shrimp virus 4 BeMSV-4 APG76917 NCBI Unclassified RNA virus Beihai picorna-like virus 71 BePlV-71 APG78872 NCBI Unclassified RNA virus Beihai mantis shrimp virus 3 BeMSV-3 APG78875 NCBI Unclassified RNA virus Beihai picorna-like virus 104 BePlV-104 APG78913 NCBI Unclassified RNA virus Beihai picorna-like virus 114 BePlV-114 APG78958 NCBI Unclassified RNA virus Changjiang picorna-like virus 12 CPicLV-12 APG78998 NCBI Unclassified RNA virus Wuhan insect virus 11 WInV-11 APG79008 NCBI Unclassified RNA virus Changjiang picorna-like virus 13 CPicLV-13 APG79029 NCBI Unclassified RNA virus Hubei picorna-like virus 
15 HplV-15 APG77916 NCBI Unclassified RNA virus Hubei picorna-like virus 17 HplV-17 APG77921 NCBI Unclassified RNA virus Beihai picorna-like virus 81 BePlV-81 APG77931 NCBI Unclassified RNA virus Hubei picorna-like virus 18 HplV-18 APG77976 NCBI Unclassified RNA virus Hubei picorna-like virus 25 HplV-25 APG77992 NCBI Unclassified RNA virus Hubei diptera virus 1 HbDV-1 APG78034 NCBI Unclassified RNA virus Wenzhou shrimp virus 5 WeSV-5 APG78050 NCBI Unclassified RNA virus Wenzhou shrimp virus 4 WeSV-4 APG78059 NCBI Unclassified RNA virus Shahe picorna-like virus 8 ShPV-8 APG77379 NCBI Unclassified RNA virus Shahe picorna-like virus 10 ShPV-10 APG77386 NCBI 117 Family Name Abbreviation Accession Source Unclassified RNA virus Shahe picorna-like virus 11 ShPV-11 APG77393 NCBI Unclassified RNA virus Shahe heteroptera virus 4 ShHV-4 APG77358 NCBI Unclassified RNA virus Shahe picorna-like virus 12 ShPV-12 APG77372 NCBI Unclassified RNA virus Hubei picorna-like virus 48 HplV-48 APG77411 NCBI Unclassified RNA virus Sanxia water strider virus 9 SxWSV-9 APG77459 NCBI Unclassified RNA virus Hubei picorna-like virus 49 HplV-49 APG77504 NCBI Unclassified RNA virus Hubei picorna-like virus 24 HplV-24 APG77516 NCBI Unclassified RNA virus Hubei myriapoda virus 4 HbMV-4 APG78388 NCBI Unclassified RNA virus Hubei earwig virus 3 HuEV-3 APG78389 NCBI Unclassified RNA virus Wuhan arthropod virus 2 WuAV-2 APG78413 NCBI Unclassified RNA virus Wuhan insect virus 33 WhIV-4 APG78417 NCBI Unclassified RNA virus Hubei picorna-like virus 54 HplV-54 APG78441 NCBI Unclassified RNA virus Wenling picorna-like virus 3 WenpV-3 APG78472 NCBI Unclassified RNA virus Wenling picorna-like virus 4 WenpV-4 APG78474 NCBI Unclassified RNA virus Wenling crustacean virus WlCV APG78483 NCBI Unclassified RNA virus Wenling picorna-like virus 5 WenpV-5 APG78485 NCBI Unclassified RNA virus Wenling crustacean virus 5 WlCV-5 APG78487 NCBI Unclassified RNA virus Wenling crustacean virus 4 WlCV-4 APG78495 NCBI 118 Family 
Name Abbreviation Accession Source Unclassified RNA virus Wenzhou channeled applesnail virus 2 WcASV-2 APG78498 NCBI Unclassified RNA virus Wenzhou picorna-like virus 39 WPlV- 39 APG78508 NCBI Unclassified RNA virus Wenzhou picorna-like virus 34 WPlV- 34 APG78512 NCBI Unclassified RNA virus Wenzhou picorna-like virus 36 WPlV- 36 APG78526 NCBI Unclassified RNA virus Wenzhou picorna-like virus 35 WPlV- 35 APG78528 NCBI Unclassified RNA virus Wenzhou shrimp virus 4 WenzSV-4 APG78533 NCBI Unclassified RNA virus Wenzhou picorna-like virus 29 WPlV- 29 APG78539 NCBI Unclassified RNA virus Wenzhou picorna-like virus 45 WPlV- 45 APG78552 NCBI Unclassified RNA virus Wenzhou picorna-like virus 26 WPlV- 26 APG78588 NCBI Unclassified RNA virus Beihai picorna-like virus 116 BePlV-116 APG78598 NCBI Unclassified RNA virus Beihai picorna-like virus 82 BePlV-82 APG78608 NCBI Unclassified RNA virus Hubei orthoptera virus 1 HuOV-1 APG78626 NCBI Unclassified ssRNA virus Chaetoceros tenuissimus RNA virus type-II SS10-16V YP_009111336 NCBI Unclassified ssRNA virus Chaetoceros sp. RNA virus 2 Csp02RNAV01 BAK40203 NCBI