UBC Theses and Dissertations

Structural insights into inhibitory mechanisms of cathepsin K Law, Simon Sau Yin 2018

STRUCTURAL INSIGHTS INTO INHIBITORY MECHANISMS OF CATHEPSIN K

by

Simon Sau Yin Law

B.Sc., The University of British Columbia, 2013

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (BIOCHEMISTRY AND MOLECULAR BIOLOGY)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

December 2018

© Simon Sau Yin Law, 2018

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled: Structural Insights into Inhibitory Mechanisms of Cathepsin K, submitted by Simon Sau Yin Law in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Biochemistry and Molecular Biology.

Examining Committee:
Dieter Brömme, Oral Biological and Medical Sciences, Supervisor
Chris Overall, Oral Biological and Medical Sciences, Supervisory Committee Member
Christian Kastrup, Biochemistry and Molecular Biology, Supervisory Committee Member
Katherine Ryan, Chemistry, University Examiner
Filip Van Petegem, Biochemistry and Molecular Biology, University Examiner

Abstract

Cathepsin K (CatK) is a lysosomal cysteine protease that is highly expressed in osteoclasts and is responsible for the degradation of bone. The implication of CatK in various musculoskeletal disorders, including osteoporosis, highlights it as an important target for drug development. This thesis aims to identify and characterize different classes of CatK inhibitors that target the active site and collagenolytically relevant ectosteric sites, using X-ray crystallography combined with mutational and kinetic studies. In chapter 2, odanacatib, a specific CatK inhibitor recently abandoned in clinical trials due to adverse side effects, was investigated for its selectivity for mouse CatK over the human counterpart.
Structural, mutagenic, and kinetic studies identified two structural features in the mouse enzyme that determine odanacatib’s selectivity. Replacing these features with those of the human counterpart restored inhibitory activity in the mouse enzyme.

In chapters 3 and 4, composite docking and high-throughput fluorescence polarization (FP) assay methods were developed to identify novel ectosteric inhibitors of CatK by targeting the protein oligomerization site required for collagen degradation. Screening of the NCI Database (280,000 compounds) using three distinct molecular modeling methods identified nine active compounds. The best compound had an IC50 value around 300 nM in cell-based resorption assays. Over 5,000 compounds were also screened using the newly developed FP assay, and nine collagenase inhibitors were identified. Three of these compounds were active in subsequent cell-based assays.

In chapter 5, the structure of NSC-13345, a putative allosteric inhibitor of CatK described in the literature, was determined in complex with the enzyme using X-ray crystallography. Previous characterization was performed using an inactive variant, and the allosteric site was affected by the presence of an extra loop. Structural determination with the fully processed enzyme identified three binding sites. One site, located above the active site, may explain its substrate-selective inhibition. In addition, the crystal structure of T-06, a potent collagenase inhibitor of CatK, in complex with CatK was determined and provided insight into its ectosteric inhibition. These findings suggest that NSC-13345 and T-06 function as substrate-selective ectosteric inhibitors of CatK.

Lay Summary

Osteoporosis affects 50% of women aged 50 years or older, and medical costs are estimated to be 17-20 billion dollars in the US and over two billion dollars in Canada annually. Cathepsin K (CatK) is an enzyme highly expressed in the cells responsible for bone degradation.
Inhibiting this enzyme has been shown to be effective at alleviating symptoms associated with osteoporosis. Various CatK inhibitors have been tested in clinical trials, but they have all failed due to side effects with unknown causes. Therefore, it is important to study the origins of these side effects and to investigate other potential therapeutics. This thesis covers the investigation of the potency of odanacatib, a highly promising drug for treating osteoporosis that was abandoned after Phase III clinical trials due to complicating side effects. It also covers the development of novel methods to identify compounds specifically targeting the bone remodeling activities of the enzyme.

Preface

Dr. Dieter Brömme was the principal investigator of the research projects. I was responsible for the design, performance and analysis of the research, as well as all the manuscript preparation for the work described in this thesis, with details and exceptions outlined below.

Chapter 2
A version of chapter 2 has been published in the Biochemical Journal. (Law, S., Andrault, P., Aguda, A., Nguyen, N., Kruglyak, N., Brayer, G., Brömme, D., Identification of mouse cathepsin K structural elements that regulate the potency of odanacatib. Biochemical Journal, Vol. 474, No. 5, pp. 851-864, Feb., 2017.) I was responsible for the research design, the performance of the experiments, and the manuscript preparation. Dr. Pierre-Marie Andrault assisted with the characterization of the kinetic properties of the enzyme as well as general revision of the manuscript prior to submission and publication. Dr. Adeleke Aguda aided with the determination of the crystal structure of the inhibitor-free human wild-type CatK, and Dr. Gary Brayer provided insight for the other CatK structures.

Chapter 3
A version of chapter 3 has been published in PLOS One.
(Law, S., Panwar, P., Li, J., Aguda, A., Jamroz, A., Guido, R., Brömme, D., A composite docking approach for the identification and characterization of ectosteric inhibitors of cathepsin K, PLoS One, Vol. 12, No. 10, pp. e0186869, Oct., 2017.) I was responsible for the research design, the performance of the experiments, and the manuscript preparation. Co-op student Jody Li assisted under my supervision in the evaluation of the compounds identified through composite docking for their collagenase properties. Dr. Preety Panwar provided her expertise and assisted in the cell-based osteoclast resorption assays. Dr. Rafael Guido provided guidance and resources for the development of the composite docking method.

Chapter 4
Former Ph.D. graduate Xin Du generated the screening data for the compounds from the KD2 library in collaboration with the Centre for Drug Research and Development (CDRD). Dr. Preety Panwar assisted with the cell-based bone resorption assays and subsequent analysis. A manuscript containing a modified version of the chapter is in preparation.

Chapter 5
Udit Dalwadi, under my supervision, generated and characterized the allosteric binding site mutant CatK as part of his undergraduate thesis. Dr. Adeleke Aguda provided his guidance and expertise in the structural determination of the CatK-NSC-13345 and CatK-T-06 complexes. A manuscript containing the data presented is in preparation.

In addition to the projects outlined above, I also conducted additional projects and experiments which have resulted in contributions to four other published peer-reviewed research articles. A complete list of other co-authored publications can be found in the Appendices, with a short description of my relative contributions to each manuscript. Manuscripts have been reproduced with permission from the respective copyright holders.
Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
Abbreviations
Acknowledgements
1. Introduction
  1.1. Structure and Function of Proteases
  1.2. Proteases in the Extracellular Matrix
  1.3. Cysteine Proteases
    1.3.1. Cathepsin K
    1.3.2. Cathepsin K and Bone Remodeling
  1.4. Protease Inhibition
    1.4.1. Active Site-Directed Inhibition in Proteases
    1.4.2. Side Effect Issues of Active Site-Directed Inhibitors of CatK
    1.4.3. Exosite Directed Inhibition in Proteases
    1.4.4. Exosite Directed Inhibition in Cathepsin K
  1.5. Recent Developments in Protease Inhibitor Identification and Design
  1.6. Hypotheses and Specific Aims
2. Structural Insights into Enzyme Mechanism and Inhibitory Differences Between hCatK and mCatK
  2.1. Introduction
  2.2. Materials and Methods
  2.3. Results
  2.4. Discussion
  Supplementary Information (Chapter 2)
3. A Composite Docking Approach for the Identification and Characterization of Ectosteric Collagenase Inhibitors of Cathepsin K
  3.1. Introduction
  3.2. Materials and Methods
  3.3. Results
  3.4. Discussion
  Supplementary Information (Chapter 3)
4. Identification of Ectosteric Inhibitors of CatK through High-Throughput Screening
  4.1. Introduction
  4.2. Materials and Methods
  4.3. Results
  4.4. Discussion
  Supplementary Information (Chapter 4)
5. Allostery or Ectostery? Substrate specific inhibition of cathepsin K
  5.1. Introduction
  5.2. Materials and Methods
  5.3. Results
  5.4. Discussion
  Supplementary Information (Chapter 5)
6. Conclusions and Suggestions for Future Work
  Suggestions for Future Work
References
Appendices
  List of Other Co-Authored Publications

List of Tables

Table 1-1 Major protease clans
Table 1-2 Current major therapeutics for the treatment of osteoporosis
Table 1-3 List of major protease inhibitor warheads
Table 1-4 Summary of active site-directed CatK inhibitors in clinical trials
Table 2-1 Data collection and refinement statistics of crystal structures
Table 2-2 Total surface areas and active site cleft volumes for uninhibited human and mouse CatKs, and ODN, E64 and NFT bound human CatK enzymes
Table 2-3 Determination of the kinetics constants of the wild-type mouse and human CatK, and the mCatK variants
Table 2-4 Inhibition constants (Ki) of ODN and balicatib against mCatK, hCatK, and the mCatK variants
Table 3-1 Summary of collagenase inhibitors identified through composite docking from the NCI-DTP Repository
Table 3-2 Comparison of the hit rates of each individual docking method and composite docking methods
Table 4-1 Active compounds identified from KD2 and BM Library
Table 4-2 Predicted binding affinities for the most potent collagenase and osteoclast resorption inhibitors
Table 5-1 Crystallographic Data Collection and Refinement Statistics for CatK-NSC-13345 and CatK-T-06 complexes
Table 5-2 Kinetic constants of CatK-mediated hydrolysis of Z-FR-MCA and Abz-HPGGPQ-EDDnp in the presence or absence of NSC-13345 or T-06
Table 5-3 Kinetic constants of CatK and allosteric site mutant-mediated hydrolysis of Z-FR-MCA and Abz-HPGGPQ-EDDnp in the presence or absence of NSC-13345

List of Figures

Figure 1-1 Enzymatic mechanism for hydrolysis in proteases
Figure 1-2 Schechter and Berger subsite model for proteases
Figure 1-3 Schematic diagram of the extracellular matrix
Figure 1-4 Detailed catalytic mechanism of cysteine proteases
Figure 1-5 General subsite definitions for CatK
Figure 1-6 Overall structure of CatK
Figure 1-7 Triple helical structure found in collagen
Figure 1-8 Biological process in osteoclast differentiation and bone remodeling
Figure 1-9 Scheme of covalent active site-directed inhibition
Figure 1-10 Reaction mechanisms for selected covalent protease inhibitor warheads
Figure 1-11 Exosites identified in thrombin
Figure 1-12 Scheme of allosteric activation and inhibition
Figure 1-13 Allosteric inhibition of caspases-9 and -7
Figure 1-14 Tetramer model for collagenase activity of CatK
Figure 1-15 Ectosteric sites in CatK
Figure 1-16 Tanshinone ectosteric collagenase inhibitors of CatK
Figure 1-17 Overview of computation-based drug design
Figure 2-1 Structural comparison between hCatK and mCatK
Figure 2-2 Chemical structures of CatK Inhibitors
Figure 2-3 Binding of ODN into the active site cleft of hCatK
Figure 2-4 Structural analysis of the binding of ODN on hCatK
Figure 2-5 Putative binding of ODN in the mCatK enzyme
Figure 3-1 Binding site analysis of ectosteric site 1 in CatK
Figure 3-2 Screening and evaluation workflow for the identification of ectosteric site 1 inhibitors of CatK
Figure 3-3 Chemical similarity mapping of hits identified through molecular docking
Figure 3-4 Collagenase inhibitory activity of compounds 1 and 3
Figure 3-5 Top binding poses of compounds 1 and 3 from composite docking
Figure 3-6 The effects of NSC-374902 and NSC-645836 on human osteoclasts and bone resorption activity
Figure 4-1 Experimental workflow for the identification of collagenase inhibitors of CatK using FP and active site screening assays
Figure 4-2 Distribution of potential inhibitor hits from KD2 and Biomol libraries based on their inhibitory potencies
Figure 4-3 Analysis of anti-collagenase inhibitors on the viability and activity of human osteoclasts
Figure 4-4 IC50 determination of human osteoclast bone resorption parameters for EGCG and ATC
Figure 4-5 Ectosteric sites required for CatK-mediated collagen degradation and the best binding poses predicted for the most potent anti-collagenase activity inhibitors
Figure 4-6 Enzymatic inhibition of CatK by DHT1, ATC, and EGCG on its collagenase and elastase activities
Figure 5-1 Global docking of NSC-13345 with CatK identified eight potential binding sites
Figure 5-2 Global docking of T-06 with CatK identified eight potential binding sites
Figure 5-3 Binding of NSC-13345 with wild-type CatK
Figure 5-4 Structural comparison of binding site 3
Figure 5-5 Binding of T-06 with wild-type CatK
Figure 5-6 Rmsd difference maps for NSC-13345 and T-06 bound structures compared to uninhibited structure
Figure 5-7 Comparison of the best poses of peptide substrates, HPGGPQ and Z-FR-MCA, docked into the active site
Figure 5-8 Specificity constants for the hydrolysis of Abz-KLR-XXX-EDDnp peptides by CatK in the presence of NSC-13345 and molecular docking of the substrates in the absence and presence of the inhibitor
Figure 5-9 Inhibition of cleavage of selected substrates by NSC-13345
Figure 5-10 Inhibition of the degradation of macromolecular substrates, azocasein and collagen and C4-S complex formation by NSC-13345
Abbreviations

AA  Abscisic Acid
ASA  Accessible Surface Area
ATC  Aurintricarboxylic Acid
CatK  Cathepsin K
CatV  Cathepsin V
C4-S  Chondroitin 4-Sulfate
CTx  C-terminal Telopeptide of type I collagen
DTP  Developmental Therapeutics Program
DTT  Dithiothreitol
E-64  L-3-carboxy-trans-2,3-epoxypropionyl-leucylamido-(4-guanidino)-butane
ECM  Extracellular Matrix
EDTA  Ethylenediaminetetraacetate
EGCG  Epigallocatechin Gallate
FP  Fluorescence Polarization
GAG  Glycosaminoglycan
HTS  High Throughput Screening
IGF  Insulin-like Growth Factor
M-CSF  Macrophage Colony-Stimulating Factor
MMPs  Matrix Metalloproteinases
NCI  National Cancer Institute
NFT  N-(2-aminoethyl)-N~2~-{(1s)-1-[4'-(aminosulfonyl)biphenyl-4-Yl]-2,2,2-trifluoroethyl}-L-leucinamide
NMR  Nuclear Magnetic Resonance
NSC-13345  2-[(2-carbamoylsulfanylacetyl)amino]benzoic acid
ODN  Odanacatib
OPG  Osteoprotegerin
OVX  Ovariectomized
PG  Proteoglycan
RANK  Receptor Activator of Nuclear Factor Kappa-B
RANKL  Receptor Activator of Nuclear Factor Kappa-B Ligand
SGC  Sanguinarine Chloride
T-06  Tanshinone II-A Sulfonate
TGF-β  Transforming Growth Factor β
TNF  Tumour Necrosis Factor

Acknowledgements

Undertaking the passage to a doctoral degree is often considered an arduous and gruelling task filled with both frustration and joy. I am happy to say that my journey has been filled with more of the latter. However, this would not have been possible without the help of many along my journey.

I would like to first express my gratitude towards my supervisor, Dr. Dieter Brömme, for his guidance and excellent mentorship. It has been a great honour to be a part of his research group. His enthusiasm, patience, inspiration and continued support were a tremendous motivation to me throughout my entire degree.

I am also grateful to my committee members, Dr. Christopher Overall and Dr. Christian Kastrup, who provided excellent support and insightful suggestions throughout my degree.
I would also like to thank Dr. Gary Brayer for his expertise and recommendations on all the crystallography work, and Dr. Rafael Guido for graciously hosting me in his research group in São Carlos.

In addition, it has been a privilege to work with an amazingly talented group of people from a variety of disciplines through my various projects. Their contributions made much of the work outlined in this thesis possible. I am indebted to the members of the Brömme Lab for their advice and assistance: Dr. Preety Panwar, Dr. Pierre-Marie Andrault, Dr. Kamini Srivastava, Dr. Li Ming Xue, Dr. Neil Mackenzie, Jody Li, Udit Dalwadi, Marcus Lee, Pouya Azizi, and Andrew Jamroz. Much of the X-ray crystallography-based work could not have been accomplished without help from the members of the Brayer Lab, including Dr. Adeleke Aguda, Dr. Nham Nguyen, and Dr. Sami Caner.

I would also like to thank the funding agencies, the Canadian Institutes of Health Research and the Natural Sciences and Engineering Research Council, as well as the Centre for Blood Research, for directly funding my research.

Last but not least, I would like to thank my family and friends for their continued support, which has made the last few years as enjoyable as they were.

1. Introduction

1.1. Structure and Function of Proteases

Proteases are enzymes with the ability to hydrolyze the amide bonds found in the peptide units of polypeptides and proteins. Proteolytic enzymes are encoded by approximately 2-4% of all genes found in organisms, and over 1,000 human proteases and their pseudogenes are annotated in MEROPS, a comprehensive protease and inhibitor database (1, 2). Proteases were initially classified into two groups: the endopeptidases, which target internal peptide bonds, and the exopeptidases, which cleave the terminal C- and N- peptide bonds (3).
With additional structural and mechanistic information, other classification schemes, such as the type of nucleophile, were introduced to group related enzymes. Proteases that use the side chain of an amino acid as the nucleophile are termed protein nucleophiles, and those that use an activated water molecule are termed water nucleophiles. Protein nucleophiles can be the hydroxyl group of a Ser or Thr residue or the thiol of a Cys, and the corresponding enzymes are named serine, threonine or cysteine proteases, respectively (4). Water nucleophiles are activated by either Asp or Glu, or by a metal ion bound by the amino acid side chains, and the corresponding enzymes are typically referred to as aspartyl, glutamyl or metalloproteases. Typically, Thr and metalloproteases have a neutral pH optimum, whereas Asp, Glu, and Cys proteases have an acidic pH optimum and Ser proteases have a neutral to basic pH optimum (1). In addition, proteases can also be classified into domain families, and most proteases carry only one domain essential for peptide bond hydrolysis. The MEROPS classification system was introduced in 1993 to group proteases into peptidase families based on sequence homology and structural similarities of the catalytic domain (5). The major peptidase families are as follows: A for Asp-, C for Cys-, G for Glu-, M for metallo-, S for Ser-, T for Thr-, and U for unknown. Families are then built around peptidases which have been characterized structurally and biochemically and are referred to as type examples (2). For example, the type example for peptidase family S1 is chymotrypsin. If multiple peptidases from different families are similar in tertiary structure, then the related families are grouped into a clan. A clan often contains many families and typically indicates the catalytic type associated with the proteases found in the clan (5). Three clans (PA, PB, and PC) contain proteases of different catalytic types. Clans are further subdivided into subclans if they have unifying characteristics.
For example, in clan PA, all the families of Ser and Cys proteases are grouped into subclans PA(S) and PA(C), respectively. Major protease clans and notable examples are listed in Table 1-1.

Table 1-1 Major protease clans

Clan  Catalytic Residue  Type Examples (Family)
AA    Asp                Pepsin (A1)
CA    Cys                Papain (C1), Calpain (C2)
CD    Cys                Clostripain (C11), Legumain (C13), Caspase (C14)
CE    Cys                Adenain (C5), Ulp1 peptidase (C48)
MA    Metallo-           Aminopeptidase (M1), Thermolysin (M4), Bacterial collagenase (M9), Matrix metallopeptidase-1 (M10)
MC    Metallo-           Carboxypeptidase (M14)
PA    Mixed              Poliovirus-type picornain (C3), Chymotrypsin A (S1)
PB    Mixed              Archaean proteasome (T1), Dipeptidase A (C69)
PC    Mixed              Gamma-glutamyl hydrolase (C26), dipeptidase E (S51)
SB    Ser                Subtilisin (S8)
SC    Ser                Carboxypeptidase Y (S10)
SF    Ser                Signal Peptidase I (S26)
SJ    Ser                Lon-A peptidase (S16)

The enzymatic mechanism of a protease depends on its catalytic residue. In serine, threonine, and cysteine proteases, catalysis involves the direct nucleophilic attack by the catalytic residue, leading to the formation of a tetrahedral acyl-enzyme complex, and is followed by the release of the carboxylate and amine products (6) (Figure 1-1A). Asp, Glu, and metalloproteases use an activated water molecule as a nucleophile for the peptide bond hydrolysis (Figure 1-1B). Hydrolysis of the peptide bond is an energetically favourable reaction but occurs extremely slowly. The transition states for proteases often involve the formation of a tetrahedral intermediate, and the oxyanion is stabilized within the oxyanion hole via hydrogen bond donors from the protease backbone (7, 8). However, the exact transition state configuration varies between different protease families.
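The family prefixes and nucleophile classes described in this section can be summarized in a small lookup. The following is a minimal sketch, assuming only the letter codes and the protein/water nucleophile grouping given in the text; the names `FAMILY_PREFIXES` and `describe_family` are hypothetical and are not part of MEROPS or of any library.

```python
# Illustrative lookup only: the names below are hypothetical helpers,
# not part of MEROPS or any published tool.

# Family prefix letters as listed in the text (A, C, G, M, S, T, U).
FAMILY_PREFIXES = {
    "A": "Asp",
    "C": "Cys",
    "G": "Glu",
    "M": "metallo",
    "S": "Ser",
    "T": "Thr",
    "U": "unknown",
}

# Catalytic types whose nucleophile is an amino acid side chain.
PROTEIN_NUCLEOPHILE_TYPES = {"Ser", "Thr", "Cys"}
# Catalytic types whose nucleophile is an activated water molecule.
WATER_NUCLEOPHILE_TYPES = {"Asp", "Glu", "metallo"}


def describe_family(family_id: str) -> tuple[str, str]:
    """Return (catalytic type, nucleophile class) for an identifier such as 'C1'."""
    prefix = family_id.strip()[:1].upper()
    if prefix not in FAMILY_PREFIXES:
        raise ValueError(f"Unrecognized family identifier: {family_id!r}")
    catalytic_type = FAMILY_PREFIXES[prefix]
    if catalytic_type in PROTEIN_NUCLEOPHILE_TYPES:
        nucleophile = "protein side chain"
    elif catalytic_type in WATER_NUCLEOPHILE_TYPES:
        nucleophile = "activated water"
    else:
        nucleophile = "unknown"
    return catalytic_type, nucleophile


if __name__ == "__main__":
    # Family identifiers taken from Table 1-1.
    for family in ("A1", "C1", "M10", "S1"):
        print(family, describe_family(family))
```

For the papain family C1, to which CatK belongs according to the clan CA entry style of Table 1-1, the lookup reports a Cys catalytic type with a protein side chain nucleophile. Clans of mixed catalytic type (PA, PB, PC) are not modelled here, since the sketch works at the family level only.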
Figure 1-1 Enzymatic mechanism for hydrolysis in proteases
Hydrolysis mechanisms for (A) 1-step acid-based catalysis found in Asp, Glu, and metalloproteases, and for (B) 2-step nucleophile-based catalysis found in Ser, Thr, and Cys proteases. In (A), the enzyme uses an acid (side chain or metal ion bound to the enzyme) to deprotonate a water molecule for nucleophilic attack on the scissile peptide bond. In (B), a residue from the enzyme performs the nucleophilic attack to form an acyl-enzyme complex before subsequent hydrolysis.

The active site of an enzyme is defined as the region where the substrate temporarily binds to the enzyme for catalysis of its reaction. It is often relatively small, most frequently between 500 and 1000 Å³ in volume (9). The active site is required for both the substrate interaction with the enzyme and the catalysis of the reaction. The substrate binding region of the enzyme mediates interactions between the substrate molecule and the enzyme, and the catalytic region of the active site often consists of three or four amino acids that directly catalyze the enzymatic reaction by chemically reacting with the substrate (10). The active site of an enzyme is shaped to accommodate its substrates and thus confers high specificity for the preferred substrate. This specificity is determined by the arrangement of amino acids in the substrate binding site and the structure of the substrate with which it is designed to interact (11). The orientation of the substrate and its proximity to the catalytic site make the substrate-binding site integral to the catalytic reaction. The initial binding of the substrate to the substrate-binding site is non-covalent, and the enzyme-substrate complex is held together by hydrogen bonding, van der Waals forces, and hydrophobic and electrostatic interactions (12). Often the charge distributions on the substrate and active site are complementary.
The substrate binding site of proteases is defined based on the region occupied by the substrate peptide in the Schechter and Berger nomenclature model (13). The amino acid residues of the substrate are designated as P1-P3 and P1’ to P3’ relative to the hydrolyzed amide bond between positions P1 and P1’ (Figure 1-2). The complementary binding sites (subsites) on the enzyme occupied by the substrate are designated as S1-S3 and S1’ to S3’. Each protease has its own preference for amino acids in its subsites for optimal cleavage. For example, trypsin has a specific preference for Lys and Arg at the P1 position of its substrates (14). Due to their role in protein processing, proteases regulate the activity of many cellular pathways and modulate countless molecular signals (15). They can influence critical processes ranging from DNA replication and transcription to inflammation and the immune response (15). Dysregulation or alterations in proteolytic systems lead to a multitude of pathological conditions including cancer, neurodegenerative disorders, and cardiovascular diseases (16). Because of these regulatory roles, proteases are a major focus of attention for the pharmaceutical industry as potential drug targets or as diagnostic biomarkers (17). Endogenous protease inhibitors are almost exclusively proteins, and most pharmaceutical protease inhibitors are designed with peptide moieties to take advantage of the affinity and specificity for protease active sites (18–20).

Figure 1-2 Schechter and Berger subsite model for proteases
S3-S3’ describe the substrate binding sites within the active site of the enzyme. P3-P3’ describe the substrate residues which bind to the corresponding S3-S3’. The scissile bond is located between the P1 and P1’ positions and is indicated by the red arrow (13).

1.2. Proteases in the Extracellular Matrix
The extracellular matrix (ECM) is a three-dimensional, non-cellular structure found in all tissues and essential to life (Figure 1-3).
Each organ contains its own unique ECM that is generated during embryonic stages. The functions of the ECM include providing physical support for tissue integrity and elasticity and controlling tissue homeostasis. The importance of the ECM is highlighted by the range of tissue defects, including embryonic lethality, caused by mutations in ECM-related genes (21). Genetic deletions of specific ECM proteins such as fibronectin and collagens are often embryonically lethal (21). In humans, the ECM contains around 300 proteins including collagen, proteoglycans, and glycoproteins (21). Two main types of ECM exist: the interstitial connective tissue matrix, which provides structural scaffolding for tissues, and the basement membrane, which is a specialized form of ECM that separates the epithelium from the stroma (22).

Figure 1-3 Schematic diagram of the extracellular matrix
The extracellular matrix consists of a network of proteoglycans and fibrous proteins that provides structure and physical support. Most fibrous proteins such as collagens are anchored to the plasma membrane via membrane proteins such as integrin and fibronectin (Figure reprinted with permission from OpenStax Biology (23)).

The ECM is made of two main classes of macromolecules: proteoglycans (PGs) and fibrous proteins. The main fibrous proteins found in the ECM include collagens, elastins, fibronectins, and laminins (24) (Figure 1-3). The proteoglycans fill the majority of the interstitial space within the tissue, similar to a hydrated gel, and have a wide variety of functions depending on the type of tissue (22). In epithelial tissues, proteoglycans associate with collagen fibers to generate an overall mechanical structure that is essential for buffering and hydration and that acts as an enzymatically accessible source for a variety of growth factors (25).
Collagens are the most abundant fibrous proteins found in the interstitial ECM and, as its main structural component, provide tensile strength and structure (26). They also regulate cell adhesion, support chemotaxis, and direct tissue development (26). Most collagens are synthesized and secreted by fibroblasts residing in the stroma. Fibroblasts are able to organize the collagen fibrils into sheets and cables and thus can change the alignment of the collagen fibers (27). Collagen is also often found associated with elastin within the ECM. Elastin fibers provide recoil to tissues that undergo repeated stretching by forming tight associations with collagen fibrils (28). The third most abundant fibrous protein is fibronectin, which is involved in the organization of the interstitial ECM and is crucial in mediating cell attachment and function (29). Fibronectin is highly flexible and functions as an extracellular mechano-regulator (29).

Cleavage of ECM components is primarily performed during remodeling and is important in regulating the composition and structure of the ECM (30). This cleavage is performed by multiple families of proteases, including metallo-, serine, and cysteine proteases. The two most specialized ECM remodelling families are the matrix metalloproteinases (MMPs) and the A Disintegrin And Metalloproteinase with Thrombospondin motifs (ADAMTSs) family. MMP activity is normally low under physiological conditions and is upregulated during repair or remodeling in disease or inflammation (31). MMPs are either soluble or cell membrane-anchored and have wide substrate specificities. Twenty-three human MMPs have been identified so far, and activation of the precursors occurs primarily via proteolytic cleavage by serine proteases or other MMPs (32, 33). Together, the ECM activity of MMPs plays an important role in organogenesis and branching morphogenesis (31).
In addition to the MMPs, serine and cysteine proteases also contribute to the maintenance of the ECM (16, 34). Many cell types can internalize ECM components through endocytosis, and degradation mostly occurs within the lysosomes, where aspartate and cysteine proteases are predominantly active (12).

The balance of ECM proteolysis by proteases and their inhibitors plays an important role in homeostasis. Dysregulation has been implicated in several diseases including cancer and cardiovascular diseases such as atherosclerosis (35). For example, invasion of cancer cells requires the remodeling of the ECM proteins and modifications of cell-matrix and cell-cell contacts to allow for tumour progression and cardiac hypertrophy (31). Proteolytic release and activation of ECM-sequestered growth factors and cytokines have previously been implicated in cancer progression and metastasis (36). Studies have also reported elevated levels of protease activity in atherosclerotic plaques in animal models of atherosclerosis (37, 38).

1.3. Cysteine Proteases
Cysteine proteases are found in all living organisms, performing numerous functions from catabolism to eliciting immune responses (39). They constitute approximately 26% of all human proteases and can be grouped into seven clans (superfamilies): Clans CA, CD, CE, CF, CL, and the two mixed Clans PA and PB (40). Each clan is further sub-divided into a total of 21 families (C1-C21) based on their amino acid sequence and three-dimensional structures. Many of the cysteine proteases belong to the papain family (C1) (41). The other members of the “papain-like” clan CA include C2 (calpains) and C10 (streptopains) (42).

The first cysteine protease isolated and characterized was papain from Carica papaya (43). Cysteine cathepsins are found in the lysosome, and their name is derived from the Greek word kathepsein, meaning to digest (42). There are eleven papain-like lysosomal cathepsins (B, C, F, H, K, L, O, S, V, X, and W).
The majority of the cathepsins (B, C, F, H, L, O, V, X) are ubiquitously expressed in human tissues (44). However, cathepsins K, W, and S are restricted to specific cell types or tissues (45, 46). CatK is highly expressed in osteoclasts, epithelial cells, and fibroblasts of the rheumatoid synovium. CatK is also unique in its ability to fully degrade triple helical collagen (47). Historically, cathepsins were denoted as acidic proteases and included proteases from other catalytic families such as the aspartic proteases (cathepsins D and E) and serine proteases such as cathepsin G (48, 49).

Each cathepsin contains three structural regions: a signal peptide, a propeptide, and the active catalytic domain (50). The signal peptide, averaging 10 to 20 amino acids, is responsible for directing the translocation of the pre-proenzyme into the endoplasmic reticulum during translation and is cleaved in the process (51). The propeptide varies in length, ranging from 38 amino acids in cathepsin X to 251 amino acids in cathepsin F (52). The propeptide aids in the proper folding of the enzyme and the transportation of the proenzyme to the lysosome (52). In addition, propeptides act as specific inhibitors of their enzymes to prevent premature activation prior to reaching the lysosome (53–55). Upon reaching the lysosome, processing of the enzyme begins with the cleavage of the propeptide. In the case of cathepsins, activation occurs through autocatalysis under the influence of a pH change, which triggers the disruption of the interactions between the propeptide and the mature domain (56). This shift allows the cleavage site in the propeptide loop to become accessible to the active site, and the propeptide is then cleaved off for enzyme activation either through autocatalysis or through cleavage by other lysosomal proteases (56).

Figure 1-4 Detailed catalytic mechanism of cysteine proteases
Cysteine proteases have a common catalytic mechanism involving a nucleophilic Cys.
The first step involves the deprotonation of the Cys by the His in preparation for the nucleophilic attack on the peptide carbonyl carbon. The amine terminus on the substrate is then released and the His is restored to its deprotonated form. Finally, the thioester bond is hydrolyzed to release the carboxylic moiety of the substrate and regenerate the Cys in the free enzyme (44).

Cysteine proteases contain a Cys-His-Asn catalytic triad at the active site (41). The His residue acts as a proton acceptor and enhances the nucleophilicity of the cysteine. The general mechanism of the reaction is outlined in Figure 1-4. The reaction begins with nucleophilic attack by the Cys on the carbonyl carbon of the reactive peptide bond, producing a tetrahedral intermediate whose oxyanion is stabilized by hydrogen bonding with a highly conserved Gln in the oxyanion hole. Next, the His donates its additional proton to the amine of the C-terminal portion of the substrate, transforming the tetrahedral intermediate into the acyl-enzyme by releasing the amine. The His residue then deprotonates a water molecule to produce a stronger nucleophile, which subsequently releases the cleaved substrate linked to the Cys and regenerates the protonated Cys found in the free enzyme.

The crystal structure of the cysteine protease papain was one of the first protein crystal structures to be determined, and its structure is representative of the cysteine cathepsin family (57). It consists of two domains, referred to as the left (L-) and right (R-) domain (57). The L-domain consists of three α-helices with a central helix over 30 residues long. The R-domain contains a β-barrel with several strands forming a coiled structure. The active site lies at the intersection of the two domains (57). The Cys is located at the end of the central helix of the L-domain and the His lies on the β-barrel of the R-domain.
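The Schechter and Berger numbering introduced earlier can be illustrated with a short Python sketch. It labels each residue of a peptide with its P position relative to a chosen scissile bond. The function name `label_positions` and its parameter `n_side` are hypothetical, chosen only for this illustration; the example peptide HPGGPQ is the modelled CatK substrate of Figure 1-5, cleaved between its two Gly residues.

```python
def label_positions(peptide: str, n_side: int) -> list[tuple[str, str]]:
    """Label residues with Schechter-Berger P positions.

    peptide: one-letter amino acid sequence written N- to C-terminus.
    n_side:  number of residues on the N-terminal side of the scissile
             bond (hypothetical parameter name used for this sketch).
    """
    if not 0 < n_side < len(peptide):
        raise ValueError("the scissile bond must lie between two residues")
    labels = []
    for index, residue in enumerate(peptide):
        if index < n_side:
            # Non-primed side: P1 is the residue directly before the bond.
            labels.append((f"P{n_side - index}", residue))
        else:
            # Primed side: P1' is the residue directly after the bond.
            labels.append((f"P{index - n_side + 1}'", residue))
    return labels


# Modelled substrate from Figure 1-5, cleaved between the two Gly residues.
for position, residue in label_positions("HPGGPQ", 3):
    print(position, residue)
```

Each P position binds the enzyme subsite with the same index, so in this example the Pro at P2 would occupy the S2 subsite.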
The active site of cathepsins can be further subdivided into subsites according to the Schechter and Berger model (Figure 1-5) (13). Four subsites (S4-S1) exist on the N-terminal side of the scissile bond and three on the C-terminal side (S1’-S3’). Each subsite accommodates its corresponding amino acid on the peptide (P4 to P3’) for positioning the substrate into the active site cleft. Peptidyl cleavage library screens and positional scanning have identified particular amino acid preferences for individual cathepsins in their subsites (58–60). For CatK, the S2 subsite is generally the most selective site, with a unique preference for a Pro residue, and is characterized by a deep and well-defined pocket (61) (Figure 1-5). Positively charged residues and Gly are favoured in the S1 and S3 positions, whereas other sites have less pronounced preferences (60).

Figure 1-5 General subsite definitions for CatK.
Overall subsite binding S3 to S3’ for CatK using a modelled peptide substrate with the sequence HPGGPQ. The scissile bond lies between S1 and S1’, corresponding to the amide bond between the Gly-Gly residues on the substrate. The catalytic residues Cys25 and His162 are coloured in yellow and blue, respectively.

1.3.1. Cathepsin K
Human CatK is a member of the MEROPS peptidase family C1, sub-family C1A (papain family, clan CA) and is predominantly expressed in osteoclasts (62). The gene encoding CatK is located on chromosome 1 at position 1q21 and is made up of eight exons (63). The promoter region of CatK contains consensus Sp1 binding sites, a GC-rich motif, and an NF-κB (nuclear factor κB) binding site, but no TATA-box (63). Regulation of the expression of CatK is complex and is key to regulating bone remodeling. Many agents that induce osteoclast formation and activation or inhibit osteoclast activity also enhance or suppress CatK expression, respectively.
RANKL (Receptor Activator of Nuclear Factor κB Ligand) plays an important role in osteoclast differentiation and activation (64). It is a membrane-bound factor produced by osteoblasts and binds to the cytoplasmic membrane receptor RANK, which subsequently induces osteoclast differentiation through an activation cascade (64). RANKL is also able to stimulate CatK mRNA and protein expression in osteoclasts through stimulation of TNF Receptor-Associated Factor 6 (TRAF6) and phosphorylation of Nuclear Factor of Activated T-cells-2 (NFAT-2) (65–67). TNF-α, which is a member of the same family of ligands as RANKL, is also able to stimulate CatK mRNA expression (68). Examples of other agents that can regulate RANKL (and subsequently CatK) include vitamin D, parathyroid hormone, glucocorticoids, and interleukins 1 and 11 (64). Conversely, estrogen and TGF-β act as inhibitors of RANKL expression and CatK expression (69). Expression of CatK is fairly limited compared to the ubiquitously expressed cathepsins B, L, and H (46). CatK mRNA has been detected in tissues such as the bone, colon, ovary, heart, skeletal muscle, lung, and small intestine (70). High levels of CatK expression have been detected in osteoclasts, smooth muscle cells, fibroblasts, macrophages, and lung epithelioid cells (65).

Figure 1-6 Overall structure of CatK.
Structural representation of CatK with the active site residues (Cys25 and His162) displayed as sticks (PDB ID: 5TUN).

The overall structure of CatK is highly conserved with other papain-like cysteine proteases (50) (Figure 1-6). The complete protein including its signal sequence (15 amino acids) and propeptide (99 amino acids) is 329 amino acids long. The fully processed, mature enzyme is 215 amino acids in length and has a molecular weight of 23,495 Da (71). CatK shares about 60% protein sequence identity with cathepsins L, S, and V and less than 35% with cathepsins F, O, B, H, and W (71).
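As a quick consistency check on the lengths quoted above, the following Python sketch lays the three segments end to end and derives their residue ranges in the precursor. It assumes the segments are contiguous in the order signal peptide, propeptide, mature enzyme; the helper `segment_ranges` and the constant names are hypothetical and serve only this illustration.

```python
# Segment lengths (amino acids) and mature mass (Da) as quoted in the text.
SIGNAL_PEPTIDE = 15
PROPEPTIDE = 99
MATURE_ENZYME = 215
MATURE_MASS_DA = 23_495


def segment_ranges(lengths):
    """Return 1-based inclusive (start, end) ranges for contiguous segments."""
    ranges = []
    start = 1
    for length in lengths:
        ranges.append((start, start + length - 1))
        start += length
    return ranges


ranges = segment_ranges([SIGNAL_PEPTIDE, PROPEPTIDE, MATURE_ENZYME])
total_length = ranges[-1][1]  # last residue of the mature enzyme
mean_residue_mass = MATURE_MASS_DA / MATURE_ENZYME  # average Da per residue

print(ranges)
print(total_length, round(mean_residue_mass, 1))
```

Under that assumption the three segments add up to the stated 329 residues, and the mature enzyme averages roughly 109 Da per residue.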
CatK exists as a monomer and contains a high density of positively charged amino acid residues on the face of the R-domain opposite the active site cleft (72). This region has been demonstrated to interact with extracellular matrix glycosaminoglycans (GAGs), such as chondroitin 4-sulfate (C4-S), via ionic interactions, and this interaction plays a role in governing its physiological functions including its unique collagenase activity (73). The previously mentioned preference for Pro in the S2 and Gly in the S3 position allows for efficient cleavage of collagen peptides, which largely consist of G-P-Y sequences (74).

1.3.2. Cathepsin K and Bone Remodeling
One of the most distinctive enzymatic activities of CatK is its ability to degrade collagen. Collagens are among the most abundant proteins in the human body and make up approximately one-third of the total protein in humans (75). There are 28 different types of collagen containing at least 46 distinct polypeptide chains identified in vertebrates (75). The different types of collagen are characterized by their complexity and diversity in structure, splice variants, assembly, and function (76). Collagen is the major component of the extracellular matrix of connective tissues. It comprises 90% of the organic matrix in bones (76, 77).

The distinctive structural motif of collagen is a bundle of three parallel left-handed polypeptide strands wound together into a right-handed triple helix (78) (Figure 1-7A). Tight packing is required to achieve this coiled structure, and therefore every third residue is typically a glycine, which lacks a side chain (78) (Figure 1-7B). This results in a repeating Gly-Xaa-Yaa sequence in collagen, where Xaa and Yaa are frequently proline and hydroxyproline, respectively (79). Individual collagen triple helices, known as tropocollagen, assemble in a hierarchical manner, leading via fibrils to the macroscopic fibers and networks that are found in tissues, bone, and basement membranes (80).
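The Gly-Xaa-Yaa periodicity described above can be checked on a sequence with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `gly_repeat_frame` is hypothetical, and the one-letter code "O" for hydroxyproline follows a convention common for collagen model peptides rather than anything specified in this chapter.

```python
from typing import Optional


def gly_repeat_frame(sequence: str) -> Optional[int]:
    """Return the frame (0, 1 or 2) in which every third residue is Gly.

    Returns None if no frame places a glycine at every third position.
    """
    for frame in range(3):
        positions = range(frame, len(sequence), 3)
        if len(positions) > 0 and all(sequence[i] == "G" for i in positions):
            return frame
    return None


# Idealized collagen-like repeat: Gly-Pro-Hyp, with "O" standing for Hyp.
print(gly_repeat_frame("GPOGPOGPO"))
# The same repeat read from a different starting residue.
print(gly_repeat_frame("POGPOGPOG"))
# A single Gly-to-Ala substitution breaks the repeat in every frame.
print(gly_repeat_frame("GPOAPOGPO"))
```

A helper of this kind only tests the glycine periodicity; it says nothing about whether a sequence actually folds into a triple helix.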
The arrangement of the triple helix allows for collagen’s resistance to protease degradation by protecting the regions that are vulnerable to proteolysis (77). The only mammalian proteases that have been characterized to degrade the native triple helical region of type I collagen are the collagenases in the MMP family (MMP-1, MMP-2, MMP-8, MMP-13, and MMP-14), the neutrophil serine elastase, and CatK (81, 82).

Figure 1-7 Triple helical structure found in collagen
(A) Overview of the collagen triple helix from a synthetic peptide shows the tight packing found in the collagen polypeptide. (B) A molecular representation of a single turn in the helix shows the orientation of the Pro residues relative to the helix. Figure adapted and reprinted with permission from Shoulders, M. D. & Raines, R. T. Collagen structure and stability. Annu. Rev. Biochem. 78, 929–58 (2009).

Due to its collagenase and elastase activities, dysfunction in the regulation of CatK activity has been linked to various musculoskeletal and cardiovascular diseases (83). Overexpression and excessive CatK activity lead to bone-related disorders such as osteoporosis and Paget’s disease, characterized by global and localized loss of bone mineral density (84). Excessive CatK activity has also been implicated in rheumatoid arthritis and osteoarthritis through the breakdown of aggrecans and fibrillar type II collagen, the major constituents of the ECM in articular cartilage (85, 86). In addition to being a potent collagenase, CatK is also an effective elastase and has been linked to elastin-related pathologies such as atherosclerosis and lung fibrosis (87, 88).

Pycnodysostosis is a rare autosomal recessive disorder of the bone caused by the lack of CatK activity and subsequent osteoclast dysfunction (89). The disorder was first defined in 1962 and was characterized by a short stature, osteosclerosis with bone fragility, and atypical facial features (90).
To date, close to fifty different mutations in the CatK gene have been reported, including nonsense, missense, frame shift, and splice site mutations (91). Pycnodysostosis patients display decreased bone resorption but normal bone formation. Bone formation markers are normal whereas bone resorption markers are significantly decreased (46). Mental and motor development in patients is also generally normal (91). The unique phenotypes observed in pycnodysostosis highlight the important physiological role of CatK and suggest that inhibition of its activity can be beneficial in diseases characterized by excessive bone degradation (91).

Bone tissue in humans is renewed continuously in the bone remodelling cycle and is in a dynamic balance between osteoblastic bone formation and osteoclastic bone resorption (92). Osteoclasts are multinucleated cells formed from the fusion of mononuclear progenitor cells in the monocyte/macrophage family. During osteoclastogenesis, two hematopoietic factors, macrophage colony-stimulating factor (M-CSF) and RANKL, are required (64). Upon binding of these factors to their specific receptors, a series of signalling cascades is activated and mediates the formation of an osteoclast. Inhibition of M-CSF receptor signalling (Ki20227) and of RANKL (denosumab) has been considered as a viable strategy to reduce the number of osteoclasts (93, 94).

Bone resorption is a multistep process that ultimately results in the degradation of the organic and inorganic phases of the bone by osteoclasts. One of the unique features of the osteoclast is its ability to polarize on bone and form a “ruffled membrane” when the cell is attached to the bone (95). This initial event generates a tight adhesion between the cell and the bone surface, also known as the “sealing zone.” Dissolution of the inorganic phase, which consists mostly of hydroxyapatite, occurs via acidification (95). Upon mobilization of the bone mineral, CatK is secreted and degradation of the collagenous bone matrix can occur.
Liberated collagen fibers or their fragments are endocytosed by the osteoclast and subsequently completely degraded (96).

Figure 1-8 Biological process in osteoclast differentiation and bone remodeling
The formation of the osteoclast is a multistep process that has been targeted for therapeutic intervention. Primary activators of osteoclast differentiation include M-CSF and RANKL. The active osteoclast resorbs bone through the dissolution of the inorganic matrix by acidification and the degradation of the organic bone matrix by secreted CatK. Figure adapted and reproduced with permission from Bi, H. et al. Key Triggers of Osteoclast-Related Diseases and Available Strategies for Targeted Therapies: A Review. Front. Med. 4, 234 (2017) (97).

Osteoporosis is a common disorder characterized by a decrease in bone mass and mineral density leading to a structural deterioration of the bone (96). In patients with osteoporosis, excessive CatK activity and bone resorption cause bone fragility and a significant increase in the likelihood of fractures (83). Postmenopausal osteoporosis is the most common type of osteoporosis and is caused primarily by the decline in estrogen levels associated with menopause (98). The pathogenesis of postmenopausal osteoporosis is characterized by a dysfunction of bone maintenance, leading to an increase in bone resorption that exceeds formation. The current view is that estrogen plays a role in the regulation of osteocyte and osteoclast formation, the expression of tumour necrosis factor α (TNF-α), and the osteoprotegerin (OPG), RANKL, RANK system for regulating bone resorption (99). Estrogen inhibits the activation of bone remodeling and the differentiation of osteoclasts and reduces the expression of cathepsin K (100, 101). Recent studies have reported that serum estrogen levels are inversely associated with levels of sclerostin, a key inhibitor of Wnt signaling produced by osteocytes (102).
Estrogen treatment in postmenopausal women has been shown to reduce circulating sclerostin levels (69, 103). Moreover, it has been suggested that Wnt signaling requires estrogen and that, when Wnt signaling is interrupted in the mesenchyme, osteoblast differentiation is reduced and thus bone remodeling is affected (104). Therefore, the relationship between estrogen and Wnt signaling pathways is likely to be mediated by the osteocyte (104). Another dominant effect of estrogen is the blockage of new osteoclast formation by modulation of RANK signaling (102). Withdrawal of estrogen is also associated with apoptosis of osteocytes.

Given that postmenopausal osteoporosis is due to an imbalance between bone resorption and formation, inhibition of bone resorption has been an important target for the development of anti-osteoporotic therapies. Despite the availability of diagnostic tools and treatment options, osteoporosis remains a significant public health burden (105, 106). Drugs currently approved for treatment are categorized as antiresorptives (inhibitors of bone resorption) or anabolics (stimulators of bone formation) (99). Current primary treatments include bisphosphonates, Selective Estrogen Receptor Modulators (SERMs), and denosumab. A summary of current drugs used for the treatment and management of osteoporosis is given in Table 1-2. However, all current treatments are limited by the coupling of bone resorption and formation: a change in one pathway subsequently affects the other (107). The long-term efficacy of these drugs is therefore limited, which intensifies the demand for safe long-term therapeutics.

Table 1-2 Current major therapeutics for the treatment of osteoporosis
Therapeutic | Target/Action | Adverse Side Effects
Bisphosphonates | Inhibit osteoclast function. | Prolonged use can lead to increased risk of fractures and poor bone quality (108, 109). Less commonly, osteonecrosis of the jaw.
Selective estrogen-receptor modulators (raloxifene, bazedoxifene, tamoxifen) | Bind estrogen receptors and exert agonist or antagonist effects depending on the organ. | Menopausal symptoms such as hot flashes, and deep vein thrombosis. No permanent impact on bones.
Parathyroid hormone (teriparatide, abaloparatide) | Anabolic agents stimulating bone formation and inhibition of resorption. | Increased incidence of osteosarcoma observed in rats (110, 111). Minor adverse events including nausea, headache, dizziness.
Denosumab | Antibody against receptor activator of nuclear factor-κB ligand (RANKL) that prevents osteoclast formation. | Generally well tolerated; rare cases of dermatitis and osteonecrosis of the jaw (107). Administration by injection required.

Because of its role in osteoporosis, CatK has become a promising therapeutic target. Development of active site-directed inhibitors showed early promise, but all have failed in clinical trials due to side effects (112). Recent exosite inhibitors targeting the collagenase activity of CatK have shown promise in selectively blocking bone resorption (113–115). Details regarding the two types of inhibitors are described in the following subsections (Sections 1.4.2 and 1.4.4).

Because of their promising activities, a major focus of this thesis is on the identification and characterization of exosite inhibitors that selectively block the collagenase activity of the enzyme. Chapters 3, 4, and 5 describe the development of computational, high-throughput screening, and crystallographic methods to identify and characterize novel exosite inhibitors of CatK.

1.4. Protease Inhibition
Due to the multitude of roles that proteases play in both regulation and disease pathology, numerous protease inhibitors have been developed as therapeutics in a wide range of applications from cancer to respiratory and gastrointestinal diseases (116–118).
However, targeting proteases is complicated by a number of factors, the most significant being the specificity of the inhibitor for a particular protease (119, 120). Due to the similarities in tertiary structures as well as cell expression patterns, off-target side effects remain a significant hurdle in any drug design (121–123). Significant care and effort are required to ensure that the inhibitor has high specificity for the target protease (124). Nonetheless, protease inhibitors are widely used in blood coagulation (thrombin and TPA inhibitors) and blood pressure treatments (angiotensin-converting enzyme inhibitors) as well as in antiviral drugs for treating HIV and hepatitis C, which act by inhibiting the proteolytic cleavage of proteins required for the production of viral particles (125, 126).

1.4.1. Active Site-Directed Inhibition in Proteases
Inhibitors targeting the active site of proteases are the most frequently researched due to the remarkable potency that can be achieved (127). The majority of these inhibitors are competitive inhibitors and bind in the active site in a substrate-like manner. Since related proteases often have high homology and similar tertiary structure in their catalytic domains, inhibition of multiple proteases can often be achieved using a single inhibitor (6, 128, 129). The most thoroughly developed protease inhibitors are those that bind in the active site of the protease and are complementary to the substrate specificity of the target protease (130–133). Inhibitors interact with the protease subsites and active site residues in a non-catalytic manner and increase their affinity through interactions with the residues found in the active site (Figure 1-9) (6).

Figure 1-9 Scheme of covalent active site-directed inhibition
The catalytic residue forms a covalent bond with the warhead (usually via a nucleophilic attack on the reactive warhead atom).
The “missile region” confers specificity towards the target protease and allows for the inhibitor to specifically interact with the enzyme.

Some of the earliest inhibitors developed for proteases exploited the attachment of a reactive “warhead” to an effective substrate (“missile”) for efficient binding to the enzyme (6, 128, 129). Usually the warhead is an electrophilic moiety that forms a covalent bond with the catalytic residue of the protease (Figure 1-9). Early warheads were alkylating agents such as haloketones (134, 135). Irreversible binding with the catalytic residues produces a permanent covalent modification of the enzyme. The formation of the modified enzyme-inhibitor complex traps the enzyme in an inactive state, and thus these inhibitors were termed irreversible inhibitors (136–138). The inhibitor often has a substrate-like character and uses the enzyme’s catalytic machinery to react with the enzyme. Therefore, specificity for the target protease is critical to inactivate the target exclusively. Specificity is often conferred by exploiting the substrate specificity of the protease, such as the shape and electrostatic character of the substrate binding pockets (139, 140). However, many irreversible inhibitors are able to inhibit multiple proteases due to the similarities in their tertiary structures (140). The mechanisms of several types of protease inhibitor warheads, focused on serine and cysteine proteases, are described below, and a detailed list of common warheads can be found in Table 1-3.

Table 1-3 List of major protease inhibitor warheads
(The General Warhead Structure column of the original table contains chemical structure drawings.)
Inhibitor Warhead | Primary Targets (Example) | Mechanism of Action | Notable Examples (Targets) | Other Notes
Halomethyl ketones | Serine/cysteine proteases (trypsin) | Covalent/Irreversible | Tosyllysyl chloromethyl ketone (TLCK) (141) | Lack of specificity; can react with other non-target enzymes, resulting in low biological utility.
Diazomethyl ketones | Cysteine proteases (cathepsins) | Covalent/Irreversible | Z-Phe-Ala-CHN2 (142), Z-Phe-Phe-CHN2 (142) | Directly alkylates the catalytic Cys residue.
Acyloxymethyl ketones | Cysteine proteases | Covalent/Irreversible | AOMK targeting CaaX proteases (143) | Originally developed to improve the clinical utility of halomethyl ketones.
Epoxysuccinates | Cysteine proteases (papain, cathepsins) | Covalent/Irreversible | E-64 (144), Estatins (130), Cathestatins (145) | Highly specific for Cys proteases through alkylation. Diagnostic reagent for cysteine proteases.
Vinyl sulfones | Cysteine proteases (cruzain, proteasome) | Covalent/Irreversible | K11777 (146), peptidyl vinyl sulfones (147) | Specific for Cys proteases through Michael addition.
Aza-peptides | Serine/cysteine proteases | Covalent/Irreversible | Aza-peptide esters (148) | Inactivation occurs through acylation of the catalytic residue.
Carbamates | Serine proteases (fatty acid amide hydrolase) | Covalent/Irreversible | N-Hydroxyhydantoin carbamates (149, 150) | Formation of N-alkyl carbamyl derivatives. Short biological half-life limits therapeutic uses.
β-Lactams | Serine/cysteine proteases (transpeptidases) | Covalent/Irreversible | Cephalosporins (151) | Formation of acyl enzyme derivative. Commonly used in antibacterials through blockage of cell-wall synthesis.
Isocoumarins | Serine proteases (elastases) | Covalent/Irreversible | 3,4-Dichloroisocoumarin (131) | Formation of stable acyl enzyme intermediate. Specificity conferred by modification of functional groups.
Phosphonates | Serine proteases | Covalent/Irreversible | Diisopropylfluorophosphate (152) | Formation of stable tetravalent phosphonylated derivative.
Sulfonyl fluorides | Serine/cysteine proteases | Covalent/Irreversible | Phenylmethylsulfonyl fluoride (PMSF) (153) | Formation of sulfonyl enzyme intermediate.
Nitriles | Cysteine proteases | Covalent/Reversible | Odanacatib (154) | Formation of covalent thioimidates between nitrile and catalytic residue. Reverse reaction generating nitrile and free residue is possible.
α-Ketoamides | Serine proteases | Covalent/Reversible | α-Ketoamide Phe-Pro isostere (155) | Targets NS3/4A protease and rhomboid proteases.
Saccharins | Serine/cysteine proteases (elastases, chymotrypsin) | Covalent/Irreversible | N-acylsaccharins (156) | Formation of acyl enzyme intermediate with heterocyclic ring.
Aldehydes | Serine/cysteine proteases (trypsin) | Covalent/Reversible | Leupeptins, antipains, chymostatins (157, 158) | Formation of hemiacetal adduct between aldehyde and catalytic residue.
Boronic acids | Serine proteases (Lon protease) | Covalent/Reversible | Peptidyl boronates (159) | Highly potent transition state analogues forming covalent adducts.

Halomethyl ketones were among the first affinity labels developed for serine proteases and were one of the first active site-directed irreversible inhibitors reported for any enzyme (134). Tos-LysCH2Cl (TLCK) and Tos-PheCH2Cl (TPCK) were initially developed as specific peptide-based labels against trypsin and chymotrypsin, respectively (160). They function by irreversibly alkylating the active site His residue commonly found in Ser and Cys proteases (Figure 1-10A) (160). However, due to their reactivity, they often inhibit off-target proteases and other biomolecules such as glutathione (6). Therefore, they are unsuitable for many in vivo experiments and thus as therapeutics (6). Diazomethyl and acyloxymethyl ketones are other alkylating warheads that target cysteine proteases through irreversible alkylation of the active site thiol group (142, 161, 162). Z-Phe-Ala-CHN2 and Z-Phe-Phe-CHN2 were originally developed as active site inhibitors of cysteine cathepsins and function at stoichiometric ratios (163, 164). Acyloxymethyl ketones were designed as an alternative to halomethyl ketones with reduced chemical reactivity that would inactivate the target protease exclusively (162).
Acyloxymethyl ketones inhibit cysteine proteases through alkylation of the active site Cys residue to form a thioether ketone (Figure 1-10B) (162). Numerous peptide groups and derivatives of the acyloxymethyl ketone moiety have been developed to increase the selectivity and reactivity towards a variety of cysteine proteases (165). Due to their decreased reactivity, acyloxymethyl ketones are selective towards cysteine proteases and do not show inhibitory activity towards other classes of proteases and biomolecules (162).

Figure 1-10 Reaction mechanisms for selected covalent protease inhibitor warheads. The proposed mechanisms of inhibition for several protease inhibitor warheads: halomethyl ketones (A), acyloxymethyl ketones (B), epoxysuccinyls (C), nitriles (D), aldehydes (E).

In the development of novel therapeutics, irreversible active site-directed inhibitors have been largely avoided by the pharmaceutical industry due to their potential autoimmune and toxicity properties (166). Irreversible inhibitors are suggested to carry a higher risk of unpredictable side effects due to the generation of modified proteins (haptens), non-specific binding to off-target proteins, and the difficulty of tracking metabolites that are covalently bound to proteins (166). Therefore, most drug development programs have shifted focus from irreversible to reversible inhibitors (166).

Reversible inhibition refers to a decrease in enzymatic activity that can be restored through the removal or dilution of the inhibitor (167, 168). The term does not dictate the type of interaction that the inhibitor makes with the enzyme: a reversible inhibitor can still form covalent bonds with the catalytic site (169–171). Most reversible inhibitors are developed as transition state analogs and take advantage of the greater affinity of the enzyme for the transition state than for the native substrate in the ground state (157, 172, 173).
Aldehydes and nitriles have been demonstrated to be effective reversible inhibitor warheads for serine and cysteine proteases (41, 124). Their efficacy is suggested to be due to stable covalent adducts that resemble the transition state of the protease reaction (Figure 1-10D-E). The enzyme-inhibitor adduct is resistant to hydrolysis and is stabilized by the residues forming the oxyanion hole (172).

Peptidyl aldehydes were initially identified as cysteine and serine protease inhibitors by screening culture filtrates of different Streptomyces strains (124). Selectivity is conferred by varying the peptide segment in the P1 to P4 positions to match the substrate specificity of the target (124). The nucleophilic addition between the catalytic Cys (Ser) residue and the aldehyde forms a stable tetrahedral hemiacetal. Peptidyl aldehydes are considered slow-binding inhibitors and show a lag phase before reaching steady-state inhibition (41). However, most of the naturally occurring peptidyl aldehydes, including leupeptin and antipain, have limited selectivity and inhibit multiple cysteine cathepsins (41). This lack of selectivity also leads to many in vitro and in vivo side reactions and makes it difficult to attribute the effects of the inhibitor to an individual target protease (41).

Peptidyl nitriles also act as reversible inhibitors of cysteine proteases. Nucleophilic attack by the catalytic thiol on the nitrile carbon produces a stable isothioamide (174) (Figure 1-10D). Nitriles are generally weaker inhibitors than the corresponding aldehydes, which makes them more specific for cysteine proteases. Some of the most potent reversible inhibitors of cathepsins are in fact nitrile-based and have been investigated as potential therapeutics for the treatment of osteoporosis (154).

1.4.2. Side Effect Issues of Active Site-Directed Inhibitors of CatK

Due to its role in osteoporosis, CatK has become a promising therapeutic target and has been the focus of multiple drug design programs. Early CatK inhibitors were irreversible active site-directed inhibitors such as E-64-related epoxysuccinyl derivatives and vinyl sulfones. However, they caused antigenic and immunologic complications and were not suitable for long-term treatment (175).

To combat these side effects, drug development focused on reversible inhibitors containing amide, ketone, nitrile, or aldehyde warheads (132, 176). Major challenges in inhibitor design included satisfying drug-like property requirements while maintaining effective pharmacological profiles for chronic use. For targeting CatK, it was thought to be beneficial to direct the drugs into lysosomes (177), as cathepsins are mostly lysosomal enzymes. However, this approach backfired: lysosomotropic CatK inhibitors such as balicatib also accumulated in CatK-expressing fibroblasts and caused skin-fibrotic adverse events, which led to the termination of the balicatib trial (178). Likely for this reason, odanacatib, a similar nitrile-based CatK inhibitor, was designed as a non-lysosomotropic drug (154). In clinical trials it showed fewer skin side effects but was ultimately abandoned because of an increased risk of cardiovascular events (179), despite excellent bone-preserving outcomes in an osteoporosis trial of 17,000 postmenopausal women (180).

Besides balicatib and odanacatib, several additional active site-directed CatK inhibitors have advanced into various stages of clinical development, including relacatib, ONO-5334, and MIV-711 (Table 1-4). However, none of these CatK inhibitors has been approved.
Relacatib was terminated due to undesired drug interactions, and ONO-5334, a hydrazine-based inhibitor with selectivity against CatK and CatS, was ultimately discontinued after Phase II clinical trials (181, 182). MIV-711, of undisclosed structure, is the only remaining CatK inhibitor currently in clinical trials, for the treatment of osteoarthritis, and has shown safety and tolerability (183). Table 1-4 summarizes the CatK inhibitors that have entered clinical trials.

A further key challenge during the development of selective and potent CatK inhibitors was their low potency towards mouse CatK (112). Rodent models of osteoporosis induced by ovariectomy are often used to study the effects of bone loss (184). However, due to the significant reduction in potency of human CatK-specific inhibitors toward rodent CatK, preclinical trials of CatK inhibitors at clinically relevant concentrations were difficult or were not performed. Investigating the structural differences between human and mouse CatK using X-ray crystallography and mutagenesis to understand the selectivity of odanacatib and balicatib became a key objective of my Aim 1 (Chapter 2) and is outlined in more detail in Section 1.6.

Table 1-4 Summary of active site-directed CatK inhibitors in clinical trials
Inhibitor | Structure | Other Notes
Balicatib (Novartis) (178) | | • Effective increase in BMD in spine and femur. • Discontinued after Phase II trials due to skin reactions.
Relacatib (GlaxoSmithKline) (185) | | • Specificity towards CatK, CatL, CatV. • Discontinued after Phase I trials due to drug interactions.
Odanacatib (Merck) (154, 179, 180) | | • High specificity for CatK. • Showed efficacy in BMD increases in hip, spine and neck to 5 years. • Discontinued after Phase III trials due to cardiovascular side effects.
ONO-5334 (Ono Pharmaceuticals) (181, 182) | | • Showed effective reduction in bone-resorption markers without affecting bone formation. • Discontinued after Phase II trials due to business reasons.
MIV-711 (Medivir) (183, 186) | Undisclosed | • Phase II trials for treatment of knee osteoarthritis completed.

1.4.3. Exosite Directed Inhibition in Proteases

In addition to the active site, secondary sites on the protease may also play a pivotal role in facilitating enzymatic activity. These sites are generally termed exosites because they are remote from the active site of the enzyme (187–189). Exosites have been extensively characterized in proteases and can regulate numerous protease activities by controlling ligand or substrate binding and conformational changes.

One of the earliest examples of exosites in proteases was found in thrombin, a serine protease with an important role in the clotting process (190, 191) (Figure 1-11). Affinity labeling studies initially identified several secondary sites remote from the active site but could not completely explain their function (192). They were termed exosites due to their spatial separation from the active site. Crystal structures of the enzyme, together with mutagenesis and kinetic studies, led to the discovery of two electropositive regulatory sites on the thrombin surface (133, 193). However, these studies did not reveal significant conformational changes within the protein of the kind associated with allosteric regulation.

Exosite I is required for the binding of fibrinogen to thrombin (194, 195) (Figure 1-11). The β- and γ-forms of the enzyme, in which this exosite is proteolytically removed, are unable to cleave fibrinogen despite their normal catalytic activity toward synthetic peptide substrates (196). This site also accounts for adhesion to negatively charged surfaces and binding to cell surfaces (197). Hirudin, a highly potent and specific polypeptide inhibitor of thrombin, binds to this site and prevents the binding of fibrinogen to the enzyme, thereby inhibiting its proteolysis.
Exosite II is known as the heparin-binding site (193, 198) (Figure 1-11). Heparin is a negatively charged glycosaminoglycan that acts as a cofactor for the inhibition of thrombin by antithrombin: it forms a ternary complex and mediates the binding between thrombin and antithrombin (198). The exosites are also important for the recognition of additional substrates such as Factor V and Factor VIII, and of cofactors such as thrombomodulin (199, 200).

Figure 1-11 Exosites identified in thrombin. (A) shows a 3D representation of human thrombin with the charge density illustrated in blue for basic or positively charged regions and red for negatively charged or acidic regions. Exosites I and II and the active site are highlighted as shown. Two extended surface loops, the 60 loop and the γ-loop, are shown in yellow and orange, respectively. (B) shows human thrombin where the positively charged amino acid positions studied by site-directed mutagenesis in exosites I and II are marked in blue. The active site serine is shown in green. Figure adapted and reproduced with permission from Chahal, G. et al. (2015) The Importance of Exosite Interactions for Substrate Cleavage by Human Thrombin. PLoS One. 10, e0129511.

Exosites are also important for the degradation of collagen by the multidomain MMPs. Initial mutagenesis studies implicated one or more exosites in the hemopexin (HPX) domain (201). Structural and mutagenesis studies of the HPX domain bound to a synthetic triple helical peptide implicated blade 1 of the β-propeller domain in the binding and unwinding of collagen (202). Additional studies based on NMR and X-ray crystallography showed that the catalytic domain of MMP-1 is positioned adjacent to the HPX domain, with the collagen triple helix bridging the cleft between the two domains (203). This allows the triple helix to bend and subsequently release an α-chain, which enters the catalytic cleft for proteolysis (203).
The presence of unique exosites in MMPs has also provided opportunities to develop inhibitors that target specific enzymes despite overall structural similarity (204, 205).

Allosteric regulation is among the best characterized types of exosite regulation (200, 206–208). Allostery describes the regulation of enzymatic activity through conformational changes in the active site induced by ligand binding at allosteric sites. In allosteric activation, ligand binding promotes enzymatic activity, often through conformational shifts in the active site or its surrounding area (209, 210) (Figure 1-12A). Allosteric inhibition is less common in organisms than activation and describes enzyme inhibition through the binding of an inhibitory ligand at an allosteric site (209) (Figure 1-12B).

Figure 1-12 Scheme of allosteric activation and inhibition. In the simple case of allosteric activation, an allosteric activator binds at an allosteric site and induces a conformational change in the active site that allows cleavage of the substrate. In allosteric inhibition, the binding induces a change in the active site such that it can no longer cleave the substrate.

Because proteins usually adopt dynamic conformations, allosteric conformational changes are generally explained by two different models (211, 212). According to the induced fit model, a ligand interacting at an allosteric site induces a conformational change in the protein at the active site (211). This model is often used in X-ray crystallography studies to explain conformational rearrangements in proteins upon binding of an allosteric regulator (213). A second model used to describe allosteric changes is based on population shift. A population of protein molecules can adopt different conformations depending on factors such as solvent and temperature (212).
NMR spectroscopy and molecular dynamics simulations suggest that proteins are generally dynamic and can adopt multiple conformations (212). Under a given set of conditions, one conformation predominates (212). Upon binding of an allosteric regulator, however, the predominant conformation shifts to a different one due to the interactions between the ligand and the enzyme (214). A key difference from the induced fit model is that the new conformation is already part of the pre-existing population and the ligand merely shifts the population of molecules towards it (215). Nonetheless, in both models of allosteric regulation, a conformational change is observed in the protein upon binding of an allosteric ligand and subsequently affects enzymatic activity.

Several examples of allosteric regulation can be observed in caspases. Caspases are cysteine proteases from the C14A family that play a critical role in inflammation and cell death (216). There are twelve characterized caspases in humans; seven play a role in apoptosis and are classified as initiator (caspases-2, -8, -9, -10) or executioner (caspases-3, -6, -7) caspases (216).

Caspase-9, a well-characterized initiator caspase, is inhibited allosterically by the third baculoviral IAP repeat (BIR3) domain of XIAP, which binds at an allosteric site remote from the active site (217). The enzyme is produced as a zymogen, and its activation is accompanied by rearrangement of the active site loops associated with homodimerization of the enzyme (218). The crystal structure of the enzyme-inhibitor complex suggests that binding of the inhibitor prevents the protease from achieving its active state (217) (Figure 1-13A).

Figure 1-13 Allosteric inhibition of caspases-9 and -7. (A) Overall view of caspase-9 (blue) and BIR3 (green) showing the inhibitor binding at an allosteric site remote from the active site.
The active site loops (purple) are locked in an inactive conformation. The XIAP-BIR3 domain is colored green, with the bound zinc atom in red. Figure adapted and reproduced with permission from Shiozaki et al. (2003). Mechanism of XIAP-Mediated Inhibition of Caspase-9. Mol Cell. 2003 Feb;11(2):519-27. (B) A network of interactions linking the active site residues (Cys-285 and Arg-286) and the bound inhibitor (yellow) suggests propagation of the allosteric regulation signal in caspase-7. Figure adapted and reproduced with permission from Scheer et al. (2006) A common allosteric site and mechanism in caspases. Proc Natl Acad Sci U S A. 2006 May 16; 103(20): 7595–7600.

Disulfide trapping using 10,000 thiol-containing compounds against surface Cys residues of caspase-1 identified compounds that bind at an allosteric site 15 Å away from the active site through reversible conjugation with the Cys side chain (219, 220). Reaction with the Cys at this site functionally inactivates the protease, suggesting functional coupling between the allosteric site and the active center of the enzyme (220). The X-ray crystal structures of the inhibitor-enzyme complex revealed a network of hydrogen-bonded residues leading from the allosteric site to the active site (220) (Figure 1-13B). In particular, the inhibitor prevents the formation of a salt bridge between Arg-286 and Glu-390 that is associated with a conformational transition required for enzyme activity. Mutagenesis of these residues reduced the enzymatic activity by over 400-fold, supporting the proposed inhibitory mechanism (220). Using a similar method, an analogous allosteric site was identified in caspase-7 despite only 23% sequence identity, suggesting that allostery may be widespread within the caspases (221).
Inositol hexaphosphate (InsP6) is a eukaryotic-specific small molecule that regulates enzymes in DNA repair and RNA editing and has been characterized as an allosteric activator of the Family C80 cysteine protease domain (CPD) (222, 223). CPD is a bacterial protease found in the Multifunctional Autoprocessing RTX-like (MARTX) and large glucosylating (LGT) toxin families and modulates the virulence of several bacterial pathogens, including those responsible for cholera and Clostridium difficile-associated disease (224–227). Mutagenesis, kinetic, and binding assays suggested that InsP6 binds to a distal exosite through electrostatic interactions with its negatively charged phosphates and is essential for enzymatic activity (228). Subsequent studies using fluorescent probes to measure active site exposure suggested that the binding of InsP6 induces conformational changes in the active site by stabilizing the enzyme, making it more resistant to proteolysis and heat denaturation (229). X-ray structural studies further characterized the mechanism of this conformational change. A three-strand structure termed the β-flap was found to respond dynamically to InsP6 binding and is required for substrate binding in the active site (229). In addition, mutations in this region retain InsP6 binding but are detrimental to enzyme activity. The flap was found to be conserved in Clostridium difficile Toxin A, suggesting that it may be conserved among the other proteases in the family (229).

In order to distinguish secondary binding sites where no conformational changes are observed upon ligand, inhibitor, or protein binding from those where conformational changes are observed (such as the traditional allosteric regulatory sites described previously), we introduce the concept of the ectosteric site.
Ectosteric binding sites refer to exosites on the enzyme that are required for specific enzymatic activities and can include protein-protein oligomerization sites or secondary ligand binding sites. In contrast to traditional allosteric sites, ectosteric sites do not induce any conformational changes in the enzyme. The aforementioned fibrinogen and heparin binding sites of thrombin, which do not induce substantial conformational changes in the catalytic site, can be classified as ectosteric sites. Potential ectosteric sites have also been implicated in the cathepsin family, including CatK (described in more detail in the following section) and CatV. Mutagenesis studies have shown the presence of two unique elastin binding sites that are required for its elastase activities (207, 230).

1.4.4. Exosite Directed Inhibition in Cathepsin K

Exosites have also been implicated in the degradation of collagen and elastin by cysteine cathepsins (201, 202). In particular, for CatK, protein oligomerization mediated by glycosaminoglycans has been demonstrated to be required for its collagenase activity (73, 231, 232). Despite possessing a narrow active site with a width of less than 10 Å, CatK is highly efficient at cleaving the tropocollagen unit, which has a diameter of approximately 15 Å (78). The enzyme does not contain a hemopexin-like domain such as that used by MMPs to unwind the triple helix (202). In addition, CatK is able to cleave collagen within both the telopeptides and the triple helical domains (232). This collagenase activity has been demonstrated to require the formation of oligomeric complexes with ECM-resident GAGs (73, 232). Without GAGs, the collagenase activity of CatK is significantly reduced and CatK is only able to cleave collagen in its telopeptide regions (73). The addition of GAGs allows the enzyme to unfold triple helical collagen and completely digest tropocollagen into soluble peptides (232).
The formation of the oligomers needed for collagen degradation requires secondary protein-protein interaction sites remote from the active site of the enzyme. We previously introduced the term ectosteric site to describe these secondary sites, which are required for enzymatic activity but do not elicit any conformational changes. In the case of CatK, the protein-protein interaction sites for neighbouring CatK molecules and the GAG binding site are both necessary for the protein to oligomerize and are thus ectosteric sites required for collagen degradation (Figure 1-14).

Figure 1-14 Tetramer model for collagenase activity of CatK. The formation of a tetramer is predicted to be required for the collagenase activity of CatK and is mediated by the binding of chondroitin 4-sulfate (C4-S; orange). The protein-protein interaction site (Ectosteric Site 1) is shown in red.

Mutational and structural analyses have implicated oligomerization of CatK as critical for its collagen degradation activities (207, 232) (Figure 1-14). Ectosteric site 1 was initially discovered in the structurally related cathepsin V (CatV) through mutagenesis studies characterizing its elastase activities. This site was later identified as critical to the collagenase activity of CatK as well. Ectosteric site 1 is located on an L-domain loop spanning residues Glu84 to Pro100 and forms an interface required for protein-protein interactions (Figure 1-14 and Figure 1-15). Mutations of the residues in this region abolish complex formation and specifically inhibit the collagenase activity of CatK (207). However, the active site of the enzyme remains intact and the gelatinase and peptidase activities of the enzyme are unchanged (207). This suggests that ectosteric site 1 is not a direct binding site for collagen but is involved in the oligomerization of the enzyme.
Ectosteric site 2 is found at the interface between the L- and R-domains, below the S2 subsite and the catalytic residues, and spans residues Gly109 to Glu118 (Figure 1-14). A third ectosteric site lies at the chondroitin 4-sulfate (C4-S) binding site. C4-S contributes to the oligomerization and has been demonstrated to be required for effective collagenase activity of CatK. Alignment of the crystal structures of the protein oligomers with the free CatK monomer structures showed no significant differences, suggesting that ectosteric sites do not induce large-scale conformational changes and are not related to allosteric regulation. In addition, inhibitors targeted to these sites are selective for the collagenase activity of the enzyme and do not disrupt the active site (113, 114, 233).

Figure 1-15 Ectosteric sites in CatK. Ectosteric site 1 (orange) is located on the L-domain, spanning residues 84 to 100, and is implicated in the protein oligomerization required for collagenase activity. Another ectosteric site, the C4-S binding site (green), lies opposite the active site on the R-domain of the enzyme.

Because the ectosteric sites are implicated in the collagenase activity of CatK, inhibitors targeted to these regions are able to selectively block collagen degradation by CatK (113, 114, 233). Screening of several Chinese herbs identified the medicinal plant Salvia miltiorrhiza as a source of potent CatK collagenase inhibitors (113). Further characterization of the herb's constituents identified a family of compounds known as tanshinones (Figure 1-16). Interestingly, Salvia miltiorrhiza has been used historically in traditional Chinese medicine to treat musculoskeletal and cardiovascular diseases (115, 234, 235).

Figure 1-16 Tanshinone ectosteric collagenase inhibitors of CatK. (A) Examples of tanshinone ectosteric inhibitors of CatK which target the collagenase activity specifically (113).
(B) Predicted binding of tanshinone IIA sulfonic sodium at ectosteric site 1.

Characterization of compounds in this family revealed numerous potent collagenase inhibitors that function as ectosteric inhibitors of CatK (113). Computer-based molecular modeling predicted effective interaction of these compounds with ectosteric site 1 (113). The in vitro and in vivo potency of tanshinones suggested that other compounds targeting these ectosteric sites may also function as collagenase-specific inhibitors of CatK. Chapters 3 and 4 of this thesis discuss the development of high-throughput and computational library screening methods to identify such inhibitors.

1.5. Recent Developments in Protease Inhibitor Identification and Design

With the advent of contemporary biotechnology and bioinformatics techniques, modern protease inhibitor development programs have evolved beyond substrate modification and the isolation of endogenous protein inhibitors. Screening hundreds of thousands of compounds through virtual or experimental high-throughput screening (HTS) is now routine in the search for novel inhibitors in therapeutic drug development programs. In addition to improved screening technologies, chemical synthesis of new compounds as part of lead optimization has dramatically increased the searchable chemical space (236). For example, combinatorial synthesis using solid phase peptide synthesis systems allows the production of large and diverse chemical libraries that can be screened for specificity and potency (237).

HTS drug discovery consists of several general steps: target identification, reagent preparation, and assay development and optimization, followed by the library screening itself. The most common HTS methods include miniaturized cell-based and fluorescence signal-based assays. Cell-based assays allow molecules to be screened for a wide range of biological activities, as well as for compound toxicity, through the analysis of cellular responses.
For enzymatic targets such as proteases, HTS assays often involve detecting the catalytic conversion of a non-fluorogenic substrate into a fluorescent product, or FRET-based methods, to quickly identify compounds that block the active site activity of the enzyme (238–240).

Fluorescence polarization (FP) is a widely used technology in high-throughput screening (241–243). The assay is based on the principle that emission from a small fluorescent molecule excited by polarized light is depolarized by rotational diffusion during its fluorescence lifetime (243). Because the degree of polarization depends directly on the molecular weight of the fluorescent species, the assay can follow processes that involve a binding event or the enzymatic cleavage of a substrate.

HTS assays often go hand in hand with computational modeling for the prediction of structure-activity relationships, lead optimization, and development (244, 245). Molecular modeling studies are often used in conjunction with other structural biology methods such as X-ray crystallography, nuclear magnetic resonance (NMR), and, more recently, cryo-electron microscopy (cryo-EM) (246). Multiple rounds of computational design and experimental assays are required to refine and improve the potency and affinity of any ligand identified. The general computational drug design program is outlined in Figure 1-17.

Figure 1-17 Overview of computation-based drug design. A computational drug design program often goes through many recursive steps before a relevant lead compound is tested for clinical validation.

Structure-based virtual screening is one of the most frequently used drug design strategies due to its speed and cost-effectiveness in evaluating large compound libraries (244). It involves screening chemical libraries for effective interactions with a target binding site.
Once a therapeutically relevant target has been identified, drug development programs often start by screening small-molecule libraries to assess a wide chemical search space and predict potential ligands (236). The compound library is docked into the selected target binding site, producing a prediction of binding modes and a ranking of the docked molecules. The conformational search algorithm evaluates the binding between each compound and the target receptor and predicts the overall affinity (247). Multiple rounds of docking are often used, with higher-precision search algorithms applied after an initial set of compounds has been identified (246). The predicted affinities serve as a base criterion for the selection of promising molecules and are combined with other predicted pharmacological properties to produce a selection of compounds to be explored in biological experiments (246).

Upon the identification of potential ligands through HTS or computational screening methods, the most promising compounds are synthesized or procured and then experimentally evaluated. Depending on the type of target, experimental assays can range from high-throughput fluorescence-based screening to low-throughput cell-based efficacy assays. Furthermore, compounds with different chemical properties are selected for follow-up SAR studies (245). Drug design programs often go through multiple rounds of biological assays and computational modeling to refine the lead compounds (245). Experimental evaluation of compounds plays a significant role in the computational drug design process because it provides the information required to further refine potency and affinity (246).
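The ranking step described above can be illustrated with a minimal sketch. It assumes that each compound has already been scored by more than one docking program and that a lower (more negative) score means a better predicted affinity; the program names, compound identifiers, and scores are hypothetical placeholders, and rank averaging is only one simple way to combine such scores, not necessarily the composite method developed in this thesis.

```python
from statistics import mean


def rank_positions(scores):
    """Map each compound to its rank (1 = best) within one program.

    Assumes lower scores indicate better predicted binding. Tied scores
    receive distinct, arbitrary adjacent ranks in this simple sketch.
    """
    ordered = sorted(scores, key=scores.get)
    return {compound: position for position, compound in enumerate(ordered, start=1)}


def consensus_ranking(score_tables):
    """Average each compound's rank over all programs that scored it.

    Only compounds scored by every program are kept. Returns a list of
    (compound, mean_rank) tuples sorted from best to worst.
    """
    ranks = [rank_positions(table) for table in score_tables.values()]
    shared = set.intersection(*(set(r) for r in ranks))
    averaged = {c: mean(r[c] for r in ranks) for c in shared}
    return sorted(averaged.items(), key=lambda item: (item[1], item[0]))


# Hypothetical docking scores (illustrative values only).
scores = {
    "program_a": {"cmpd_1": -9.1, "cmpd_2": -7.4, "cmpd_3": -8.2},
    "program_b": {"cmpd_1": -6.0, "cmpd_2": -6.5, "cmpd_3": -7.9},
}

ranking = consensus_ranking(scores)
for compound, mean_rank in ranking:
    print(compound, mean_rank)
```

Averaging ranks rather than raw scores avoids having to put scoring functions with different units and ranges on a common scale, at the cost of discarding the size of the score differences.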
Examples of structure-based virtual screening for exosite inhibitors are less common than for active site-directed inhibitors, but the efficacy of the approach has been demonstrated by the discovery of exosite inhibitors of botulinum neurotoxin serotype A (BoNT/A) (248). BoNTs are neurotoxins that block neurotransmitter release and induce paralysis (249, 250). Mutagenesis studies and structural determination in complex with synaptosomal-associated protein (SNAP-25) have identified distant exosites (α- and β-exosites) that are required for substrate recognition and catalysis (251). Computational screening of the Molecular Libraries Small Molecule Repository (MLSMR) library, containing approximately 250,000 compounds, combined with molecular dynamics simulation identified 167 potential hits for in vitro evaluation (248). Experimental confirmation using a FRET-based assay with SNAP-25-based peptides identified eight active inhibitors of BoNT/A (248). Kinetic analysis of the top two inhibitors showed non-competitive inhibition, consistent with exosite inhibition, and refined docking analysis of these two inhibitors predicted effective binding at both exosites (248). As part of my thesis, I sought to develop a novel screening method to identify collagenase-specific inhibitors targeting ectosteric site 1. In Chapter 3, I used multiple docking algorithms to develop a composite docking method for structure-based virtual screening to identify novel inhibitors. This method was used to screen the NCI chemical repository, containing over 280,000 compounds, to identify ectosteric inhibitors of CatK. In Chapter 4, a high throughput fluorescence polarization assay was developed to screen and characterize multiple compound libraries to identify ectosteric inhibitors that inhibit the oligomerization of CatK required for its collagenase activity.

Computational methods have also been used to identify NSC-13345 as a small molecule allosteric inhibitor of CatK (252).
However, its structure was determined with an incompletely processed CatK containing a propeptide region in close proximity to the proposed allosteric binding site (252). Despite being characterized as an allosteric inhibitor in the literature (252), no conformational changes were observed upon inhibitor binding. Kinetic studies suggested that the compound selectively inhibited CatK cleavage of peptidic and macromolecular substrates including collagen. Because of these selective inhibitory properties, I investigated the binding of the compound to the fully processed enzyme using X-ray crystallography and site-directed mutagenesis studies and compared it to a typical ectosteric inhibitor of the tanshinone class, as described in Chapter 5.

1.6. Hypotheses and Specific Aims

The overall aim of the thesis is to provide structural and kinetic insight into the mechanisms of inhibitors of CatK and to characterize a novel class of ectosteric inhibitors that regulate the different enzymatic activities of CatK.

Hypothesis 1: Structural differences between the mouse and human CatK enzymes exist in the active site region which explain the selectivity of previously described active site-directed inhibitors for the human enzyme. (Chapter 2)

Odanacatib was synthesized as an active site-directed inhibitor of CatK and has been clinically evaluated for the treatment of osteoporosis (180). However, the compound was recently abandoned after Phase III trials due to cardiovascular-related side effects (179). The inhibitor is over 500-fold more potent towards the human enzyme than towards the mouse orthologue (253).
Therefore, understanding the exact structural elements in rodent CatKs that prevent the effective binding of hCatK-targeting inhibitors could be translated into the development of a transgenic CatK mouse for use in efficacy studies of inhibitors targeting the human enzyme and in studying the various side effects seen in clinical trials. The work in Chapter 2 describes structural and mutagenesis studies on human and mouse CatKs to elucidate the difference in inhibitory potency of the active site inhibitors, odanacatib and balicatib, on the respective enzymes.

Hypothesis 2: A computational approach will allow for the identification of substrate-specific inhibitors that target the collagenase activity-associated ectosteric site 1 in CatK. (Chapter 3)

Major efforts have been undertaken to develop potent cathepsin inhibitors for the treatment of various diseases. However, all compounds in development are active site-directed inhibitors, which completely block the activity of the enzyme. Because cathepsins are multifunctional proteases, it is likely that blocking their entire proteolytic activity will have adverse side effects, as previously seen in clinical trials. Earlier studies have demonstrated that the degradation of collagens by CatK requires the formation of protease oligomers in the presence of glycosaminoglycans. Disruption of these protein-protein and protein-glycosaminoglycan binding sites with small molecules will allow the selective inhibition of the collagenase activity of CatK without affecting its active site. Chapter 3 outlines the development of a library docking method with the aim of identifying novel scaffolds for ectosteric substrate-specific CatK inhibitors. Potential hits were subsequently characterized in in vitro collagenase assays and cell-based bone resorption assays.
Hypothesis 3: A high-throughput fluorescence polarization assay will allow for the identification of ectosteric inhibitors for CatK which selectively inhibit its collagenase activity. (Chapter 4)

Fluorescence polarization screening is widely used in chemical library screening and drug discovery programs to detect the disruption of interactions between ligands and proteins. This is usually performed by detecting binding events between a fluorescently labelled ligand and a larger molecule, such as a protein, through the measurement of the ligand's rotation in solution. This method has previously been shown to detect the formation of chondroitin 4-sulfate/CatK complexes and their disruption by polypeptides and polyamino acids (254). In Chapter 4, this technique was refined to assess two chemical libraries (Known Drugs 2 and Biomol libraries) to identify selective anti-collagenase inhibitors of CatK. Compounds active in the FP assays were then tested in an HTS peptide assay to exclude active site-targeting inhibitors, followed by a low throughput collagenase assay. Finally, the active compounds were analysed using computational molecular docking studies and cell-based osteoclast resorption assays to further assess their binding modes to the enzyme and their in vitro activity.

Hypothesis 4: Substrate selective inhibitors modulate CatK activity through binding at ectosteric sites and not via an allosteric mechanism. (Chapter 5)

NSC-13345 is a compound previously identified through a library screening effort and described as a putative allosteric regulator of CatK for a partially processed CatK molecule. The work in Chapter 5 describes X-ray crystallography and kinetic studies that elucidate the binding of NSC-13345 and T06, using the fully processed and active CatK. Their binding modes were
further characterized using mutagenesis studies and computation-based approaches. The results supported an ectosteric mode of action for both inhibitors and excluded an allosteric mechanism.

2. Structural Insights into Enzyme Mechanism and Inhibitory Differences Between hCatK and mCatK

Abstract

Cathepsin K (CatK) is the predominant mammalian bone-degrading protease and thus an ideal target for anti-osteoporotic drug development. Rodent models of osteoporosis are preferred due to their close reflection of the human disease and their ease of handling, genetic manipulation, and economic affordability. However, large differences in the potency of CatK inhibitors for the mouse/rat versus the human protease orthologues have made it impossible to use rodent models. This is even more of a problem considering that the most advanced CatK inhibitors, including odanacatib and balicatib, failed in human clinical trials due to side effects, and rodent models are not available to investigate the mechanism of these failures. Here, we elucidated the structural elements responsible for the potency differences between mouse and human CatK using odanacatib (ODN). We determined and compared the structures of inhibitor-free mouse CatK (mCatK), human CatK (hCatK) and ODN bound to hCatK. Two structural differences were identified and investigated by mutational analysis. Humanizing subsite 2 in mCatK led to a 5-fold improvement of ODN binding, whereas the replacement of Tyr61 in mCatK with Asp resulted in a variant with ODN potency comparable to that of hCatK. Combining both sites further improved the inhibition of the mCatK variant. Similar results were obtained for balicatib. These findings will allow the generation of transgenic CatK mice that will facilitate the evaluation of CatK inhibitor adverse effects and the exploration of routes to avoid them.

A version of this work has been published in the Biochemical Journal. Law, S., Andrault, P., Aguda, A., Nguyen, N., Kruglyak, N., Brayer, G., Brömme, D.
Identification of mouse cathepsin K structural elements that regulate the potency of odanacatib. Biochemical Journal, 474(5), 851-864. Feb., 2017.

2.1. Introduction

Cathepsin K (CatK) is a lysosomal cysteine protease highly expressed in osteoclasts and responsible for the bulk degradation of the collagenous bone matrix (50, 71). It is a member of the papain-like cysteine protease family (CA clan, C1 family) comprising 11 members encoded in the human genome (cathepsins B, C, F, H, K, L, O, S, V, W and X) (42). The collagenolytic activity of CatK has been reported to depend on protease dimers formed in the presence of glycosaminoglycans such as chondroitin sulphate (232). Overexpression of CatK activity increases bone resorption and highlights CatK as an attractive target for antiresorptive drug development (83, 175). Osteoporosis affects 50% of women aged 50 years or older and medical costs are estimated to be 17 to 20 billion dollars in the US (105). Various CatK inhibitors have been evaluated in clinical trials of osteoporosis but, despite showing efficacy by increasing bone mineral density and reducing fracture rates, they failed because of various side effects (112). These side effects include cardiovascular complications and skin fibrosis. The mechanisms of these side effects remain unknown and their elucidation is hindered by the absence of suitable animal models. CatK-deficient mouse models have been used to elucidate the involvement of CatK in bone resorption but do not display the typical human CatK-deficient phenotypes (112). The ovariectomy (OVX)-induced osteoporosis rodent models are widely used to study bone loss, but they are not suitable for evaluating CatK inhibitors because of the low activity of these inhibitors toward mCatK (255).
IC50 values of three CatK inhibitors that have been clinically tested but failed because of side effects show a 94-fold lower potency for mCatK when compared with the human orthologue for relacatib, a 343-fold lower potency for balicatib, and a 540-fold lower potency for odanacatib (ODN) (256, 257). Similar ratios have been reported for the rat CatK (257). Thus, it would be helpful to understand the exact structural elements in rodent CatKs that prevent the effective binding of hCatK-targeting inhibitors. This could be translated into the design of transgenic CatK mice, which would allow for the thorough evaluation of clinically effective antiresorptive hCatK inhibitors and their clinical failures.

Here, we report for the first time the inhibitor-free mouse and human CatK three-dimensional structures and the structure of a human CatK complex with odanacatib, the most thoroughly investigated CatK inhibitor to date. Two structural elements were identified as affecting the binding of ODN and balicatib to mCatK, and mCatK mutants in which these elements were humanized bound both inhibitors as well as or even better than hCatK.

2.2. Materials and Methods

Materials

Benzyloxycarbonyl-Phe-Arg-7-amido-4-methylcoumarin (Z-Phe-Arg-MCA) and benzyloxycarbonyl-Leu-Arg-7-amido-4-methylcoumarin (Z-Leu-Arg-MCA) were purchased from Bachem (Bachem Americas, Inc, Torrance, California, USA). The pan-cysteine cathepsin inhibitor, L-3-carboxy-trans-2,3-epoxypropionyl-leucylamido-(4-guanidino)butane (E64), was purchased from Sigma-Aldrich (Sigma-Aldrich Canada, Oakville, Ontario, Canada). The selective CatK inhibitors odanacatib (ODN) and balicatib were purchased from Selleck Chemicals (Houston, TX) and Tocris Bioscience (Bristol, UK), respectively.
Construction of hCatK, mCatK, and its variant vectors and expression of recombinant proteins

The cDNA for mProCatK (OriGene, Burlington, Ontario, Canada) was used to generate PCR inserts with the Phusion Green High-Fidelity DNA polymerase (Thermo-Fisher Scientific™, Waltham, Massachusetts, USA) using the non-mutagenic primers listed in Table S2-1. The PCR program used was: denaturation: 95°C, annealing: 61°C, extension: 72°C; 30 cycles. Inserts were digested with EcoRI and NotI restriction enzymes and cloned into the pPic9k expression vector (Life Technologies, Burlington, Ontario, Canada). The wild-type mProCatK vector served as a template to generate the humanized mCatK mutants by hybridization of the mutagenic primers (Table S2-1) and subsequent amplification of the full plasmid using the following PCR conditions: denaturation: 95°C, annealing: 58-63°C, extension: 72°C; 30 cycles. The pPic9k vector containing hProCatK was used for protein expression as previously described (258). All plasmids were then transformed into DH5α E. coli by the heat shock method and transformants were screened on Luria Bertani plates containing 100 µg/mL ampicillin. Sanger-sequenced plasmids (1 µg) were linearized with SacI before transformation into competent GS115 P. pastoris cells by electroporation (V = 2 kV, t = 5 ms). Transformants were screened, expressed and pepsin-activated as previously described for wild-type cathepsin K and various mutant variants (91, 258). The activity of processed CatK was tested at room temperature with Z-Phe-Arg-MCA fluorogenic substrate (5 µM) in 0.1 M sodium acetate buffer pH 5.5, 2.5 mM dithiothreitol (DTT), and 2.5 mM ethylenediaminetetraacetate (EDTA) using a Perkin Elmer LS 50B Fluorimeter (Perkin Elmer, Waltham, Massachusetts, USA) set to an excitation wavelength of 380 nm and an emission wavelength of 450 nm.
Maximum activity was usually reached after 5 days of induction. Cells were then harvested and spun down at 4500 x g for 10 minutes, and supernatants were concentrated using an Amicon 10 kDa Molecular Weight Cut-Off Ultrafiltration Membrane (EMD Millipore, Billerica, Massachusetts, USA). Concentrated and pepsin-treated supernatants were conditioned with 2 M ammonium sulfate and clarified at 15,000 x g at 4°C for 30 minutes before loading on an N-butyl Sepharose column connected to an ÄKTA purifier (GE Healthcare, Fairfield, Connecticut, USA). Protein was eluted with a linear gradient of ammonium sulfate from 2 M to 0 M in 30 min. Fractions containing the active enzyme were pooled and submitted to a buffer exchange (0.1 M sodium acetate, pH 5.5, 0.5 mM EDTA, 0.5 mM DTT) using an Amicon 10 kDa Ultra Concentrator (EMD Millipore) to eliminate remaining ammonium sulfate and further purified on an SP-Sepharose column (GE Healthcare) as previously described (258). Fractions containing the active enzyme were concentrated to 6 mg/mL using an Amicon 10 kDa Ultra Concentrator (EMD Millipore) and the purified enzyme stock was stored at -80°C.

Crystallization of inhibitor-free mouse and hCatK and odanacatib-bound hCatK

hCatK (6 mg/mL) was crystallized in 20% polyethylene glycol (PEG) 8000, 0.1 M phosphate/citrate, and 0.2 M sodium chloride, while mCatK (6 mg/mL) crystals were grown in 0.1 M sodium/potassium phosphate, pH 6.2, 25% 1,2-propanediol, and 10% glycerol. To obtain an ODN-hCatK complex, ODN was dissolved in DMSO at 200 mM concentration and added to the hCatK sample diluted to 1 mg/mL in 100 mM sodium acetate buffer, pH 5.5, containing 2.5 mM DTT and EDTA at a 25:1 inhibitor : protein ratio. The protein-inhibitor complex was then concentrated to 10 mg/mL for crystallization using a sitting drop vapor diffusion method. The optimal crystallization buffer was 0.1 M HEPES, pH 7.5, containing 0.05 M cadmium sulfate hydrate and 1.0 M sodium acetate trihydrate.
The sitting drop of 2 μL consisted of a 1:1 mixture of protein-inhibitor complex and well solution. Suitable crystals formed in 3 weeks, and 30% 2-methyl-2,4-pentanediol was added to the well before the crystals were flash-frozen in liquid nitrogen prior to data collection on beamlines 7-11 (hCatK), 9-2 (ODN-hCatK) and 12-2 (mCatK) at the Stanford Synchrotron Radiation Lightsource facility (Menlo Park, California, USA). All crystals were grown at room temperature.

Structural determination of hCatK, mCatK and odanacatib-bound hCatK enzymes

Diffraction data were collected using a Dectris Pilatus 6M detector at 100 K with an X-ray wavelength of 0.98 Å. All data sets were processed with the program iMosflm version 7.2.1 (259) and the intensities scaled with SCALA (260) to a resolution of 1.62 Å for hCatK, 2.9 Å for mCatK, and 1.4 Å for ODN-bound hCatK. Phasing was performed by molecular replacement using human wild-type structures with PDB IDs 1ATK (hCatK) and 4X6H (ODN-hCatK) in the program PHASER (261). For the mCatK structure, the search model (PDB ID: 4X6H) was modified with CHAINSAW using the mCatK sequence. The restraints for the complexed ODN were generated using ELBOW (262) and fitted into the model during refinement. All modeled structures were refined by cycles of automated refinement in PHENIX (263) and manual adjustments in COOT. The quality of the final model was evaluated using SFCHECK and PROCHECK (264) in the CCP4 program suite (265). No non-glycine residues were found in the disallowed or unfavored regions on the Ramachandran plot (260)(136). All the structures herein were illustrated using PyMOL software (PyMOL Molecular Graphics System, Version 1.8 Schrödinger, LLC).

Enzyme kinetic assays

All assays were performed at room temperature in 0.1 M sodium acetate, pH 5.5, containing 2.5 mM DTT and 2.5 mM EDTA.
The cleavage of the fluorogenic substrates, Z-Phe-Arg-MCA and Z-Leu-Arg-MCA, was recorded with a Perkin Elmer LS 50B Fluorimeter (Perkin Elmer, λex = 380 nm, λem = 450 nm). Active site titration of the protease variants was carried out using E-64 (Sigma-Aldrich) as a titrating agent according to a previously described method (136). The determination of kcat and Km constants for both substrates was performed at a constant enzyme concentration (2 nM) and increasing substrate concentrations (1-50 μM). Assays were repeated at least three times in duplicate. Data were plotted in Michaelis-Menten representation and analyzed with GraphPad Prism software (GraphPad Software, Version 5.0, La Jolla, California, USA) using nonlinear regression. Inhibition assays were carried out in the presence of increasing concentrations of ODN (0-100 nM) or balicatib (0-200 nM) and Z-Phe-Arg-MCA (5-30 µM). The enzyme (2 nM) was first pre-incubated with ODN or balicatib for 15 min at room temperature before recording activity. All measurements were repeated at least twice. For tight-binding inhibition, data were plotted in Morrison representation using GraphPad Prism software and Ki was determined according to the Morrison method (168). For non-tight-binding inhibition, data were analyzed by Dixon's graphical method (266).

Molecular docking of balicatib

The three-dimensional structure of balicatib was generated with LigPrep using the OPLS3 force field, with ionization states generated at pH 5.5 ± 2.0. Covalent docking of balicatib was performed with the Glide docking suite (267) using the ODN-bound structure (PDB ID: 5TDI) with ODN removed, Cys25 set as the reactive residue, and nucleophilic addition to a triple bond as the reaction type. Only one pose was identified, and it was examined visually to confirm the orientation and interactions with the protein. Figures were made using PyMOL software.

2.3.
Results

Comparison of the inhibitor-free hCatK and mCatK protein structures

To gain insight into the enhanced selectivity of ODN for hCatK over the mouse enzyme, we pursued an X-ray crystallographic approach to solve and compare their respective structures. Crystals of both inhibitor-free human and mouse CatKs, along with those of the ODN-bound hCatK, were grown and the resultant structures compared. Data collection and refinement statistics for these structural studies are tabulated in Table 2-1. The respective resolution cutoffs for each structure were selected based on the merging statistics outlined in Table 2-1. Notably, this work also presents the first structures of mouse and human CatKs crystallized in their uninhibited form with a free, unoxidized and unmodified active site residue Cys25. The asymmetric unit for all three enzymes contained one copy of the molecule. All three enzymes adopt the characteristic two-domain conformation of papain-like cathepsins, a left (L) domain and a right (R) domain connected by two β-strands (62).

The propeptide-free catalytic domain of mCatK shares 87% sequence identity with its human counterpart and the alignment of the sequences shows 29 non-conserved residues between the human and the mouse enzymes (Figure 2-1A). These non-conserved residues are evenly distributed between the L-domain and the R-domain of the enzyme, with a number of them located in the active site groove. The overall polypeptide chain folds of these enzymes are similar (Figure 2-1B) and superimposition of the mCatK structure on that of hCatK revealed an rmsd of 0.394 Å. The accessible surface areas calculated for the mouse and human enzymes are 9860.5 and 9938.2 Å², respectively. We did not observe any substantial conformational changes in the local folding of the enzyme, as the positions and orientations of the secondary structures remained the same for both enzymes.
Based on these results, the difference in amino acid sequence between the human and the mouse enzymes does not significantly alter the global structure of the enzyme.

Figure 2-1 Structural comparison between hCatK and mCatK. (A) Sequence alignment of hCatK and mCatK shows 86% sequence similarity between the two enzymes. Surface view of hCatK: C25 and H162 from the catalytic triad are shown in yellow and dark blue, respectively. The non-conserved residues located outside the active site in mCatK are shown in red. The S2 subsite pocket differences are shown in light blue. (B) Superposition of the hCatK (green) and mCatK (blue) crystal structures. (C) 3D structures of the S2 pockets of hCatK and mCatK; the solvent accessible surface of each enzyme is in transparent gray. C25 is shown in yellow and the three residues that are different in the S2 pocket are in cyan.

When we compared the entire active site cleft of the mouse and the human enzymes, we observed that the cleft of the mouse enzyme is markedly smaller than that of the human orthologue, with a volume of only 875.5 Å³ (947.7 Å³ for the human enzyme) (Table 2-2 and Figure S2-1). In particular, the depth of the mCatK S2 subsite was noticeably reduced when compared to that of hCatK (Figure 2-1C). This was due to shifts of the side chains of Ser134, Val160 and Met209 in the S2 subsite pocket of mCatK when compared to the counterpart residues in hCatK (Ala134, Leu160, and Leu209). In particular, Met209 and Ser134 point directly into the S2 pocket of mCatK, decreasing its accessible volume and likely affecting the binding of substrates that occupy position P2.
The distances measured between the α-carbon of Cys25 and the δ-carbon of Met209 and the γ-oxygen of Ser134 were 10.8 Å and 8.3 Å, respectively, in mCatK (data not shown). These were shorter than the corresponding distances in hCatK, measured between the α-carbon of Cys25 and the δ’ carbon of Leu209 and the β-carbon of Ala134, which were 11.5 Å and 9.3 Å, respectively. Quantification of the accessible surface areas of these three residues of mCatK showed an average decrease of 29% in the residues lining the S2 pocket (data not shown). The configuration of the catalytic triad (Cys25, His162, Asn182) is, however, very similar in both enzymes, including identical distances between the imino nitrogen of His162 and the amide oxygen of Asn182, with only a slight increase in the distance between the sulfur of Cys25 and the protonated nitrogen in the imidazole ring of His162, from 3.5 Å in hCatK to 3.8 Å in mCatK (Figure S2-2).

Figure 2-2 Chemical structures of CatK inhibitors. Chemical schematic diagrams of odanacatib, E64, NFT, and balicatib.

Characterization of odanacatib binding to hCatK

ODN (Figure 2-2) binds in the active site region of hCatK, forming a covalent linkage with the sulfur atom on the side chain of residue Cys25 (Figure 2-3). The CF3 group is directed out of the active site just above residue Tyr67. As illustrated in the Ligplot diagram in Figure 2-3C, hydrogen bonding can be seen between Gly66 and the nitrogen adjacent to the CF3 group in ODN. In addition, the adjacent ODN amide bond forms hydrogen bonds with the oxygen atom on the side chain of residue Asn161. Extensive hydrophobic interactions can be seen between most of the ODN atoms and hCatK. The strongest of these are the interactions between the two inhibitor phenyl rings and hCatK residues Gly65 to Tyr67.

Figure 2-3 Binding of ODN into the active site cleft of hCatK.
(A) Overall structure of hCatK (orange ribbon, transparent grey surface) with the boxed area highlighting the fit of ODN (yellow sticks) binding in the active site. (B) Fit of ODN drawn with the difference omit map depicted at 1.6 σ into the active site pocket with the neighboring side chains of hCatK in stick representation. (C) Ligplot diagram depicting the binding of ODN into the active site of hCatK. The ligand backbone is colored in yellow and the protein bonds are colored in brown. The covalent bond formed with C25 is in purple; hydrogen bonds are depicted as dashed green lines with the corresponding distances. Hydrophobic interactions are labeled with red dashes.

Superposition of our inhibitor-free and ODN-bound hCatK structures onto the structure of an hCatK variant containing 15 mutations (due to the patent protection of the wild-type hCatK gene) bound to the ODN analog N-(2-aminoethyl)-N2-{(1s)-1-[4'-(aminosulfonyl)biphenyl-4-Yl]-2,2,2-trifluoroethyl}-L-leucinamide (NFT; Figure 2-2; PDB ID: 1VSN) (268) resulted in rmsd values of 0.248 and 0.232 Å, respectively. These results indicate that the inhibitor-free hCatK is structurally similar to both inhibitor-bound proteins and that inhibitor binding did not induce any significant conformational change in the overall structure of this enzyme (Figure 2-4A). However, local conformational changes do occur around the active site (Figure 2-4B). Here, the protein folds slightly differently, with a small but significant opening of the left and right domains to accommodate ODN binding, which can be seen in the increased accessible surface area of the residues that line the active site, calculated in the absence of the bound ODN (Figure 2-4C). Structural changes are most obvious near the N-terminus, where the ligand-free enzyme adopts a β-sheet conformation for residues 108-112 and 209-214, in contrast to the disordered conformation seen in the ODN-bound structure (Figure S2-3).
Furthermore, as a result of ODN binding, the side chains of the hCatK active site residues are displaced and create a small opening of the enzyme lobes to accommodate the inhibitor (Figure 2-4B). The catalytic triad residues (Cys25, His162, and Asn182) of the enzyme are consequently pushed slightly apart upon ODN binding, increasing their respective distances to one another by 0.1 to 0.2 Å when compared to the unbound hCatK structure. This opening effect is most pronounced for residue Asp61, which is flipped upwards by 113° upon ODN binding. In addition, the main chain containing residues 63 to 67 is displaced to accommodate the binding of ODN. The binding position and orientation of ODN were similar to those of the related analog, NFT, previously identified (PDB ID: 1VSN) (268). To quantify the extent of the active site opening, we examined the accessible surface areas (ASA) of the active site cleft residues using the tool Protein Interfaces, Surfaces and Assemblies (PISA) (269) (Figure 2-4C). The ASA of a residue refers to the surface area accessible to a water molecule modeled as a probe with a radius of 1.4 Å. Eleven of the 15 residues lining the active site cleft showed an increase in ASA in the ODN-bound structure. In particular, residues Gln19, Gln60, and Ala134 all showed more than 30% increases in ASA upon binding of ODN. However, since the overall surface areas of the enzymes remained relatively unchanged, the enzyme reveals only local dynamic adjustments between the left and right domains upon ODN binding.

Figure 2-4 Structural analysis of the binding of ODN on hCatK. (A) Superposition of the inhibitor-free hCatK (green) and the ODN-bound hCatK (orange) structures, with ODN shown as yellow sticks. (B) Overlay of the active site region of the ODN-bound hCatK (orange) and inhibitor-free hCatK (green) depicted in stick representation. (C) Variation of the accessible surface area of the active site residues upon ODN binding calculated with PISA (in %).
(D) Comparison between ODN (yellow sticks), NFT (magenta sticks, PDB ID: 1VSN), and E64 (green sticks, PDB ID: 1ATK) binding on hCatK.

In addition, we checked the cavity volumes in the entire active site cleft using GHECOM with probe radii from 2.0 to 10 Å (270). The calculated volumes of the active site groove of the ODN-bound and the free hCatK enzymes were determined to be 1027.6 Å³ and 947.7 Å³, respectively, representing an 8.4% increase in the cleft volume (Table 2-2). This is comparable with the NFT analog (PDB ID: 1VSN), which opened up the cleft by approximately 5.4%. Notably, E64 (Figure 2-2), another covalent active site inhibitor (PDB ID: 1ATK), increased the cleft volume by approximately 20.7%, possibly due to a greater degree of hydrogen bonding with the active site residues. Superposition of the binding positions of NFT and ODN revealed that they bound in near-identical orientations (Figure 2-4D). The binding of the active site inhibitor E64 and that of ODN were also similar, and both occupy the S1 and S2 positions in the same manner.

Prediction of the binding of odanacatib on mCatK

To investigate the structural features that regulate the binding of ODN in mCatK, we compared the ODN-bound hCatK and the inhibitor-free mCatK protein structures (Figure 2-5). The amino acid composition of the hCatK S2 pocket differs at residues Ala134, Leu160, and Leu209, which are replaced by Ser134, Val160, and Met209 in mCatK. Ser134 in the S2 pocket of mCatK may have a role in determining the affinity of ODN by obstructing the binding of the P2 substituent 4-fluoroleucine. Its hydroxyl oxygen would potentially lie close to the fluorine atom of the inhibitor as positioned in the ODN-bound structure, bringing two electronegative atoms into close proximity, which is likely to be electrostatically unfavorable. Residue Asp61 in hCatK is also replaced by Tyr61 in mCatK and was of particular interest when we compared the two structures.
In the mCatK structure, the side chain of Tyr61 protrudes into the ODN binding site and would create a steric clash. In hCatK, the equivalent residue Asp61 is flipped away from this binding site. Based on these observations, we speculated that the exchange of the S2 pocket residues and Tyr61 with their human counterparts would be required for effective ODN binding to the mouse enzyme. To test this hypothesis, we engineered three humanized mCatK variants by targeted mutagenesis of the S2 cavity (Ser134Ala/Val160Leu/Met209Leu), of Tyr61 (Tyr61Asp), or of both (Ser134Ala/Val160Leu/Met209Leu/Tyr61Asp).

Figure 2-5 Putative binding of ODN in the mCatK enzyme. Alignment of the inhibitor-free mCatK (light blue) with the hCatK (orange) bound to ODN (yellow). Residue 61 and the residues lining the S2 pocket (134, 160, 209) are depicted in stick representation; the surface of the mCatK is shown in transparent gray.

Characterization of the substrate specificity of wild-type and variants of CatK, and the impact of odanacatib and balicatib inhibition

The substrate specificity of the wild-type hCatK, the wild-type mCatK, the mCatK human-like S2 variant (Ser134Ala/Val160Leu/Met209Leu), the mCatK Tyr61Asp variant, the mCatK Tyr61Asp/Ser134Ala variant, and the mCatK human-like S2 Tyr61Asp tetramutant (Ser134Ala/Val160Leu/Met209Leu/Tyr61Asp) was characterized using two fluorogenic substrates, Z-Phe-Arg-MCA and Z-Leu-Arg-MCA (Table 2-3). Compared to wild-type mCatK, wild-type hCatK showed a better S2 affinity for Phe (Km = 5.1 ± 0.8 vs. 24.2 ± 6.8 µM) and Leu (Km = 2.6 ± 0.3 vs. 30.4 ± 6.8 µM). The kcat/Km specificity constants of wild-type hCatK obtained with Z-Phe-Arg-MCA and Z-Leu-Arg-MCA were ~110- and ~94-fold higher, respectively, than the wild-type mCatK constants. The Tyr61Asp mCatK mutant displayed a similar S2 specificity for Phe (kcat/Km = 2.4 ± 0.5 vs. 1.7 ± 0.3 × 10⁴ M⁻¹ s⁻¹) and Leu (kcat/Km = 8.0 ± 0.8 vs.
6.3 ± 1.0 × 10⁴ M⁻¹ s⁻¹) compared to the wild-type mouse enzyme, which is expected since Tyr61 is not a constitutive residue of the S2 pocket. Introducing the hCatK S2 specificity into the mouse enzyme increased the Z-Phe-Arg-MCA and Z-Leu-Arg-MCA specificity constants by factors of ~102 and ~67, respectively, bringing them close to those of wild-type hCatK. Comparable kinetic results were observed when Tyr61 (Tyr61Asp) and either all three residues (Ser134Ala/Val160Leu/Met209Leu) or one residue (Ser134Ala) in the S2 subsite were exchanged for their human counterparts.

ODN potency towards the mutant mouse variants increased dramatically compared to the wild-type protease. ODN inhibited the wild-type mCatK with a Ki value of 32.7 ± 6.8 nM, which was 182 times less potent than against the human orthologue (Ki = 0.18 ± 0.06 nM) (Table 2-4). A moderate improvement of ODN inhibition was observed when the mCatK S2 subsite was exchanged with the human S2 pocket (Ki = 6.5 ± 3.6 nM). In contrast, ODN was a tight-binding inhibitor of the Tyr61Asp mCatK mutant (Ki = 0.30 ± 0.05 nM) and showed only 1.7-fold lower potency than toward wild-type hCatK (Ki = 0.18 ± 0.06 nM). The Tyr61Asp/Ser134Ala mCatK variant showed slightly higher potency than the single mutant (Ki = 0.19 ± 0.07 nM), almost equal to the Ki observed for the wild-type hCatK enzyme. Surprisingly, combining the S2 pocket and Tyr61 mutations in the mouse enzyme (Ser134Ala/Val160Leu/Met209Leu/Tyr61Asp) improved the potency of ODN by a factor of 5 (Ki = 0.037 ± 0.001 nM) compared to hCatK.

We also performed the inhibition assays with balicatib (N-[1-(cyanomethylcarbamoyl)cyclohexyl]-4-(4-propylpiperazin-1-yl)benzamide) (Figure 2-2), a clinically tested CatK inhibitor structurally related to ODN. Balicatib displayed a potency similar to that of ODN against wild-type hCatK (Ki = 0.24 ± 0.03 nM) and wild-type mCatK (Ki = 30 ± 4.2 nM) (Table 2-4).
The potency of balicatib was ~18 and ~5 times lower against the Tyr61Asp and the human S2 mCatK variants, respectively, than against the WT hCatK. Combining Tyr61Asp with either Ser134Ala or all three S2 mutations in mCatK restored an hCatK-like balicatib potency (Ki = 0.56 ± 0.15 and 0.47 ± 0.28 nM, respectively).

2.4. Discussion

Recently, Merck & Co. decided to discontinue the development of ODN as a novel anti-osteoporotic drug. The main concern was a slightly increased risk of stroke in the treatment arm of a phase III clinical trial comprising more than 16,000 postmenopausal women (179). Similarly, Novartis dropped the development of its CatK inhibitor, balicatib, several years ago in a phase II clinical trial because of skin-fibrotic adverse events (271). Other compounds such as ONO3445 were withdrawn for marketing reasons after an efficacious phase II trial (181). However, published reports revealed adverse effects similar to those of ODN, indicating that this compound may also have faced regulatory concerns (112, 181, 272). This raises the question of the mechanism of CatK inhibitor-induced side effects. Rodent models of osteoporosis, mouse models in particular, are well-characterized, economical, and genetically accessible systems in which to study side effects. Unfortunately, neither ODN, balicatib, nor a third failed inhibitor (relacatib from GlaxoSmithKline) is suitable for evaluation in mice, as their potency is reduced by more than two orders of magnitude compared to hCatK (271). Humanized mouse models can overcome this problem, but standard procedures such as replacing the entire mouse gene with the human gene or inserting a cDNA minigene risk altering the expression rates and sites of the target gene. Identifying the limited number of amino acids required for the desired specificity and replacing them using novel cloning strategies such as CRISPR/Cas9 might be preferable.
The generation of appropriately “humanized” mouse models can better evaluate drug candidates designed against human targets and might identify potential adverse effects earlier in human drug trials (273, 274). Underscoring the importance of these models, a high level of toxicity of an experimental hepatitis B drug was revealed in a humanized liver mouse model (275). In contrast, previous studies of the drug showed no toxicity in wild-type mouse, dog, and monkey experiments, yet the drug was lethal to six out of 15 patients in a phase II study (276).

For this reason, we investigated the structural features responsible for the weak binding affinity of hCatK inhibitors in mCatK. Previous reports demonstrated that the S2 pocket of hCatK is a major determinant of substrate specificity and of inhibitor selectivity and potency (61). As seen in our crystallographic analysis, the accessible surface of the S2 pocket of the mCatK is reduced compared to the hCatK subsite, which would explain the lower binding affinity of the mCatK S2 pocket both for dipeptide substrates containing either Phe or Leu and for the ODN substituent (4-fluoroleucine) in the P2 position. This is reflected by the differences in substrate specificity between the hCatK and the mCatK (Table 2-3). The wild-type mouse enzyme displayed a low affinity (high Km values) and a low turnover rate (low kcat values) against both the Z-Phe-Arg-MCA and Z-Leu-Arg-MCA substrates. This suggests that the overall equilibrium between all reaction species (substrate, enzyme, enzyme-substrate complex, and product) is governed by low association (k1) and catalytic (kcat) constants and a high dissociation constant (k-1). In other words, the complexes between mCatK and both substrates are more prone to dissociate than to form product compared to the hCatK situation. Notably, exchanging the S2 subsite of the mCatK with the human S2 pocket improved both catalytic efficiency and inhibition.
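The relationship between the rate constants named above and the measured parameters can be written out explicitly. The following is a sketch that assumes the standard single-intermediate Michaelis-Menten scheme under steady-state conditions; that scheme is an assumption of this illustration and is not stated as such in the text.

```latex
% Assumed scheme: one enzyme-substrate intermediate, steady state.
E + S \;\underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}}\; ES
      \;\xrightarrow{\;k_{\mathrm{cat}}\;}\; E + P

% Michaelis constant: a low k_1 or a high k_{-1} raises K_m.
K_m = \frac{k_{-1} + k_{\mathrm{cat}}}{k_{1}}

% Specificity constant: a low k_1, a low k_cat, or a high k_{-1} all lower it.
\frac{k_{\mathrm{cat}}}{K_m} = \frac{k_{1}\,k_{\mathrm{cat}}}{k_{-1} + k_{\mathrm{cat}}}
```

Under this scheme, the combination of high Km and low kcat observed for wild-type mCatK is consistent with an enzyme-substrate complex that dissociates more readily than it turns over.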
The kinetic parameters (kcat, Km, kcat/Km) for this variant and the wild-type hCatK are closely comparable; however, ODN was still approximately 35-fold less potent against the mCatK human-like S2 variant than against the wild-type hCatK. This suggested that the S2 cavity is not the only factor that governs the binding of ODN to the mouse enzyme. The alignment of the inhibitor-free mCatK and the ODN-bound hCatK structures revealed a steric hindrance by the mCatK Tyr61 residue of the ODN interaction with the left domain of the enzyme (Figure 2-5). ODN showed a 93-fold improved potency against the Tyr61Asp mCatK variant, against which it was only 1.7 times less effective than against the human enzyme. The effect on the specificity constant, kcat/Km, was negligible compared with wild-type mCatK. Thus, we demonstrated that Tyr61 plays a critical role in the potency of ODN against the mCatK. Remarkably, when both the S2 subsite and Tyr61 were replaced with the analogous amino acid residues present in hCatK, the Ki value for ODN was 5 times lower than for hCatK. It appears that the S2 subsite and Tyr61 effects are additive in their overall effect on ODN binding to CatK.

Balicatib, another clinically tested CatK inhibitor, likely has a binding mode similar to that of ODN, as shown in the 3-D structure of a complex between hCatK and a balicatib derivative, (1R,2R)-N-(1-cyanocyclopropyl)-2-{[4-(4-fluorophenyl)piperazin-1-yl]carbonyl}cyclohexanecarboxamide (PDB ID 4DMX) (277). As with ODN, the binding of balicatib improved in the Tyr61Asp mutant, and the combined S2 and Tyr61Asp mCatK variant displayed a Ki only 2-fold higher than that of the wild-type hCatK. In contrast to ODN, the Tyr61 mutation alone was not as efficient at restoring balicatib activity, and the Ki was about 18-fold higher than for hCatK. The binding of balicatib likely depends on both Tyr61 and the S2 pocket.
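The additivity argument made above for ODN can be checked with a short calculation. The sketch below converts the mean Ki values for ODN from Table 2-4 into free energy differences relative to wild-type mCatK and compares the sum of the two single-site effects with the combined variant. The temperature of 298.15 K and the treatment of Ki as an apparent dissociation constant are assumptions of this illustration, not statements from the text.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal mol^-1 K^-1
T_KELVIN = 298.15   # ASSUMPTION: temperature is not taken from the text

# Mean Ki values for ODN in nM, as reported in Table 2-4.
KI_ODN_NM = {
    "WT mCatK": 32.7,
    "human-like S2": 6.5,
    "Tyr61Asp": 0.30,
    "human-like S2 + Tyr61Asp": 0.037,
}


def ddg_kcal(ki_variant_nm, ki_reference_nm, temperature=T_KELVIN):
    """Free energy change relative to the reference (negative = tighter binding)."""
    return R_KCAL * temperature * math.log(ki_variant_nm / ki_reference_nm)


reference = KI_ODN_NM["WT mCatK"]
ddg_s2 = ddg_kcal(KI_ODN_NM["human-like S2"], reference)
ddg_y61d = ddg_kcal(KI_ODN_NM["Tyr61Asp"], reference)
ddg_both = ddg_kcal(KI_ODN_NM["human-like S2 + Tyr61Asp"], reference)

print(f"S2 swap alone:       {ddg_s2:.2f} kcal/mol")
print(f"Tyr61Asp alone:      {ddg_y61d:.2f} kcal/mol")
print(f"Sum of single sites: {ddg_s2 + ddg_y61d:.2f} kcal/mol")
print(f"Combined variant:    {ddg_both:.2f} kcal/mol")
```

If the two sites contribute independently, the sum of the single-site values should be close to the value for the combined variant. Because the calculation uses mean values only, it does not propagate the reported standard deviations.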
In addition, molecular docking of balicatib to hCatK suggests that the second piperazine ring bends away from residue 61, which may explain why less improvement was observed, since the compound can be accommodated more easily (Figure S2-4).

It should be noted that rabbit CatK also has the Tyr61 residue but still showed IC50 values only 5-fold and 4.5-fold higher for ODN and balicatib, respectively, than hCatK (154, 180). The rabbit CatK shares 94% sequence homology with the human enzyme, compared to 86% and 88% for the mouse and the rat enzymes, respectively. The S2 subsite pocket in the rabbit enzyme differs from the hCatK S2 subsite only by Leu160Val and would therefore better accommodate ODN binding in the active site groove. Since no structure of the wild-type rabbit CatK is available, we hypothesize that Tyr61 in the rabbit enzyme does not adopt the same orientation seen in mCatK, which would explain why CatK inhibitors are more efficient against the rabbit CatK. The dramatic differences in inhibitor potencies for ODN and related inhibitors between CatK orthologues might be a rather unique situation. However, similar issues may arise for selected inhibitors of other cathepsins as well, considering that the average sequence identities between human and mouse cathepsins are between 69% (CatW) and 87% (CatK, F, and X). Distinct sequence differences between the two species at inhibitor binding sites may lead to marked differences in their susceptibility toward inhibitors, but such differences have not yet been reported for other cathepsins. This should also be considered when drugs are evaluated for other proteases in murine disease models.

In conclusion, we identified the S2 pocket, in particular Ser134, and Tyr61 of the mCatK as the structural motifs that mediate its poor inhibition by ODN and likely by balicatib as well.
Tyr61, located in the left domain, seems to be the main obstacle to the interaction of ODN with the mouse enzyme, while the S2 pocket alone has only a mild impact. Importantly, this information can be used to design an appropriate transgenic mouse by introducing one or two mutation sites that would fully reproduce the high potency of ODN or balicatib observed for hCatK. Such a mouse model would be highly beneficial for studying the side effects observed in human trials of both compounds and for developing strategies to circumvent them.

Table 2-1: Data collection and refinement statistics of crystal structures

Data Collection | Human CatK (5TUN) | Mouse CatK (5T6U) | ODN-bound hCatK (5TDI)
Space group | P 2₁ 2₁ 2₁ | P 2₁ 2 2₁ | P 2₁ 2₁ 2₁
Unit cell dimensions (Å) | a=32.08 b=66.64 c=93.37 α=β=γ=90° | a=37.92 b=43.26 c=133.97 α=β=γ=90° | a=45.25 b=54.62 c=80.30 α=β=γ=90°
Number of total reflections | 121131 (16726) | 27358 (4047) | 213992 (31844)
Number of unique reflections | 26167 (3744) | 5253 (492) | 39625 (5726)
Mean I/σI | 6.2 (2.5) | 10.66 (4.97) | 8.1 (3.7)
Redundancy | 4.6 (4.5) | 5.4 (5.8) | 5.4 (5.6)
Merging R factor (%) | 19.6 (58.2) | 12.2 (32.0) | 13.2 (37.0)
Maximum resolution (Å) | 1.62 | 2.90 | 1.40
Refinement | | |
Resolution range (Å) | 27.12 – 1.62 | 43.26 – 2.90 | 45.16 – 1.40
Completeness (%) | 99.7 (99.5) | 99.53 (99.80) | 99.0 (96.20)
Number of protein atoms | 1663 | 1644 | 1659
Number of differing residues from hCatK / ligand atoms | 0/0 | 29/0 | 0/36
Number of water atoms | 392 | 22 | 278
B factors (Å²) (all/protein/ligand/solvent) | 12.62/9.60/0/25.67 | 28.50/28.30/0/29.30 | 15.22/13.25/9.49/27.72
R factor (%) | 15.19 | 20.50 | 14.27
R free (%) | 19.45 | 22.06 | 17.05
Rms deviations | | |
Bond lengths (Å) | 0.006 | 0.002 | 0.007
Bond angles (°) | 0.925 | 0.611 | 1.06

Values in parentheses refer to the respective highest resolution shell: hCatK (1.71 – 1.62 Å), mCatK (3.06 – 2.90 Å), and ODN-bound hCatK (1.48 – 1.40 Å).

Table 2-2: Total surface areas and active site cleft volumes for uninhibited human and mouse CatKs, and ODN-, E64-, and NFT-bound human CatK enzymes.

Structure (PDB ID) | Total Accessible Surface Area (Å²)¹ | Active Site Cleft Volume (Å³)
Uninhibited hCatK (5TUN) | 9995.2 | 947.7
Uninhibited mCatK (5T6U) | 9860.5 | 875.5
ODN-bound hCatK (5TDI) | 10115.6 | 1027.6
E64-bound hCatK (1ATK) | 10036.4 | 1144.3
NFT-bound hCatK (1VSN) | 9762.5 | 1002.0

¹ Total accessible surface areas were calculated using PyMOL, and the active site cleft volumes were calculated using the GHECOM cavity detection server.

Table 2-3: Determination of the kinetic constants of the wild-type mouse and human CatK, and the mCatK variants.

Parameter | wild-type hCatK | wild-type mCatK | mCatK human-like S2 | mCatK Tyr61Asp | mCatK Tyr61Asp Ser134Ala | mCatK human-like S2 Tyr61Asp
Z-Phe-Arg-MCA | | | | | |
kcat (s⁻¹) | 13.6 ± 1.7 | 0.6 ± 0.1 | 16.7 ± 3.5 | 0.7 ± 0.2 | 15.7 ± 0.6 | 15.0 ± 2.4
Km (µM) | 5.1 ± 0.8 | 24.2 ± 6.8 | 8.0 ± 2.9 | 44.8 ± 22.3 | 7.4 ± 1.0 | 6.4 ± 2.9
kcat/Km (10⁴ M⁻¹ s⁻¹) | 268 ± 8 | 2.4 ± 0.5 | 216 ± 34 | 1.7 ± 0.3 | 213 ± 64 | 254 ± 65
Z-Leu-Arg-MCA | | | | | |
kcat (s⁻¹) | 19.7 ± 1.2 | 2.4 ± 0.6 | 24.8 ± 4.6 | 4.6 ± 1.3 | 14.5 ± 1.3 | 20 ± 2.0
Km (µM) | 2.6 ± 0.3 | 30.4 ± 6.8 | 4.3 ± 1.4 | 76 ± 29 | 3.6 ± 0.2 | 2.4 ± 0.3
kcat/Km (10⁴ M⁻¹ s⁻¹) | 753 ± 61 | 8.0 ± 0.7 | 600 ± 110 | 6.3 ± 1.0 | 406 ± 27 | 828 ± 64

The activities of the wild-type human and mouse CatK, the mouse human-like S2 mutant (Ser134Ala/Val160Leu/Met209Leu), the mouse Tyr61Asp mutant, and the mouse human-like S2 Tyr61Asp (Ser134Ala/Val160Leu/Met209Leu/Tyr61Asp) tetramutant were tested against increasing concentrations of Z-Phe-Arg-MCA and Z-Leu-Arg-MCA. Kinetic constants were calculated using the Michaelis-Menten representation in GraphPad Prism (GraphPad Software). Data are represented as mean ± SD (n = 3).
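As a worked example of how the specificity constants in Table 2-3 relate to kcat and Km, the sketch below recomputes kcat/Km from the tabulated mean values for the two wild-type enzymes and derives the human-to-mouse ratio. It uses means only, so the results are expected to differ slightly from the reported constants, which presumably come from fits to the individual replicates. The dictionary layout and function names are illustrative choices, not part of the original analysis.

```python
# Mean values from Table 2-3: (kcat in s^-1, Km in µM).
TABLE_2_3_MEANS = {
    "Z-Phe-Arg-MCA": {"hCatK": (13.6, 5.1), "mCatK": (0.6, 24.2)},
    "Z-Leu-Arg-MCA": {"hCatK": (19.7, 2.6), "mCatK": (2.4, 30.4)},
}


def specificity_constant(kcat_per_s, km_micromolar):
    """Return kcat/Km in M^-1 s^-1 (Km is converted from µM to M)."""
    return kcat_per_s / (km_micromolar * 1e-6)


def fold_human_over_mouse(substrate):
    """Ratio of the hCatK to the mCatK specificity constant for one substrate."""
    human = specificity_constant(*TABLE_2_3_MEANS[substrate]["hCatK"])
    mouse = specificity_constant(*TABLE_2_3_MEANS[substrate]["mCatK"])
    return human / mouse


for substrate in TABLE_2_3_MEANS:
    human = specificity_constant(*TABLE_2_3_MEANS[substrate]["hCatK"])
    print(f"{substrate}: hCatK kcat/Km = {human:.3g} M^-1 s^-1, "
          f"hCatK/mCatK ratio = {fold_human_over_mouse(substrate):.0f}")
```

The ratios obtained this way fall close to the roughly 110-fold and 94-fold differences quoted in the Results for the two substrates.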
Table 2-4: Inhibition constants (Ki) of ODN and balicatib against mCatK, hCatK, and the mCatK variants

Enzyme | Ki (nM) ODN | Fold difference to hCatK | Ki (nM) balicatib | Fold difference to hCatK
WT hCatK | 0.18 ± 0.06 | 1 | 0.24 ± 0.03 | 1
WT mCatK | 32.7 ± 6.8 | 182 | 30 ± 4.2 | 125
mCatK human-like S2 | 6.50 ± 3.65 | 36 | 1.25 ± 0.07 | 5.2
mCatK Tyr61Asp | 0.30 ± 0.05 | 1.7 | 4.35 ± 0.9 | 18
mCatK Tyr61Asp Ser134Ala | 0.19 ± 0.07 | 1.1 | 0.56 ± 0.15 | 2.3
mCatK human-like S2 Tyr61Asp | 0.037 ± 0.001 | 0.2 | 0.47 ± 0.28 | 2

Ki constants corresponding to tight-binding inhibition were calculated with GraphPad Prism (GraphPad Software) using the Morrison representation, and the graphical method of Dixon was used to calculate Ki values corresponding to non-tight-binding inhibition. Data are represented as mean ± SD (n = 3).

Supplementary Information (Chapter 2)

Table S2-1: Sequences of Non-Mutagenic and Mutagenic Primers Used

Primer Name | Sequence
EcoRI mCatK Forward | 5' - AAAGAATTCCTGTCTCCGGAGGAAATGCTG - 3'
NotI mCatK Reverse | 5' - AAAGCGGCCGCTCACATCTTGGGGAAGCTGGC - 3'
Y61D Forward | 5' - GTGGACTGTGTGACTGAGAATGATGGCTG - 3'
Y61D Reverse | 5' - AAGATTCTGGGGACTCAGAGCTAAGAG - 3'
S134A Forward | 5' - AGCTTGCATCGATGGCCACA - 3'
S134A Reverse | 5' - TGTGGCCATCGATGCAAGCT - 3'
V160L Forward | 5' - GTGACCGTGATAATCTGAACCATGCAGTGTTGGTG - 3'
V160L Reverse | 5' - CACCAACACTGCATGGTTCAGATTATCACGGTCAC - 3'
M209L Forward | 5' - GGCATTACCAACTTGGCCAGCTTCCCCAAGATG - 3'
M209L Reverse | 5' - CATCTTGGGGAAGCTGGCCAAGTTGGTAATGCC - 3'

Figure S2-1. The active site cleft volume calculated with GHECOM using 2.0 to 10.0 Å spherical probes, represented by the colored spheres filling the active site cavity. hCatK (magenta, on the left) showed a larger active site volume than mCatK (green, on the right) (947.7 Å³ vs. 875.5 Å³).

Figure S2-2: Atomic distances between the constitutive residues of the catalytic triad of hCatK and mCatK.
The distances between the sulfur of Cys25 (yellow sticks) and the protonated nitrogen in the imidazole ring of His162 (blue sticks), and between the imino nitrogen of His162 and the oxygen of the amide group of Asn182 (green sticks), of hCatK (green ribbon) and mCatK (blue ribbon) were measured with PyMOL.

Figure S2-3: Local structural conformation change upon ODN binding on hCatK. Cartoon (left) and stick (right) depictions of the local conformation differences near the N-terminus between the unbound (green) and ODN-bound (orange) hCatK enzymes.

Figure S2-4. Molecular docking of balicatib and the hCatK enzyme. Balicatib (magenta sticks) was docked to hCatK (orange) extracted from the ODN-bound structure using Glide, and the final result was aligned with the inhibitor-free mCatK (light blue). Residue 61 and the residues lining the S2 pocket (134, 160, 209) are depicted in stick representation; the surface of the mCatK is shown in transparent gray.

3. A Composite Docking Approach for the Identification and Characterization of Ectosteric Collagenase Inhibitors of Cathepsin K

Abstract

Cathepsin K (CatK) is a cysteine protease that plays an important role in mammalian intra- and extracellular protein turnover and is known for its unique and potent collagenase activity. Through studies on the mechanism of its collagenase activity, selective ectosteric sites were identified that are remote from the active site. Inhibitors targeting these ectosteric sites are collagenase selective and do not interfere with other proteolytic activities of the enzyme. Potential ectosteric inhibitors were identified using a computational approach to screen both the druggable subset and the entire 281,987-compound Chemical Repository library of the National Cancer Institute Developmental Therapeutics Program (NCI-DTP). Compounds were scored based on their affinity for the ectosteric site.
Here we compared the scores of three individual molecular docking methods with a composite score of all three methods together. The composite docking method was up to five-fold more effective at identifying potent collagenase inhibitors (IC50 < 20 µM) than the individual methods. Of the 160 top compounds tested in enzymatic assays, 28 blocked the collagenase activity of CatK at 100 µM. Two compounds exhibited IC50 values below 5 µM, corresponding to a molar protease:inhibitor ratio of <1:12. Both compounds were subsequently tested in osteoclast bone resorption assays, where the most potent inhibitor, 10-[2-[bis(2-hydroxyethyl)amino]ethyl]-7,8-diethylbenzo[g]pteridine-2,4-dione (NSC-374902), inhibited bone resorption with an IC50 value of approximately 300 nM and showed no cell toxicity.

A version of this work has been published in PLoS One. Law, S., Panwar, P., Li, J., Aguda, A., Jamroz, A., Guido, R., Brömme, D. A composite docking approach for the identification and characterization of ectosteric inhibitors of cathepsin K. PLoS One. 12(10): e0186869. Oct., 2017.

3.1. Introduction

Thiol-dependent cathepsins are found in all life forms and have a vital role in mammalian intra- and extracellular protein turnover (278). They are members of the papain-like family (CA clan, C1 family) and comprise 11 proteases encoded in the human genome (cathepsins B, C, F, H, K, L, O, S, V, W, and X). In particular, cathepsins K, S, and V are potent elastases, with cathepsin K (CatK) also being a highly effective and unique collagenase capable of cleaving at multiple sites within triple helical collagens (47, 279, 280). These proteases have been implicated and targeted in various cardiovascular and musculoskeletal diseases (34, 281, 282). Major efforts have been undertaken to develop potent cathepsin inhibitors (268, 277, 283).
However, all compounds in development are active site-directed inhibitors, which completely block the activity of the enzyme. Because cathepsins are multifunctional proteases, it is likely that blocking their entire proteolytic activity will have unwanted side effects (112). This may explain in part the failure of clinical trials of CatK inhibitors for the treatment of osteoporosis. Patients experienced scleroderma-like phenotypes and showed increased risks of cardiovascular events such as stroke despite excellent bone-preservation outcomes (112, 271).

Our previous studies have demonstrated that the degradation of extracellular matrix (ECM) proteins such as collagens and elastin requires specific exosite binding sites. In the case of CatK-mediated collagen degradation, these sites are needed for the formation of protease oligomers in the presence of glycosaminoglycans (207, 230). Blocking protein-protein, protein-glycosaminoglycan, or specific substrate binding sites with small molecules will allow the selective inhibition of the collagenase and elastase activities of cathepsins without affecting their active site and thus the hydrolysis of non-ECM substrates. We termed these sites ectosteric sites to differentiate them from allosteric sites, as they do not affect the catalytic site upon inhibitor binding. Ectosteric inhibitors targeting these sites will thus represent substrate-specific inhibitors, which can selectively block the disease-relevant activities of cathepsins. We have recently demonstrated that the selective inhibition of the enzyme’s collagenase activity in osteoclast bone-resorption assays and in an osteoporosis mouse model can be achieved without blocking its TGF-β1 degrading activity, which is correlated with some of the side effects seen in CatK inhibitor clinical trials (114, 284).

In this study, we adopted the library docking method with the aim of identifying novel scaffolds for ectosteric substrate-specific CatK inhibitors.
Potential inhibitors of CatK-mediated collagen degradation were identified using a computational approach involving multiple docking algorithms. We identified four common chemical scaffolds and several other compounds that may be used as a starting point for further development. Of 160 compounds identified from the NCI-DTP repository and tested in enzymatic assays, 28 effectively blocked the collagenase activity without disrupting the active site activity of CatK. Two of these compounds were active at about a 12-fold molar excess over CatK and showed potent antiresorptive activity in osteoclast bone degradation assays.

3.2. Materials and Methods

Molecular docking of NCI/DTP chemical repository library to ectosteric site 1

Chemical structure data of the NCI/DTP Chemical Repository were downloaded from PubChem for molecular docking analysis. The preliminary study subset of compounds was selected using BioActive and rule of 5 filters, leaving 14,045 compounds. The appropriate three-dimensional structures were generated using LigPrep with the OPLS3 force field, and ionization states were generated at pH 5.5 (Schrödinger Inc.) (285). Geometric rotamers generated for each compound were limited to twelve and three per ligand for the preliminary and complete library studies, respectively, and were exported as SDF files. The enzyme molecule used for docking was an inhibitor-bound CatK (PDB ID: 1ATK) with the inhibitor and heteroatoms removed; an inhibitor-free CatK structure was unavailable at the time. Additional processing of the enzyme molecule was performed in the respective programs prior to docking.

Surflex docking

Docking and similarity calculations were carried out using standard protocols with Surflex-Dock Geometric (SFXC) as part of the Sybyl-X Suite (286). The protein structure was prepared using the Sybyl-X protein preparation wizard, with hydrogen atoms added and side-chain rotamers corrected.
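The rule of 5 pre-filter mentioned in the library preparation above can be illustrated with a small, self-contained sketch. It applies the commonly cited Lipinski thresholds (molecular weight ≤ 500 Da, logP ≤ 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors) to precomputed descriptors. The `Descriptors` class, the example compounds, and the `max_violations` setting are hypothetical; the exact filter settings used for the NCI/DTP subset are not given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed physicochemical descriptors for one compound (hypothetical container)."""
    name: str
    mol_weight: float        # Da
    logp: float              # calculated octanol/water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d):
    """Count how many of the four Lipinski criteria a compound violates."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(d, max_violations=0):
    """ASSUMPTION: strict filtering (no violations allowed) unless stated otherwise."""
    return rule_of_five_violations(d) <= max_violations


# Hypothetical example entries, not compounds from the NCI/DTP library.
library = [
    Descriptors("hypothetical-small", 339.4, 1.2, 3, 7),
    Descriptors("hypothetical-large", 812.0, 6.3, 7, 14),
]
kept = [d.name for d in library if passes_rule_of_five(d)]
print(kept)
```

Some implementations tolerate a single violation; passing `max_violations=1` would reproduce that looser variant.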
We generated the protomol used for docking in residue mode (Ser95) with a threshold of 0.5 and a bloat of 4. The prepared ligand sets were then docked to the protomol using the default settings, with protein hydrogen movement and CScore calculations enabled. Poses were sorted by their CScore and exported for visualization in PyMOL. For the re-evaluation of the docking of compounds 1 and 3, the maximum number of conformations per fragment was increased from 20 to 50 and the search density was increased ten-fold, from 6 to 60.

Glide docking

Glide docking was performed using the Glide docking suite (Schrödinger Inc.) with the previously prepared ligand set (267). The protein was prepared using the protein preparation wizard with the standard configuration, in which hydrogens were added and the overall structure was refined by energy minimization. The receptor grid was generated using Ser95 as the centroid of the binding site, with a binding box of 10 Å in the putative ectosteric site. No additional constraints were specified, and the settings were left at their default values. The prepared ligands were docked to the aforementioned receptor grid using SP (standard precision) mode. Ligand sampling was flexible, and other settings were left at their default values. Post-docking minimization was performed, and a maximum of five poses was written for each compound. Poses were sorted by their Glide score and exported for visualization in PyMOL. For the re-evaluation of the docking of compounds 1 and 3, extra precision docking (XP mode) was used with a maximum of 100 poses for post-docking minimization. Physicochemical properties, including cLogP and cell membrane permeabilities, were determined for compounds 1 and 3 using QikProp (Schrödinger Inc.).

GOLD docking

GOLD docking was performed using the GOLD Suite (Hermes 1.7.0) with the prepared ligand set (287).
Protein preparation was performed in GOLD, where hydrogens were added and side-chain rotatable bonds were fixed. The binding site was defined using Ser95 as the center with a surrounding box of 10 Å. Cavity detection and docking were performed using the default GOLD settings with ligand flexibility and rescoring enabled. Poses were ranked by their CHEMPLP scores and exported for visualization in PyMOL. For the re-evaluation of compounds 1 and 3, search efficiency was increased from 25% to 200% and all solutions were kept for evaluation.

Collagenase and gelatinase assays

The collagenase inhibitory activities of all identified compounds were measured using a collagen degradation assay. Soluble bovine type I collagen (0.6 mg/mL) was incubated with 400 nM recombinant human CatK, in the presence or absence of 200 nM chondroitin 4-sulfate, in 100 mM sodium acetate buffer, pH 5.5, containing 2.5 mM dithiothreitol (DTT) (Sigma-Aldrich Canada, Oakville, Ontario, Canada) and 2.5 mM ethylenediaminetetraacetate (EDTA) (Sigma-Aldrich). In the experiments involving detergent to minimize inhibitor aggregation, 0.001% or 0.005% Triton X-100 was added to the buffer with the inhibitor. The recombinant human CatK was expressed in Pichia pastoris and purified as previously described (258). Selected compounds identified by molecular docking were ordered from the National Cancer Institute through the Developmental Therapeutics Program (Rockville, MD). All compounds were dissolved in DMSO as 20 or 10 mM stocks. To minimize the effect of solvent on the activity of CatK, all reactions were kept below 1% DMSO, where no solvent effect was observed. After incubation at 28 °C for 4 hours, 1 μM E-64 was added to block the residual activity of CatK.

Digestion with MMP-1 was performed in 50 mM Tris-HCl, pH 7.4, containing 200 mM NaCl and 5 mM CaCl₂ at 28 °C for 4 h. Reactions were stopped by adding 25 mM EDTA as previously described (288).
Samples were analyzed on 10% SDS-PAGE gels and stained with Coomassie. The resulting bands of α-collagen chains were quantified using ImageJ (Version 1.5) (289), and IC50 graphs were plotted using GraphPad Prism (GraphPad Software, Version 5.0, La Jolla, California, USA).

Gelatin (0.6 mg/mL) degradation assays were performed in the same manner as the collagen degradation experiments, with 10 nM enzyme, and were incubated at 37 °C for one hour at the appropriate pH values for the individual proteases (CatK, pH 5.5; MMP-1, pH 7.4; trypsin, pH 8.8).

Z-Phe-Arg-MCA cleavage assays

Potential active site inhibition by the compounds was evaluated using the fluorogenic substrate benzyloxycarbonyl-Phe-Arg-7-amido-4-methylcoumarin (Z-Phe-Arg-MCA) (Bachem Americas, Inc., Torrance, California, USA) as previously described (50). The enzymatic activity of CatK was monitored by measuring the rate of release of the fluorogenic group, amino-methyl coumarin, at an excitation wavelength of 380 nm and an emission wavelength of 450 nm using a Molecular Devices SpectraMax Gemini spectrofluorometer. Inhibitors were added prior to measurement of enzyme activity. The assays were performed at 25 °C at a fixed enzyme concentration (5 nM) and substrate concentration (5 μM) in 100 mM sodium acetate buffer, pH 5.5, containing 2.5 mM DTT and 2.5 mM EDTA. Z-Phe-Arg-MCA hydrolysis by trypsin was carried out in 50 mM Tris-HCl, pH 8.8, at 10 nM enzyme concentration.

Human osteoclast cultures and bone resorption analysis

Osteoclasts were generated from mononuclear cells isolated from human bone marrow tissue (Lonza, Walkersville, MD). The bone marrow cells were centrifuged at 400 g for 5 min, and the pellet was re-suspended in 10 mL α-MEM (α-Minimal Essential Media) and layered on 10 mL Ficoll-Paque media solution. After centrifugation at 500 g for 30 min, the white interface containing the monocytes was harvested and washed twice with α-MEM.
Cells were cultured in α-MEM containing 10% FBS and 25 ng/mL M-CSF for 24 hours and then cultured in 25 ng/mL RANKL and 25 ng/mL M-CSF for 7 days. Differentiated osteoclasts (100,000 cells per slice) were then seeded on each bone slice (5.5 mm diameter and 0.4 mm thickness; self-made) in the presence or absence of inhibitors and incubated for 72 hours at 5% CO₂ and 37 °C with the DMSO concentration at 0.1%. The inhibitor concentrations tested ranged from 50 nM to 3 µM. To compare the effects of the compounds on cell survival, the metabolic activity of osteoclasts was determined using the CellTiter-Blue Viability Assay (Promega, Madison, WI, USA). Bone slices from each condition (inhibitor-treated and control groups) were fixed in 4% formaldehyde and subsequently stained for tartrate-resistant acid phosphatase (TRAcP) activity (Acid Phosphatase, Leukocyte (TRAP) Kit; Sigma-Aldrich). Aliquots of cell culture media were used to determine the CTx-1 concentration (MyBiosource ELISA kit, San Diego, CA). CTx-1 is a CatK-specific C-terminal cleavage product of triple helical type I collagen. The total number of osteoclasts per bone slice was determined after TRAcP staining. Cells with ≥2 nuclei were counted as osteoclasts. After 72 h, bone slices from each condition were incubated in filtered water for cell lysis, and cells were removed using a cotton stick. Subsequently, the resorption cavities were stained with toluidine blue and observed by light microscopy. The number of resorption events and the eroded bone surface area were determined as previously described (290). All light microscopic analyses were performed using a Nikon Eclipse LV100 microscope and a Nikon Eclipse Ci microscope.

3.3. Results

Identification and characterization of ectosteric site 1

Previous studies of the mechanism of the collagenase activity of CatK have implicated ectosteric site 1 as a site required for efficient collagen cleavage (207, 232).
Ectosteric site 1 is located on the surface of the L-domain of CatK and represents a protein-protein interaction site required for the formation of collagenolytically active dimers (232). The site consists of a well-defined cavity and is part of a surface loop consisting of residues 84-100 (Figure 3-1A).

In order to identify inhibitors binding in ectosteric site 1, a molecular docking approach was used. We first investigated the druggability of this site by analyzing the molecular surface and performing computational analysis using SiteMap (Schrödinger Inc.). The binding pocket of ectosteric site 1 consists mostly of negatively charged surface residues, with Tyr87 forming the hydrophobic interior of the pocket and residues Glu93 and Glu94 serving as potential hydrogen bond interacting partners with compounds binding in the pocket (Figure 3-1B). Residues 95 to 100 form the outer rim of the binding site and mostly consist of neutral residues. Other potential interacting residues, including Gln92 and Glu84, are found near the exterior of the binding pocket but may play a role in determining the binding of larger compounds that occupy space outside the defined pocket. The binding site was further evaluated using SiteMap, which characterizes the surface properties of the protein to determine its druggability using parameters such as its cavity size, exposure to solvent, hydrogen bond acceptors and donors, as well as its hydrophobicity and hydrophilicity (291). SiteMap analysis of the binding site revealed a favorable site score of 0.935 and a druggability score of 0.816 (druggability threshold = 0.80) (291). The analysis also revealed hydrogen bond-donating regions surrounding the surface of the binding site, as predicted by the amino acid sequence, as well as a hydrophobic interior consisting of residue Tyr87 (Figure 3-1C). Hydrogen bond-accepting regions were found towards the periphery of the site.
Based on these results, ectosteric site 1 was shown to be a druggable binding site and compounds that strongly interact with this binding site are likely to be small molecules with hydrophobic characteristics. These compounds would also interact with the hydrogen bond donor and acceptor residues lining the pocket.

Figure 3-1: Binding site analysis of ectosteric site 1 in CatK
(A) Overview of CatK (PDB ID: 1ATK) shown in surface and ribbon form. The active site residues, Cys25 and His162, are colored in yellow and blue, respectively. Ectosteric site 1 is highlighted in the box and colored in orange. (B) The electrostatic potential surface of ectosteric site 1 in CatK displays electronegativity throughout the protein-protein interface as a result of negatively charged residues in this region. (C) The binding pocket in ectosteric site 1 displayed high theoretical druggability (SiteMap, Schrödinger Inc.; druggability score: 0.816; druggability threshold: 0.80; site score: 0.715) as a result of its favorable geometry and size. Most of the area surrounding the cavity is hydrophilic (red for hydrogen bond-accepting sites, blue for hydrogen bond-donating sites). The potential interacting residues, Glu93, Glu94, and Gln99, are located surrounding the binding site and are shown in blue. The hydrophobic residue, Tyr87, can be seen in the center of the binding site and is shown in yellow.

Screening of druggable subset library using composite docking method

The NCI/DTP Chemical Repository containing pure synthetic compounds and natural products with diverse sets of chemical scaffolds comprises 281,987 small molecules ranging from 18 Da to 3,880 Da. This library of compounds was used as the ligand set for docking to ectosteric site 1 and the complete workflow can be found in Figure 3-2.
Based on the size and pharmacological properties of the binding site determined in SiteMap, a subset of compounds was first selected for initial testing based on the compounds’ drug-like properties (Lipinski’s rule of five) and bioactivity [30]. These rules limit compounds to no more than five hydrogen bond donors and ten hydrogen bond acceptors with a molecular weight less than 500 Daltons for druggability. With these filters, the number of compounds tested was reduced to 14,045. Three-dimensional structures of the ligands were generated and prepared using LigPrep in Maestro (Schrödinger Inc.) prior to screening. Ligands were prepared with the OPLS3 force field, with possible ionization states limited to pH 5.5 ± 2.0 to reflect the pH activity profile of CatK and the pH of the in vitro experiments. Ligand flexibility was considered and up to 12 conformers were generated for each compound. The resulting prepared ligand sets were used for all three screening methods.

Figure 3-2 Screening and evaluation workflow for the identification of ectosteric site 1 inhibitors of CatK
The screening procedure used to identify potential collagenase inhibitors of CatK from the NCI/DTP Repository. A composite virtual screening method involving GOLD, Glide, and Surflex was used to screen and identify potential hits and a total of 160 compounds were tested in in vitro assays. Eight compounds were identified as potent collagenase inhibitors with IC50 values below 20 µM and two compounds were tested in osteoclast-based bone degradation assays with both inhibitors displaying bone resorption inhibition.

To overcome some of the weaknesses associated with computer screening approaches using a single algorithm, we chose to employ a composite docking method involving three separate algorithms, Surflex, Glide, and GOLD. Each docking algorithm uses its own method in the configuration of the binding site.
For Surflex, a protomol (binding site) of ectosteric site 1 was generated using residues 88 to 92 with a bloat value of 4. The resulting protomol encompassed the entire binding site evaluated using SiteMap and was representative of ectosteric site 1. Both Glide and GOLD used a receptor grid for defining the binding site and a 10 Å box was generated using Ser95 as the centroid site. Both of these grids covered the entire volume occupied by ectosteric site 1 and the binding site identified in SiteMap. Since the scoring algorithms are unique and address different binding parameters, we expected that compounds with simultaneously high scores in all three docking methods would be more likely to be inhibitors. The top-ranked compounds from each individual docking method were combined and hits were defined as compounds with scores in the top 10% in all three methods. Of the 14,045 compounds docked to ectosteric site 1, 99 compounds fulfilled this inclusion criterion. Compounds were then ranked based on their composite score, which was the average of the ranking attained in each individual method.

In order to visualize the chemical relatedness and potential scaffolds of the structures identified, the compounds were then clustered according to chemical similarity using their Tanimoto fingerprints in Sybyl-X (Figure 3-3) and an outlier cut-off of a minimum of 80% chemical similarity. The resulting structural similarity map revealed four groups of high chemical similarity containing a total of 26 compounds. The remaining 73 compounds did not meet the similarity cut-off and displayed different scaffolds. One particular cluster (Group 4) contained a family of 13 acridinone-related compounds with a common chemical scaffold that interacted favorably with ectosteric site 1 based on their predicted poses.
Groups 1 to 3 contained compounds with more flexible scaffolds such as the purine-thione and chromanone-like structures that fit into the defined binding site of ectosteric site 1. The chemical scaffolds of all the families of compounds identified contained the potential interactions identified from the SiteMap analysis. These include a hydrophobic core and several hydrogen bond donors that can interact with the ectosteric site residues. The predicted poses for these compounds take advantage of these properties and display numerous interactions with these residues.

Figure 3-3 Chemical similarity mapping of hits identified through molecular docking
The composite method identified 25 compounds that could be grouped into four different scaffolds. Group 4 contained the highest number of compounds with 13 identified putative inhibitors. A complete list of compounds can be found in the Supplementary Information (Table S3-2).

Since compounds with the same scaffold may also bind in a similar manner to the ectosteric site, a chemical similarity search was performed in the NCI-DTP database using scaffolds from the four high similarity clusters and yielded an additional 16 compounds, which were also considered for further in vitro testing. Since the compounds all scored in a similar range in the respective docking methods, further selection was performed based on the visual examination of the individual binding poses and the interactions the compounds made with the protein. Higher priority was placed on compounds with multiple hydrogen bonding interactions with the protein as well as those with substantial hydrophobic interactions with the core of ectosteric site 1. Compounds with multiple conformers which all scored highly were also given greater consideration. From the total of 115 (99+16) compounds, we evaluated the activity of 80 compounds in collagenase and peptidase assays.
Eleven compounds that were unavailable from the NCI/DTP Repository were not included.

To evaluate whether these compounds inhibit the collagenase activity of CatK without blocking the active site of the protease, type I collagen degradation and Z-Phe-Arg-MCA hydrolysis assays were performed. Of the 80 compounds screened in the collagenase assay, 16 compounds showed inhibition at 100 μM and were further subjected to testing at lower concentrations (Table S3-1). IC50 values were determined for the most potent compounds. Five of these compounds had IC50 values below 20 μM for collagen degradation in the presence of 400 nM CatK enzyme (representing a 50-fold molar excess over the enzyme concentration) (Table 3-1). These included compounds: 10-[2-[bis(2-hydroxyethyl)amino]ethyl]-7,8-diethylbenzo[g]pteridine-2,4-dione (compound 1; NSC-374902); 2-((8-hydroxy-6-oxo-6H-imidazo[4,5,1-de]acridin-5-yl)amino)-N,N-dimethylethanaminium (compound 2; NSC-645808); 5-(3-(dimethylamino)propylamino)-6H-imidazo[4,5,1-de]acridin-6-one (compound 3; NSC-645836); 5-(2-(dimethylamino)ethylamino)-1-ethyl-6H-imidazo[4,5,1-de]acridin-6-one (compound 4; NSC-645835); and 5-{[2-(diethylamino)ethyl]amino}-6H-imidazo[4,5,1-de]acridin-6-one hydrochloride (1:1) (compound 5; NSC-645831). A sixth compound, 4-(dimethylamino)-3,10,11,12a-tetrahydroxy-6-methyl-1,12-dioxo-3,4,4a,5-tetrahydro-2H-tetracene-2-carboxamide (compound 6; NSC-118670), had an IC50 value of approximately 35 µM (Table 3-1). Compounds 2 through 5 all contained the common acridinone chemical scaffold (Figure 3-3) identified during the chemical similarity mapping after composite docking (Group 4). Eight out of the ten tested compounds from this group inhibited the collagenase activity of CatK at 100 μM or lower (Table S3-2).
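The molar excess figures quoted with the IC50 values follow from a simple unit conversion. The short sketch below reproduces that arithmetic for the 400 nM CatK used in the collagenase assay; the function name `molar_excess` is an illustrative choice and not part of the original analysis.

```python
def molar_excess(inhibitor_um: float, enzyme_nm: float) -> float:
    """Return the fold molar excess of inhibitor over enzyme.

    inhibitor_um: inhibitor concentration in micromolar
    enzyme_nm:    enzyme concentration in nanomolar
    """
    # Convert the inhibitor concentration to nM so both values share a unit.
    return (inhibitor_um * 1000.0) / enzyme_nm


# 20 uM IC50 threshold against 400 nM CatK (collagenase assay)
threshold_excess = molar_excess(20.0, 400.0)

# 4.7 uM, the lowest collagenase IC50 reported in Table 3-1
best_excess = molar_excess(4.7, 400.0)

print(f"20 uM over 400 nM enzyme: {threshold_excess:.0f}-fold")
print(f"4.7 uM over 400 nM enzyme: {best_excess:.1f}-fold")
```

The first value corresponds to the 50-fold excess stated for the 20 μM threshold, and the second rounds to the roughly 12-fold excess quoted for the most potent compounds.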
Table 3-1 Summary of collagenase inhibitors identified through composite docking from the NCI-DTP Repository

Compound   NCI-DTP code   Collagenase IC50 (μM)   Composite docking rank (Glide/Surflex/GOLD ranks)   Z-Phe-Arg-MCA cleavage inhibition at 100 μM (%)
1          374902         4.7 ± 0.4               48 (691/121/857)                                    9.5 ± 2.6
2          645808         7.6 ± 0.7               45 (703/516/381)                                    6.1 ± 3.9
3          645836         4.9 ± 0.3               21 (10/38/1040)                                     5.0 ± 2.7
4          645835         9.9 ± 0.7               34 (86/484/788)                                     2.3 ± 0.9
5          645831         13.0 ± 1.0              5 (163/58/204)                                      3.7 ± 0.7
6          118670         34.8 ± 1.9              38 (1325/43/120)                                    11.5 ± 2.6
7          39471          10.6 ± 2.9              74 (1727/648/1152)                                  6.5 ± 1.4
8          53298          12.8 ± 1.3              185 (1289/85/10203)                                 7.2 ± 3.6
9          359463         8.5 ± 0.9               180 (1651/10426/24)                                 3.2 ± 0.8

The most potent compounds identified from the druggable library subset were compounds 1 and 3, which had IC50 values of 4.7 ± 0.4 μM and 4.9 ± 0.3 μM, respectively (representing a ~12-fold molar excess over the enzyme concentration) (Figure 3-4).

Figure 3-4 Collagenase inhibitory activity of compounds 1 and 3
The two most potent compounds, 1 (A) and 3 (B), identified through molecular docking are shown with their respective structures. Collagenase inhibitory activities with representative collagenase degradation gels are depicted with the corresponding IC50 curves determined from three separate experiments (n=3). The IC50 values for the inhibition of collagenase activity of CatK were 4.7 ± 0.4 μM and 4.9 ± 0.3 μM for compounds 1 and 3, respectively. * represents the α1 type I collagen peptide used to quantify the collagenase activity of CatK.

To exclude an active site inhibition for the top-rated compounds, their inhibitory activity on the cleavage of Z-Phe-Arg-MCA and gelatin was evaluated. In these inhibition assays, a 5,000-fold molar excess of the inhibitor over the enzyme concentration was used. None of the five compounds with IC50 values of less than 20 μM in collagenase inhibition displayed inhibitory activity towards gelatin degradation at 100 μM.
Some compounds revealed a minor inhibition of Z-Phe-Arg-MCA hydrolysis of up to 10% (Table 3-1) at high excess inhibitor concentrations. These results suggest that the collagenase inhibitory activity observed is not a result of active site inhibition of CatK.

Proposed binding of compounds 1 and 3

To understand the binding modes of the most potent anti-collagenase inhibitors, 1 and 3, their binding to ectosteric site 1 was re-evaluated using three different protocols with higher precision parameters and increased rotational sampling as outlined in the Methods section. The overall best binding poses from each docking algorithm are shown in Figure 3-5. The poses for compound 3 are all accommodated in the cleft of ectosteric site 1 with the aromatic rings of the compound making favorable hydrophobic interactions with the residues in the area. The strongest of these interactions include those with residues Ala86, Tyr87, and Pro88. In addition to the hydrophobic interactions, the compound also forms electrostatic interactions with residues Asn99 and Met97 and forms a hydrogen bond with Glu94. For compound 1, similar favorable hydrophobic interactions can be seen with residues Ala86, Met97, and Asn99. It also forms strong electrostatic interactions with the residues in the loop, including hydrogen bonding with Pro88 and Glu94. Both of the compounds had high scoring poses with theoretical Ki values in the range of 5-10 µM determined from the Glide and Surflex scores, which is consistent with their experimental in vitro IC50 values. For both compounds 1 and 3, the best binding mode predicted by the GOLD algorithm differed in conformation from those predicted by Glide and Surflex, as shown in Figure 3-5, and showed strong hydrophobic interactions with the protein (S2 Fig). Nonetheless, the re-evaluated empirical CHEMPLP scores for both compounds showed strong binding and were among the top 3% of the scores observed for the entire druggable library set.
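The text does not state how the theoretical Ki values were derived from the docking scores. As a rough illustration only, the sketch below assumes two common readings: a Surflex-type score interpreted as -log10(Kd), and a Glide-type score interpreted as an approximate binding free energy in kcal/mol. The function names and the example scores are hypothetical and are not values from this study.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL_PER_MOL_K = 1.987204e-3


def kd_um_from_pkd_score(score: float) -> float:
    """Assumption: score = -log10(Kd in mol/L). Returns Kd in micromolar."""
    return 10.0 ** (-score) * 1e6


def ki_um_from_free_energy(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Assumption: score approximates the binding free energy dG (kcal/mol).

    Uses Ki = exp(dG / RT). Returns Ki in micromolar.
    """
    return math.exp(delta_g_kcal / (R_KCAL_PER_MOL_K * temperature_k)) * 1e6


# Hypothetical scores chosen only to illustrate the 5-10 uM range.
print(f"pKd-style score 5.0    -> Kd ~ {kd_um_from_pkd_score(5.0):.1f} uM")
print(f"dG-style score -7.0    -> Ki ~ {ki_um_from_free_energy(-7.0):.1f} uM")
```

Under these assumptions, a low micromolar Ki corresponds to a pKd-style score slightly above 5 or a free-energy-style score of about -7 kcal/mol at room temperature.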
Figure 3-5 Top binding poses of compounds 1 and 3 from composite docking
Top binding poses of the most potent collagenase inhibitors, 1 (A) and 3 (B), as docked using the three docking methods. The poses are depicted using sticks and colored orange (Glide), green (Surflex), and yellow (GOLD). Ligplot diagrams depicting the predicted binding of compounds 1 (C) and 3 (D) into ectosteric site 1 using the best binding pose calculated from Glide. Hydrogen bonds with the binding site residues are highlighted in green with the respective distances and hydrophobic interactions are shown with red dashes.

As predicted by the SiteMap evaluation of ectosteric site 1, both compounds contained a hydrophobic core which interacted with the hydrophobic center of ectosteric site 1 (Tyr87) and contained multiple functional groups which could interact with the hydrogen bond-donating side chains in the area. Compound 3 shares a chemical scaffold with 12 other compounds found in Group 4 (Figure 3-3 and Table S3-2), which were identified from the set of druggable compounds that also scored highly in the composite docking approach and which displayed an anti-collagenase activity.

Complete library screening using composite docking method

Based on the results using the 14,000-member compound sub-library, we proceeded to complete the docking and testing of the entire NCI-DTP repository consisting of a total of 281,987 compounds. As in the sub-library study, the compounds were prepared with LigPrep using the previous parameters, with conformers limited to three per ligand due to the increased number of compounds. The binding site definitions and docking methods were the same as used in the subset library study for each docking method. A cut-off of 4% was used for the selection of hits, which limited the set to 314 compounds for further selection. The 14,000 compounds in the sub-library were also rescreened as a part of the complete library.
Of the 99 composite hits identified in the sub-library screen, only two were among the 314 compounds identified in the complete library at the 4% cut-off, and neither was among those with anti-collagenase activity. However, all six collagenase inhibitors identified in the subset library listed in Table 3-1 were found within the top 10% cut-off of the complete library, which yielded a total of 839 composite hits. The highest rated compound, 5, was ranked at the 422nd position. Compounds 1 and 3 ranked 803rd and 609th, respectively. Due to the large chemical diversity of the compounds identified, a chemical similarity map did not yield particular scaffolds of interest. However, some of the compounds identified may still be of interest for further optimization and development for higher potency. Most of the compounds identified had higher molecular weights (>700 Da), possibly as a result of the inherent bias for larger molecules in interaction scoring due to an increased number of interaction atoms (292). The average molecular weight of the hits identified was 791 ± 281 Da, and was significantly higher than the average of 316 ± 72 Da from the screen using the initial subset of compounds (S1 Fig). The compounds were then given a composite rank as the average of their individual ranks and were also screened visually by their binding pose as in the preliminary study. Higher priority was given to compounds with extensive hydrogen bonding and hydrophobic interactions with ectosteric site 1. Compounds that primarily interacted outside of ectosteric site 1 were given less consideration. Compounds that were unavailable from the NCI/DTP repository were skipped. This led to a selection of 80 compounds for testing in in vitro assays.

The compounds were first screened in the collagenase assay with 12 compounds displaying inhibitory activities at 100 μM; those compounds were further evaluated at lower concentrations.
Three of these compounds displayed IC50 values below 20 μM: dexamethasone acetate (compound 7; NSC-39471), N-[3-(4,5-dihydro-1H-imidazol-2-yl)phenyl]-3-[[3-[[3-(4,5-dihydro-1H-imidazol-2-yl)phenyl]carbamoyl]phenyl]carbamoylamino]benzamide hydrochloride (compound 8; NSC-53298), and 1-[2-[3-(diethylaminomethyl)-4-hydroxyphenyl]hydrazinyl]-5-[(2E)-2-[3-(diethylaminomethyl)-4-oxocyclohexa-2,5-dien-1-ylidene]hydrazinyl]naphthalene-2,6-dione hydrochloride (compound 9; NSC-359463). The most potent of these, compound 9, had an IC50 value of 8.5 μM (Table 3-1). In contrast with the sub-library screen, no compounds with IC50 values <5 µM were identified with the complete library. To test the potential active site inhibition of each of these compounds, their ability to inhibit the cleavage of non-ectosteric site-dependent substrates, gelatin and Z-Phe-Arg-MCA, was tested at a 5,000-fold molar excess over the enzyme concentration. All three compounds (7, 8, 9) displayed low (<10%) active site-directed inhibition (Table 3-1). Similar to the previous compounds identified, the inhibition of the collagenase activity is thus likely caused by the disruption of the ectosteric site interactions and not by active site inhibition.

Screening for aggregation and off-target inhibition for active compounds

To test whether the inhibition observed for the identified compounds is due to non-specific interactions such as compound aggregation, we tested the compounds under detergent conditions as well as their activities towards two other proteases, trypsin and matrix metalloproteinase-1 (MMP-1). We first tested the collagenase inhibition activity of the compounds in Table 3-1 in the presence of the detergent Triton X-100. Two detergent concentrations were tested (0.001% and 0.005% (v/v)). Higher concentrations of Triton X-100 inhibited the collagenase reaction. At both tested concentrations, all nine compounds retained their collagenase inhibitory activities at 10 µM.
Quantification of the degradation in the presence of detergent at 10 µM inhibitor concentration showed no significant difference to that observed without detergent (S3A-B Fig). Moreover, the IC50 values of compounds 1 and 3 determined in the presence of detergent were not significantly different from those determined without detergent (S4 Fig). To further rule out non-specific inhibition, we tested the inhibitors with two unrelated proteases, the serine protease trypsin and the matrix metalloproteinase MMP-1. None of the compounds inhibited the cleavage of the macromolecular substrate gelatin by trypsin at 50 µM concentrations (S3C Fig). Likewise, the degradation of the fluorogenic peptide Z-Phe-Arg-MCA by trypsin was not significantly inhibited by the compounds (<10%), with the exception of compound 6, which showed 60% inhibition at 50 µM (5,000-fold molar excess). Compounds 1 and 3 were further evaluated for their inhibition of MMP-1. Both compounds showed no inhibitory activity at 50 µM (approximately 10 times their IC50 concentrations for the collagenase activity of CatK) towards the collagenase activity of MMP-1. Additionally, both compounds showed no inhibitory activity at 50 µM (a 5,000-fold excess over the enzyme concentration) for the gelatinase activity of MMP-1 (S3D Fig).

Comparison of composite docking method with individual methods

In order to evaluate which method is more efficient at identifying ectosteric inhibitors, we compared the composite docking method with the individual docking methods for the identification of active compounds from the NCI-DTP Repository. We evaluated the top 25 available compounds from each individual docking method for their in vitro inhibitory activities towards collagen, gelatin, and Z-Phe-Arg-MCA degradation and compared the results with those obtained from the composite method.
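The two selection strategies being compared can be sketched as follows: composite selection keeps only compounds that fall within a top fraction of every method and orders them by mean rank, whereas individual selection simply takes the top N of one method. This is a minimal illustration under those stated rules; the compound identifiers, ranks, and function names are hypothetical toy values, not data from the screen.

```python
from statistics import mean


def composite_hits(rank_tables, top_fraction):
    """Return (compound, mean rank) pairs for compounds in the top fraction of ALL methods.

    rank_tables: dict mapping method name -> {compound id: 1-based rank}
    top_fraction: e.g. 0.10 for the top 10% of the docked library
    """
    library_size = len(next(iter(rank_tables.values())))
    cutoff = max(1, int(library_size * top_fraction))
    # Compounds passing the cut-off in each method, intersected across methods.
    passing = [
        {cid for cid, rank in ranks.items() if rank <= cutoff}
        for ranks in rank_tables.values()
    ]
    shared = set.intersection(*passing)
    # Composite score = average of the ranks attained in each method.
    scored = {cid: mean(ranks[cid] for ranks in rank_tables.values()) for cid in shared}
    return sorted(scored.items(), key=lambda item: item[1])


def top_n(ranks, n):
    """Return the n best-ranked compounds from a single docking method."""
    return sorted(ranks, key=ranks.get)[:n]


# Toy ranks for six hypothetical compounds in three methods.
toy = {
    "Glide":   {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6},
    "Surflex": {"B": 1, "A": 2, "D": 3, "E": 4, "F": 5, "C": 6},
    "GOLD":    {"C": 1, "A": 2, "B": 3, "F": 4, "E": 5, "D": 6},
}

hits = composite_hits(toy, top_fraction=0.5)
print("Composite hits:", [cid for cid, _ in hits])
print("Glide top 3:   ", top_n(toy["Glide"], 3))
```

In the toy data, compound C ranks first in one method but last in another, so it is excluded from the composite list even though it would be selected by that single method alone.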
For Surflex-identified ligands, only three compounds displayed inhibitory activity at 100 µM in the collagenase assay (Table S3-3). However, none of these compounds were active at 50 μM. None of the top 25 compounds identified by Surflex alone were found among the composite hit list. From the top 25 compounds identified in the Glide screen, three compounds showed inhibitory activity for the collagenase activity of CatK at 100 μM and one compound was active at 50 μM (Table S3-3). Of the top 25 compounds identified from Glide alone, only two were identified by the composite method, but these compounds did not display any significant collagenase inhibitory activity. Finally, the top 25 compounds identified with GOLD included four compounds with inhibitory activity at 100 μM and two compounds effective at 50 μM (Table S3-3). This included compound 9, which had been identified during composite docking and had an IC50 value of 8.5 µM in the collagenase assay.

Compared with the composite docking method, the individual methods were less effective at identifying potent inhibitors of the collagenase activity of CatK (Table 3-2). In vitro testing of 75 compounds in total from the three individual methods only yielded 10 compounds with inhibitory effect at 100 μM. The most potent compound was 9, which was identified with GOLD, with an IC50 value of 8.5 µM. In comparison, the composite docking method screening the entire library yielded 12 compounds active at 100 μM, with three compounds (7, 8, 9) having IC50 values below 20 μM from 80 tested. Composite docking using the sub-library had the highest hit rate with 16 compounds showing inhibition at 100 μM. Of the 80 compounds tested from this composite screen, five (1, 2, 3, 4, 5) had IC50 values below 20 μM, two of which (1, 3) had IC50 values of approximately 5 μM (Table 3-2).
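The counts summarized in Table 3-2 can be converted into hit rates to make the comparison explicit. The sketch below uses only the numbers reported in that table; the variable and function names are illustrative.

```python
# Counts taken from Table 3-2.
tested = {
    "Glide": 25,
    "Surflex": 25,
    "GOLD": 25,
    "Composite (druggable)": 80,
    "Composite (complete)": 80,
}
active_100um = {
    "Glide": 3,
    "Surflex": 3,
    "GOLD": 4,
    "Composite (druggable)": 16,
    "Composite (complete)": 12,
}
ic50_below_20um = {
    "Glide": 0,
    "Surflex": 0,
    "GOLD": 1,
    "Composite (druggable)": 5,
    "Composite (complete)": 3,
}
individual = ("Glide", "Surflex", "GOLD")


def hit_rate(hits: int, n_tested: int) -> float:
    """Fraction of tested compounds that met the activity criterion."""
    return hits / n_tested


for method in tested:
    print(
        f"{method:24s} active at 100 uM: {hit_rate(active_100um[method], tested[method]):6.1%}"
        f"   IC50 < 20 uM: {hit_rate(ic50_below_20um[method], tested[method]):6.1%}"
    )

# Pool the three individual methods (1 potent hit out of 75 tested).
pooled_rate = hit_rate(
    sum(ic50_below_20um[m] for m in individual),
    sum(tested[m] for m in individual),
)
druggable_rate = hit_rate(ic50_below_20um["Composite (druggable)"], tested["Composite (druggable)"])
fold_improvement = druggable_rate / pooled_rate
print(f"Composite (druggable) vs pooled individual methods: {fold_improvement:.1f}-fold")
```

For compounds with IC50 values below 20 µM, the druggable-subset composite screen gives 5/80 against 1/75 for the pooled individual methods, a ratio of about 4.7.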
Table 3-2 Comparison of the hit rates of each individual docking method and composite docking methods

Docking method          Glide   Surflex   GOLD   Total from individual methods   Composite docking (druggable)   Composite docking (complete)
Top compounds tested    25      25        25     75                              80                              80
Active at 100 µM        3       3         4      10                              16                              12
Active at 50 µM         1       0         2      3                               12                              8
IC50 below 20 µM        0       0         1      1                               5                               3
IC50 below 5 µM         0       0         0      0                               2                               0

Osteoclast-bone resorption assays using most potent compounds identified from NCI-DTP Repository

The two most potent compounds identified in the collagenase assays were tested in cell-based osteoclast bone resorption assays to evaluate their ability to inhibit the degradation of bone. Figure 3-6A shows toluidine blue-stained osteoclast-mediated resorption events on bone surfaces in the absence or presence of compounds 1 and 3 using human osteoclasts. In untreated cultures, long deep trenches with small round pits were present, indicating extensive bone resorption. Compound 1-treated cultures (1 µM) showed mostly small round demineralized pits, indicating an almost complete inhibition of CatK-mediated bone resorption. Round small resorption pits represent CatK-independent demineralization events. Quantification of the number of osteoclasts and metabolic activity showed no significant changes between inhibitor-treated and untreated samples, suggesting no toxicity at the tested inhibitor concentrations (Figure 3-6B-C). Quantitative analysis of the resorption parameters revealed that 1 µM of compound 1 was very effective in reducing the total surface eroded by osteoclasts. The number of trenches and total trench-eroded surface were significantly lower than in the untreated sample (p<0.005) (Figure 3-6D-E). The IC50 value for the inhibitory effect on the total eroded surface/bone surface for compound 1 was 312 ± 63 nM (Figure 3-6F). For comparison, the active site-directed inhibitor, odanacatib, had an IC50 value for human osteoclast-mediated bone resorption of 14.6 nM (284).
However, odanacatib, like most other active site-directed inhibitors, showed significant side effects, leading to the termination of further development of active site-directed CatK inhibitors (179).

Figure 3-6 The effects of NSC-374902 and NSC-645836 on human osteoclasts and bone resorption activity
(A) Representative images of osteoclast-generated resorption cavities in the presence or absence of compounds 1 and 3. Large meandering cavities known as trenches can be observed and represent collagen degradation events. Small round cavities are demineralized pit areas with little or no collagen degradation. Mature human osteoclasts were cultured on bovine bone slices for 72 hours in the absence or presence of inhibitors (1 µM). (B) Metabolic activity and (C) number of osteoclasts after treatment with NSC compounds compared with untreated cells show no significant differences. (D) Number of trenches and (E) percentage of eroded surface area under untreated and inhibitor-treated (1 µM) conditions show significant reduction for compounds 1 (p<0.001) and 3 (p<0.05). (F) The IC50 value of compound 1 for the inhibition of trench eroded surface per bone surface was 312 ± 63 nM. The IC50 value was determined from three independent experiments where 5 bone slices in each condition were analyzed. Data represent mean ± SD. ‘ns’, not significant; * p<0.05; *** p<0.001.

Surprisingly, compound 3 had a lesser effect on trench formation (Figure 3-6A). The compound was less effective in reducing bone resorption; however, the total eroded surface and number of trenches were still significantly reduced when compared to the untreated samples (p<0.01). The solubility and the ability to cross the cell membrane may play a role in the reduced potency. Both compounds had octanol/water partition coefficients (cLogP) in the druggable range (-0.4 to 5.6) but compound 3 (3.71) had a higher value than compound 1 (0.60).
This indicates a lower water solubility, which may have played a role in its lower cellular efficacy (293). Despite the lower potency, the scaffold of this compound may still be used to further refine potential collagenase inhibitors of CatK.

3.4. Discussion

Molecular docking approaches have been shown to aid in identifying potential chemical scaffolds for novel targets (294). Several computational methods are available to find potential ligands for a druggable target. These include library docking, fragment-based synthesis, as well as molecular dynamics simulations (295, 296). Computational approaches are faster and more cost-effective when compared with traditional high-throughput experimental screens.

Here, we employed various molecular docking methods to identify non-active site-directed inhibitors of CatK. CatK has been previously shown to be an attractive target for the treatment of osteoporosis. Multiple active site-directed CatK inhibitors have been evaluated in past clinical trials of osteoporosis. Despite superior efficacy in increasing bone mineral density and reducing fracture rates, all compounds have failed because of side effects, including cardiovascular complications and skin fibrosis (112, 271). The exact mechanisms of these side effects remain unknown but may be due to the inhibition of CatK-mediated degradation of regulatory proteins such as TGF-β1. We have previously shown that inhibitors targeting ectosteric site 1 specifically block osteoclast bone resorption in vitro and in vivo without disrupting the active site activities of CatK (114). Therefore, we aimed at identifying novel drug scaffolds which target ectosteric site 1 of the protease. SiteMap characterization of this binding site shows druggability and many interacting residues for compounds to bind favorably. Using a combination of three different docking algorithms, we identified several ectosteric site-based inhibitors of CatK.
Most screens use a maximum of two different docking methods (297–299), which may account for the high false positive rate (300). Our composite approach was up to five-fold more effective at identifying potent collagenase inhibitors (IC50 <20 µM) when compared to the single docking methods and was able to identify two effective inhibitors with IC50 values below 5 µM (Table 3-2). By filtering the library to drug-like compounds using the Lipinski rule of five and bioassay activities, we were able to further increase the hit rates and potency of identified compounds. This difference in potency and hit rates might be due to the difficulty in estimating accurate binding affinities for larger compounds with multiple interacting atoms and the general bias of interaction scoring towards compounds with higher molecular weights (301, 302). Starting with a smaller library of low molecular weight compounds and employing multiple scoring strategies may help increase both the potency and hit rate of identified compounds and decrease the number of false positives. Through structural similarity mapping of the composite docking hits, we were able to distinguish several novel chemical scaffolds with a selective collagenase inhibitory activity.

Examination of the best predicted binding poses of the two most potent compounds, 1 and 3, revealed favorable interactions with the residues in the pocket of ectosteric site 1. Both of the compounds contained a hydrophobic ring system (acridinone for compound 3 and pteridine for compound 1) and functional groups which can participate in non-covalent interactions with the other residues identified in the binding site. The poses and interactions calculated by the respective algorithms were similar and matched closely with the potential interactions predicted in the SiteMap analysis. Despite similar potency in collagenase inhibition, there was a noted difference in their efficacy in the cell-based resorption assays.
Compound 1 was significantly more potent in preventing bone resorption than compound 3. Their calculated membrane permeabilities (2.4 nm/s for compound 1 and 1.5 nm/s for compound 3), determined from the calculated physicochemical properties and cLogP values (Schrödinger Inc.), may have played a role in the differences in potency observed in the osteoclast-based assays (303).

We also investigated whether assay interference such as non-specific binding due to small molecule aggregates could generate false positives in our screening assays (304). Detergents break up aggregates and thus would reduce potential aggregation-based inhibition (305). All nine compounds in Table 3-1 retained their anti-collagenase activities in the presence of Triton X-100 at two concentrations (0.001% and 0.005% v/v). Triton X-100 had no effect on the IC50 values of the two most potent compounds (1 and 3) (S3A-B Fig). In order to exclude off-target effects of the identified CatK ectosteric inhibitors, we evaluated their potential inhibitory effect on trypsin and MMP-1. None of the nine compounds in Table 3-1 showed any inhibitory activity towards the cleavage of gelatin by trypsin at their highest inhibitor concentrations (50 µM) (S3C Fig). Moreover, compounds 1 and 3 did not show inhibitory activity towards MMP-1 for the degradation of collagen or gelatin (S3D Fig). Taken together, this indicates that the identified compounds are selective CatK inhibitors that specifically inhibit the collagenase activity of the protease.

The IC50 value for compound 1 in the osteoclast resorption assay was 312 nM. This is close to the IC50 value of tanshinone IIA sulfonate (240 nM), which showed efficacy in ovariectomized mice (114). The overall potency is still about 15-20 times lower than that of the most potent active site-directed CatK inhibitor, odanacatib, but compound 1 may have the advantage of avoiding the side effects seen with odanacatib and other CatK inhibitors (51, 253).
Compound 1 may serve as a scaffold for the optimization of more potent CatK collagen-specific inhibitors. It should be noted that some of the compounds identified in our screening process have shown bioactivities in other assays. Compounds 1 and 9, in particular, were previously shown to be active in an NCI Yeast Anticancer Drug Screen. Compound 3 was also shown to be active in an inhibitor screen of a quinone oxidoreductase, and a structurally related compound has been shown to be a potent inhibitor of the enzyme and may have chemo-protectant potential (306).

In conclusion, we have demonstrated that in silico molecular docking methods can be successfully employed to identify ectosteric inhibitors of CatK. The hit rate was substantially increased when a composite docking approach was employed. We believe that this method can be applied to other targets as well and can be effective for targets where a high-throughput assay has not been developed or is not cost-effective.

Supplementary Information (Chapter 3)

Table S3-1 Summary of Collagenase Inhibitors at 100 µM Identified Through Composite Docking Using Druggable Compounds from the NCI/DTP Repository

Structures of Collagenase Inhibitors Identified Through Composite Docking Listed by NSC Number

374902, 645808, 645831, 645835, 645836, 118670, 124845, 611243, 54055, 60472, 62378, 76356, 79058, 81540, 106540, 108608

Table S3-2 Summary of Scaffolds Identified Through Composite Docking Using Druggable Compounds from the NCI/DTP Repository Listed by NSC Number

Group 1: 642740, 623611, 642890, 647595
Group 2: 688955, 664938, 102816, 651036, 690204
Group 3: 147786, 76356, 145386
Group 4: 645808, 645811, 645812, 645815, 645816, 645821, 645823, 645824, 645825, 645831, 645833, 645835, 645836

Table S3-3 Summary of Collagenase Inhibitors Active at 100 µM Identified with Individual Docking Methods from the complete NCI/DTP Repository

Surflex Only Identified Compounds: 42322, 645812, 80975
Glide Only Identified Compounds: 80116 (Collagenase IC50: 89 ± 9.5 µM), 136985, 85206
GOLD Only Identified Compounds: 53213, 70530, 719315 (Collagenase IC50: 88 ± 6.3 µM)

Figure S3-1 Distribution of Molecular Weights of Composite Docking Hits Identified From the NCI/DTP Repository
The frequency distribution data of the hits identified through composite docking show a higher average molecular weight for the complete library (791 ± 281 Da) than for the druggable subset (316 ± 72 Da).

Figure S3-2 LigPlot Diagrams of Top Binding Poses of Compounds 1 (A) and 3 (B) Using GOLD
LigPlot diagrams of the top binding poses of compounds 1 and 3 using GOLD show strong interactions with the protein.

Figure S3-3 Effect of Triton X-100 on the collagenase activity of compounds 1-9 and lack of off-target inhibition
(A) The collagen inhibition activity of compounds 1-9 was not affected by 0.005% Triton X-100, shown with the corresponding representative SDS-PAGE gels. (B) Quantification of the α1 type I bands (*) from three separate experiments (n=3) showed no significant effect of the detergent on collagen degradation inhibition. (C) Compounds 1-9 also did not show off-target inhibition of trypsin-mediated digestion of gelatin at 50 µM in the presence of 10 nM enzyme. (D) Compounds 1 and 3 did not show inhibition of MMP-1-mediated degradation of collagen and gelatin at 50 µM inhibitor concentrations. MMP-1 was used at 400 nM and 10 nM for collagen and gelatin degradation, respectively. Representative SDS-PAGE gels for the degradation experiments are shown.

Figure S3-4 IC50 curves of the collagenase inhibitory activity of compounds 1 (A) and 3 (B) with 0.005% Triton X-100
The IC50 values of the collagenase inhibitory activity of compounds 1 and 3 were 5.1 ± 1.0 µM and 5.2 ± 1.5 µM, respectively.

4. Identification of Ectosteric Inhibitors of CatK through High-Throughput Screening

Abstract

Cathepsin K (CatK) is a cysteine protease that is critically involved in protein turnover and known for its unique and potent collagenase and elastase activity. The formation of oligomeric complexes involving CatK and chondroitin 4-sulfate (C4-S) has been implicated in its collagenase activity. Inhibitors that prevent these complexes from forming are selective for the collagenase activity and do not interfere with other proteolytic activities of the enzyme. Here, we have developed a fluorescence polarization (FP) assay to screen two compound collections (Known Drugs 2 and Biomol libraries) totaling 5,071 compounds for ectosteric collagenase inhibitors of CatK. A total of 38 compounds were identified with the FP assay that did not interfere with the hydrolysis of the small synthetic peptide substrate Z-FR-MCA and of gelatin. Eight of these compounds exhibited a specific anti-collagenase activity with IC50 values below 200 µM, three of them with IC50 values below 5 µM. Two of the anti-collagenase inhibitors were highly effective in preventing the bone-resorption activity of CatK in osteoclasts. Interestingly, some of the ectosteric inhibitors were capable of differentially blocking the collagenase and elastase activity of CatK depending on the exosite utilized by the compound.

4.1. Introduction

Cysteine cathepsins are found in all life forms and play an important role in mammalian intra- and extracellular protein turnover. They are members of the papain-like family (CA clan, C1 family), which comprises 11 proteases encoded in the human genome (cathepsins B, C, F, H, K, L, O, S, V, W and X) (281). Cathepsins K, V, and S are potent extracellular matrix (ECM) protein-degrading proteases. All three cathepsins are potent elastases (230, 280).
In addition, cathepsin K (CatK) is a unique collagenase capable of cleaving at multiple sites within triple helical collagens and has been implicated in various cardiovascular and musculoskeletal diseases, including osteoporosis (71). Since its first crystal structure was reported in 1997, major interest has been placed in developing potent CatK inhibitors as a potential treatment for osteoporosis (62). However, all compounds developed thus far are active site-directed inhibitors that completely block the activity of the enzyme (154). Because CatK is a multifunctional protease, it is likely that blocking its entire proteolytic activity will cause unwanted side effects (179). This may explain in part the failure of clinical trials using CatK inhibitors such as odanacatib and balicatib for the treatment of osteoporosis. Patients treated with these drugs experienced heightened risks of cardiovascular events and skin fibrotic phenotypes as side effects despite displaying a dramatic increase in bone mineral density (179, 271).

Our previous studies have demonstrated that the degradation of extracellular matrix (ECM) proteins such as collagens and elastin requires specific exosite binding sites (207). These sites are needed for the formation of collagenolytically active CatK oligomers in the presence of glycosaminoglycans (232) or act as secondary binding sites for ECM proteins such as elastin (207, 230). Blocking the formation of these collagen-degrading complexes or the secondary substrate binding sites with small molecules would allow the selective inhibition of the collagenase and elastase activities of cathepsins without affecting the cleavage of other, potentially regulatory substrates (113). Thus, targeting the formation of these active oligomers may serve as a substrate-specific target for CatK inhibitors.
We have recently demonstrated that inhibitors can be identified through a molecular docking approach targeting ectosteric site 1, a site required for the collagenase activity of CatK (233). High-throughput library screening is also an effective tool for discovering new scaffolds for inhibitors and often serves as a starting point for drug design programs (307). Screening assays frequently use fluorescence-based readouts for their cost-effectiveness, quick detection and high signal-to-noise ratios (308). Once a fluorescence-based assay has been developed and optimized, it can be easily scaled up to screen large libraries (236). Pharmacologically active compound libraries and natural compound collections are readily available and either contain compounds which can be further optimized by medicinal chemistry (309–311) or form the foundation of rational drug design strategies (312–314). In this study, we have employed a high-throughput fluorescence polarization (FP) assay for identifying selective collagenase inhibitors of cathepsin K (CatK). This assay allows for detecting compounds that disrupt the oligomer formation required for collagenase activity (254). Based on our previous findings, complex formation can be interrupted by blocking either protein-protein or glycosaminoglycan-protein interaction sites in CatK complexes. FP is widely used in library screening and drug discovery to detect the potential disruption of interactions between ligands and proteins (243). We have previously demonstrated the validity of this method by investigating the prevention of CatK/chondroitin 4-sulfate (C4-S) complex formation by polypeptides and polyamino acids (254).

In this study, we used this technique to assess two chemical libraries, the Known Drugs 2 (KD2) and the Biomol libraries containing 4,761 and 310 compounds, respectively, to identify inhibitors that prevent the collagenolytically active CatK complexes from forming.
Active compounds were also screened using a fluorogenic peptide cleavage assay to test for active site-directed inhibition. Compounds confirmed in these assays were then tested in low-throughput collagenase assays to specifically screen for anti-collagenase inhibitors. Next, the anti-collagenase inhibitors were evaluated in cell-based osteoclast resorption assays and computational molecular docking studies to further assess their activities and binding modes (Figure 4-1). Finally, selected compounds were evaluated for their selective inhibition of the collagenase and elastase activities of CatK.

Figure 4-1 Experimental workflow for the identification of collagenase inhibitors of CatK using FP and active site screening assays. Chemical libraries such as the KD2 and Biomol libraries are first screened using the FP assay to identify compounds which inhibit the formation of collagenolytically active complexes. Active compounds from the FP assay are then tested in vitro for their ability to block collagen degradation and their active site inhibitory activity. The most active compounds are further investigated in osteoclast resorption assays and molecular docking studies to predict their interactions with the enzyme.

4.2. Materials and Methods

CatK/C4-S Complex Formation

The CatK/C4-S complex was generated by combining purified human CatK and C4-S in a 2:1 molar ratio in 100 mM sodium acetate buffer, pH 5.5, containing 2.5 mM dithiothreitol (DTT) (Sigma-Aldrich Canada, Oakville, Ontario, Canada) and 2.5 mM ethylenediaminetetraacetate (EDTA) (Sigma-Aldrich). Wild-type human CatK was expressed in Pichia pastoris and purified as previously described (258). Fluoresceinamine-labelled C4-S (C4-S*) was prepared as previously described (254) and CatK/C4-S* complexes were generated in the same manner as the unlabeled complexes.
Screening for Complex Formation Inhibitors with Fluorescence Polarization Assay

Library screening for complex formation inhibitors with the fluorescence polarization (FP) assay was performed as previously described using 20 nM fluorescently labelled C4-S and 40 nM CatK (254). The Known Drugs 2 library (Sigma-Aldrich, the Prestwick library, BIOMOL and Microsource) (315, 316) was screened by FP in 384-well plates (Corning) with an assay volume of 50 µL in a Synergy 4 multiplate reader. The Biomol library was screened under identical conditions in 96-well plates (Corning, USA) using an assay volume of 100 µL in a Fluostar Optima fluorescence polarimeter (BMG LABTECH, Germany).

Z-Phe-Arg-MCA Enzymatic Assays

Potential active site inhibition by the compounds identified by FP was evaluated using benzyloxycarbonyl-Phe-Arg-7-amido-4-methylcoumarin (Z-Phe-Arg-MCA) as fluorogenic substrate (Bachem Americas, Inc, Torrance, California, USA). The enzymatic activity of CatK was monitored by measuring the rate of release of the fluorogenic group, aminomethylcoumarin, at an excitation wavelength of 380 nm and an emission wavelength of 450 nm using a Molecular Devices SpectraMax Gemini spectrofluorometer. Each inhibitor was added prior to measurement of enzyme activity and the assays were performed at 25°C at a fixed enzyme concentration (5 nM) and substrate concentration (5 µM) in 100 mM sodium acetate buffer, pH 5.5, containing 2.5 mM DTT and 2.5 mM EDTA.

Collagenase, Gelatinase, and Elastase Assays

The collagenase inhibitory activities of the active compounds were measured using soluble bovine skin type I collagen (0.6 mg/mL) (Life Technologies) incubated with 400 nM human CatK, in the presence or absence of 200 nM C4-S, in 100 mM sodium acetate buffer, pH 5.5, containing 2.5 mM dithiothreitol (DTT) (Sigma-Aldrich Canada, Oakville, Ontario, Canada) and 2.5 mM ethylenediaminetetraacetate (EDTA) (Sigma-Aldrich).
All inhibitors were dissolved in DMSO as a 20 or 10 µM stock. To minimize a solvent effect on the activity of CatK, all reactions were kept below 1% DMSO, where the inhibitory effect of the solvent was negligible. After incubation at 28°C for 4 hours, 10 µM of E-64 was added to stop the residual activity of CatK. Collagen cleavage products were separated by SDS-PAGE and stained with Coomassie Blue before imaging using the ImageQuant LAS500 gel imaging system (GE Healthcare).

Gelatin (0.6 mg/mL) degradation assays were performed using 10 nM enzyme and incubated at 37°C for one hour before visualization using SDS-PAGE. Gelatin was produced by heating soluble bovine neck type I collagen for 30 min at 70°C.

Elastin-Congo red (Sigma-Aldrich) was used as an insoluble elastin substrate at a concentration of 10 mg/mL. Degradation was performed in the presence of 1 µM of recombinant CatK at 37°C overnight. The amount of degradation was quantified from the degradation products released into the supernatant, measured using a UV-vis spectrophotometer at 490 nm.

Molecular Docking of Identified Active Anti-Collagenase Inhibitors

The appropriate three-dimensional structures for the anti-collagenase compounds were generated using LigPrep and the OPLS3 force field, with ionization states generated at pH 5.5 to mimic the assay conditions. Geometric rotamers generated for each compound were limited to ten per ligand and were exported as SDF files prior to docking. The enzyme molecule used for docking was the inhibitor-free CatK structure we previously determined (PDBID: 5TUN). Preprocessing of the enzyme molecule was performed in Maestro, with heteroatoms and water molecules removed prior to energy minimization. The appropriate receptor grid for ectosteric site 1 was generated as previously described (233). For the C4-S binding site, the grid was generated using the position of C4-S from the C4-S-bound CatK structure (PDBID: 3C9E).
The prepared ligands were docked to the enzyme in extra precision (XP) mode with flexible ligand sampling. Post-dock minimization was performed and a maximum of twenty poses were generated for each compound. The final poses were visually examined in PyMOL (Version 1.8) and ranked by Glidescore.

Human Osteoclast Cultures and Bone Resorption Analysis

Osteoclasts were generated from mononuclear cells isolated from human bone marrow tissue (Lonza, Walkersville, MD). The bone marrow cells were centrifuged at 400 g (5 min) and the pellet was re-suspended in 10 mL α-MEM (α-Minimal Essential Media) and layered on 10 mL Ficoll-Paque media solution. Centrifugation at 500 g was performed for another 30 min and the white interface containing the monocytes was harvested and washed twice with α-MEM. Cells were cultured in α-MEM containing 10% FBS and 25 ng/mL M-CSF for 24 hours and then cultured in 25 ng/mL RANKL and 25 ng/mL M-CSF for 7 days. Differentiated osteoclasts (100,000 cells per slice) were seeded on each bone slice (5.5 mm diameter and 0.4 mm thickness) with and without inhibitors and incubated for 72 hours at 5% CO2 and 37°C with the DMSO concentration at 0.1%. The inhibitor concentration range tested varied between 200 nM and 3 µM.

To compare the effects of the compounds on cell survival, the metabolic activity of osteoclasts was determined using the CellTiter-Blue Viability Assay (Promega, Madison, WI, USA). Bone slices from each condition (inhibitor-treated and control groups) were fixed in 4% formaldehyde and subsequently stained for tartrate-resistant acid phosphatase (TRAcP) activity (Acid Phosphatase, Leukocyte (TRAP) Kit; Sigma-Aldrich). Aliquots taken from cell culture media were used to determine the CTx-1 concentration (MyBiosource ELISA kit, San Diego, CA). CTx-1 is a CatK-specific C-terminal cleavage product of triple helical type I collagen. The total number of osteoclasts per bone slice was determined after TRAcP staining.
Cells with ≥2 nuclei were considered as osteoclasts. After 72 hours, bone slices from each condition were incubated in filtered water for cell lysis and cells were removed using a cotton stick. The resorption cavities were stained with toluidine blue and observed using light microscopy. The number of resorption events and eroded bone surface area were determined as previously described (290). All light microscopic analyses were performed using a Nikon Eclipse LV100 microscope and a Nikon Eclipse Ci microscope.

4.3. Results

Fluorescence Polarization Screening

The KD2 library consists of 4,761 pharmacologically active small molecules with high chemical diversity. The FP assay was found to be robust, with a Z’ value range of 0.60-0.82 and an average of 0.71 for the 16 tested plates, suggesting minimal overlap between the positive and negative controls. Out of the 4,761 compounds tested at 10 µM HTS concentration, 59 compounds disrupted the formation of C4-S/CatK complexes with greater than 40% inhibition. All 59 compounds were analysed in a secondary screening assay using 5 µM Z-FR-MCA to exclude active site inhibition. A total of 26 compounds remained and 6 representative commercially available compounds were selected (aurintricarboxylic acid (ATC), ellipticine, sanguinarine chloride (SGC), suramin, reactive blue 2, and sepiapterin). Among the compounds that were active in the FP assay and inactive in the Z-FR-MCA assay, two main classes were recognized: negatively polycharged compounds (6 compounds) and polyaromatic compounds (16 compounds). A complete list of FP active compounds that had no inhibitory activity on the hydrolysis of Z-FR-MCA is shown in Table S4-1.

The Biomol Natural Products library consists of 310 pharmaceutically active natural products (317). These compounds were screened in the FP assay; 18 compounds were active in disrupting complex formation (>40% in FP) and four of these compounds were highly active in preventing complex formation (>80%).
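The Z’ values quoted above summarize how well the positive and negative controls on a plate are separated. As a minimal sketch, assuming the standard definition Z’ = 1 − 3(σp + σn)/|μp − μn|, the statistic can be computed from the control wells of one plate as follows. The control readings below are hypothetical millipolarization values chosen only for illustration; they are not data from this screen, and the function name is likewise an assumption.

```python
from statistics import mean, stdev


def z_prime(positive, negative):
    """Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg| for one plate."""
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    return 1.0 - 3.0 * (stdev(positive) + stdev(negative)) / separation


# Hypothetical FP control wells (mP units), NOT thesis data:
# intact CatK/C4-S* complex versus free C4-S*.
complex_controls = [182.0, 178.0, 185.0, 180.0, 176.0, 183.0]
free_controls = [62.0, 58.0, 65.0, 60.0, 57.0, 64.0]

print(f"Z' = {z_prime(complex_controls, free_controls):.2f}")
```

Values between 0.5 and 1 are conventionally taken to indicate an assay window wide enough for single-point screening, which is consistent with the range reported for the 16 plates.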
Six of these compounds displayed active site-directed inhibition and thus were dismissed from further analysis. In all, 12 compounds were further evaluated in follow-up assays. A complete list of FP active compounds that had no inhibitory activity on the hydrolysis of Z-FR-MCA is shown in Table S4-1.

Figure 4-2 Distribution of potential inhibitor hits from KD2 and Biomol libraries based on their inhibitory potencies. (A) Hits identified from the KD2 screen in the FP assay. A total of 62 compounds were found to disrupt the complex formation (>40% FP signal) and represent a total hit rate of 1.3%. 30 compounds were highly effective (>80% FP signal; hit rate: 0.6%) in disrupting the CatK/C4-S* complex formation. (B) From the Biomol library, a total of 12 compounds were effective in disrupting the CatK/C4-S* complex formation (>40% FP signal; hit rate: 3.9%) and four compounds were highly effective at complex disruption (>80% FP signal; hit rate: 1.3%).

Anti-Collagenase and Inhibitory Activity on Other Substrates

Commercially available active compounds identified in the FP assay were then tested for their ability to inhibit collagen degradation in a collagenase assay. Out of the 18 commercially available compounds from both libraries, eight compounds displayed collagenase inhibitory activities with IC50 values between 5 and 200 µM (Table 4-1). The most potent compounds included suramin, sclareol, and ATC with IC50 values below 10 µM. Ellipticine, abscisic acid (AA), epigallocatechin gallate (EGCG), and acetyl-strophanthidin were active as collagenase inhibitors between 10 and 100 µM, which translates into a 25 to 250-fold molar excess over CatK assay concentrations. SGC had the weakest IC50 value with 186 µM (Table 4-1).

Any as yet undetected active site inhibition of CatK by the anti-collagenase inhibitors was ruled out with two other substrates, the peptide-based Z-Phe-Arg-MCA (at 100 µM) and the macromolecular substrate gelatin.
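The molar excess figures quoted above follow directly from the enzyme concentrations given in the Materials and Methods (400 nM CatK in the collagenase assay, 5 nM in the Z-FR-MCA assay). The short sketch below reproduces the arithmetic; the function and constant names are illustrative assumptions and not part of any published protocol.

```python
def molar_excess(inhibitor_uM, enzyme_nM):
    """Inhibitor:enzyme molar ratio for concentrations given in µM and nM."""
    return inhibitor_uM * 1000.0 / enzyme_nM


CATK_COLLAGENASE_NM = 400.0  # CatK in the collagenase assay (Methods)
CATK_PEPTIDE_NM = 5.0        # CatK in the Z-FR-MCA assay (Methods)

# Collagenase IC50 values of 10, 100 and 186 µM relative to 400 nM CatK
for ic50_uM in (10.0, 100.0, 186.0):
    ratio = molar_excess(ic50_uM, CATK_COLLAGENASE_NM)
    print(f"{ic50_uM:g} µM vs 400 nM CatK: {ratio:g}-fold excess")

# Compounds tested at 100 µM against 5 nM CatK in the peptide assay
print(f"peptide assay: 1:{molar_excess(100.0, CATK_PEPTIDE_NM):g}")
```

Expressing potency as a molar excess over the enzyme makes it easier to compare the collagenase assay, which uses a high enzyme concentration, with the peptide assay, where the same compound concentration corresponds to a much larger excess.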
Neither of these substrates requires exosites; both rely exclusively on the active site activity of the enzyme. None of the 8 compounds displayed an inhibition of Z-FR-MCA hydrolysis of more than 16% at a molar CatK to inhibitor ratio of 1:20,000. This molar ratio is markedly higher than the 465-fold molar excess that corresponds to the IC50 value of the weakest collagenase inhibitor, SGC. None of these compounds displayed any observable inhibition of the gelatinase activity of CatK (Table 4-1).

Table 4-1 Active compounds identified from KD2 and BM Library

Chemical Name                      Collagenase IC50 (µM)   Other Substrate Inhibition (100 µM)
                                                           Z-FR-MCA       Gelatin
Aurintricarboxylic Acid (ATC)      9.0 ± 3.0               1.7 ± 0.3      0%
Ellipticine                        88 ± 28                 4.0 ± 1        0%
Sanguinarine Chloride (SGC)        186 ± 37                14 ± 2         0%
Epigallocatechin Gallate (EGCG)    75 ± 10                 12 ± 1         0%
Suramin                            5.0 ± 1.0               16 ± 4         0%
Acetyl-strophanthidin              14.3 ± 4.5              5.4 ± 1.2      0%
Abscisic Acid (AA)                 19.5 ± 2.5              4.9 ± 1.8      0%
Sclareol                           9.3 ± 3.8               3.7 ± 0.8      0%

Osteoclast Bone Resorption Assays Using the Most Potent Anti-Collagenase Compounds

We selected six compounds that showed potent to moderate anti-collagenase activity (Table 4-1) (suramin, ATC, sclareol, EGCG, AA, SGC) and tested them in an osteoclast-mediated bone degradation assay to evaluate their ability to inhibit bone resorption. Figure 4-3A-B shows the quantification of the osteoclast numbers and metabolic activity. None of the compounds tested revealed cytotoxic effects at 5 µM concentration. The analysis of the release of CTx (the C-terminal telopeptide of type I collagen, specifically generated by CatK during type I collagen hydrolysis (318)) revealed that only EGCG and ATC were effective in significantly reducing bone resorption by osteoclasts (Figure 4-3C). All compounds were then subjected to a more detailed bone resorption analysis.
Figure 4-3D shows toluidine-stained osteoclast-mediated resorption events on the bone surface in the absence or presence of inhibitors (each at 2 µM) using human osteoclasts. In the untreated cultures, there were numerous large trenches and smaller round pits, indicating extensive bone resorption. Only cultures treated with EGCG and ATC showed a significant reduction of trenches and revealed mostly small round demineralized pits, indicating an effective inhibition of CatK-mediated bone resorption. Small resorption pits represent CatK-independent demineralization events. This agrees well with the CTx analysis. Finally, we determined the IC50 values for both compounds for the inhibitory activity on the % eroded surface and % trench surface. The IC50 values for EGCG were 2.1 ± 0.3 µM and 2.1 ± 0.4 µM, respectively. For ATC, the IC50 values were 1.7 ± 0.3 µM and 1.8 ± 0.4 µM, respectively (Figure 4-4A-B).

Figure 4-3 Analysis of anti-collagenase inhibitors on the viability and activity of human osteoclasts
(A) Metabolic activity and (B) number of TRAP+ osteoclasts after treatment in the presence of the compounds compared with untreated cells show no significant differences in cell viability at 5 µM inhibitor concentrations. (C) Quantification of bone resorption using CTx levels under untreated and inhibitor-treated (2 µM) conditions shows a significant reduction for EGCG and ATC (p<0.05). The values were determined from three independent experiments where 5 bone slices in each condition were analyzed. (D) Representative images of osteoclast-generated resorption cavities in the presence or absence of the compounds (suramin, abscisic acid (AA), sclareol, epigallocatechin gallate (EGCG), aurintricarboxylic acid (ATC), sanguinarine chloride (SGC)). Large trenches were observed and represent collagen degradation events. Small round cavities indicate demineralized pit area with no or little collagen degradation.
(Scale bar = 40 µm) Mature human osteoclasts were cultured on bovine bone slices for 72 hours in the absence or presence of inhibitors (2 µM). Data represent mean ± SD. ‘ns’, not significant; * p< 0.05; ***p< 0.001.

Figure 4-4 IC50 determination of human osteoclast bone resorption parameters for EGCG and ATC
(A) For the compound EGCG, the IC50 values for the inhibitory effect on the % eroded surface and % trench surface were determined to be 2.1 ± 0.3 µM and 2.1 ± 0.4 µM, respectively. (B) For ATC, the IC50 values for the inhibitory effect on the % eroded surface and % trench surface were 1.7 ± 0.3 µM and 1.8 ± 0.4 µM, respectively. The IC50 values were determined from three independent experiments where 5 bone slices in each condition were analyzed.

Molecular Docking of Collagenase and Resorption Inhibitors

The binding of the most potent compounds in the collagenase and osteoclast-based assays was then investigated using a molecular docking approach to predict their interactions with the enzyme. Two potential binding sites previously implicated in complex formation were evaluated: ectosteric site 1 and the C4-S binding site. Ectosteric site 1 lies on the L-domain of CatK and consists of the loop ranging from residues Ser84-Pro100. The binding site has been shown to be a protein-protein interaction site required for oligomerization of collagenolytic CatK complexes (232). It has also been characterized as a druggable binding site with a hydrophobic core and potential hydrogen bonding residues surrounding the binding site (233). The formation of collagen-degrading CatK complexes also requires the binding of C4-S (232). The C4-S binding site lies on the opposite side of the active site of the CatK molecule (73). The C4-S binding site contains several electropositive residues which interact strongly with the negatively charged C4-S.
Arg8, Lys9, Asn190, and Lys191 are also surface-exposed residues which interact with the sulfate and sugar atoms on the C4-S molecule (319). Compounds binding at either ectosteric site 1 or the C4-S binding site on the enzyme were predicted to have an inhibitory effect on complex formation and thus to prevent collagen degradation. Molecular docking of the four most potent compounds from the in vitro and cell-based assays (EGCG, ATC, suramin, sclareol) was performed using the Glide (Maestro) suite with extra-precision (XP) mode docking to both ectosteric site 1 and the C4-S binding site at the physiologically relevant pH of 5.5. The overall best predicted binding poses for these compounds are shown in Figure 4-5. Table 4-2 summarizes the theoretical Kd values for the best poses of EGCG, ATC, suramin and sclareol for both ectosteric binding sites.

Sclareol was well accommodated in ectosteric site 1 and had a predicted Kd of 28.9 µM (Figure 4-5B). The best pose predicted hydrogen bond interactions between the aliphatic region of the compound and residues Ala86 and Asn89 (Figure S4-1A). The aromatic portion of the compound interacted with the hydrophobic interior (Tyr87) of the binding site. Hydrophobic interactions between the compound and residues Tyr89 and Val90 can also be seen in the LigPlot diagram. Binding at the C4-S binding site was predicted to be weaker, with a Kd of 580 µM.

ATC, EGCG and suramin were all predicted to bind effectively in the C4-S binding site (Figure 4-5) on the enzyme, with predicted Kd values correlating well with their determined IC50 values in the collagenase and osteoclast assays (Table 4-2). The best predicted pose for ATC in the C4-S binding site (Kd of 3.17 µM) showed extensive hydrogen bond interactions between the three carboxylic acid groups and the positively charged Arg8, Lys9, and Lys191 found in the C4-S binding site (Figure 4-5C and Figure S4-1C).
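The text does not state how the theoretical Kd values were derived from the docking scores. One common convention, shown here purely as an assumption, is to treat the docking score as an estimate of the binding free energy in kcal/mol and to convert it with ΔG = RT ln(Kd) at 25°C. The sketch below applies this relation in both directions; the temperature, the function names and the interpretation of the score are all illustrative choices.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal_per_mol, temperature_k=298.15):
    """Dissociation constant (mol/L) from a binding free energy, dG = RT ln(Kd)."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * temperature_k))


def dg_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) corresponding to a dissociation constant."""
    return R_KCAL * temperature_k * math.log(kd_molar)


# Predicted Kd values quoted in the text, converted back to free energies
for label, kd_uM in (("sclareol, ectosteric site 1", 28.9),
                     ("ATC, C4-S binding site", 3.17)):
    dg = dg_from_kd(kd_uM * 1e-6)
    print(f"{label}: Kd {kd_uM} µM corresponds to dG of {dg:.1f} kcal/mol")
```

Under this assumption a tenfold change in Kd corresponds to a difference of roughly 1.4 kcal/mol, which is within the typical error of docking scores, so the predicted affinities are best read as a ranking rather than as absolute values.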
Additional hydrophobic interactions were seen for Tyr145, Gly148, Asn190, and Leu195. The compound was not predicted to bind at ectosteric site 1 and did not have any predicted poses.

Due to its high molecular weight, suramin was expected to bind only at the C4-S binding site. Its calculated Kd was 1.9 µM for this site (Figure 4-5D). Hydrogen bonding interactions were observed between suramin and the Arg8, Lys9, and Asn190 residues, which also interact directly with C4-S (Figure S4-2A). Overlay of the best-predicted suramin binding pose and the CatK/C4-S structure (PDBID: 3C9E) showed similar geometry in the positioning of the sulfate and part of the aromatic sections of the compound (Figure S4-2B). Suramin did not return any binding poses for ectosteric site 1.

EGCG was also shown to bind effectively at the C4-S binding site with a predicted Kd of 54 µM (Figure 4-5E). Hydrogen bond interactions are predicted between the alcohol groups of the molecule and the residues surrounding the C4-S binding pocket, including Arg8, Ile171, and Gln172 (Figure S4-1B). Binding at ectosteric site 1 was predicted to be significantly weaker, with a Kd of 350 µM and rather poor hydrogen bond interactions between two of the compound’s alcohol groups and the enzyme residues Asp85 and Ala86.

Figure 4-5 Ectosteric sites required for CatK-mediated collagen degradation and the best binding poses predicted for the most potent anti-collagenase activity inhibitors. (A) Two ectosteric binding sites implicated in the collagenase activity of CatK. The active site Cys25 residue is coloured in yellow. Ectosteric site 1 represents the protein oligomerization site and is coloured in orange. The C4-S interaction site located on the R-domain is coloured in blue. The CatK molecule on the right is shown after a 180-degree rotation. (B-E) Top predicted poses for the most potent anti-collagenase compounds docking at either ectosteric site 1 or the C4-S binding site using Glide.
(B) Sclareol (blue) was predicted to interact effectively with ectosteric site 1 due to its hydrophobic character. (C) ATC, (D) suramin, and (E) EGCG (yellow) were predicted to interact effectively with the residues surrounding the C4-S binding site due to their negatively charged or hydrophilic functional groups. The predicted binding affinities are listed in Table 4-2.

Table 4-2 Predicted binding affinities for the most potent collagenase and osteoclast resorption inhibitors.

Predicted Binding Affinities (Kd) for Each Binding Site (µM)
Compound    Ectosteric Site 1        C4-S Binding Site
ATC         Not predicted to bind    3.2
EGCG        350                      54
Suramin     Not predicted to bind    1.9
Sclareol    28.9                     575

Differential Inhibition of the Collagenase and Elastase Activity of CatK

Based on the molecular docking results and our previous studies elucidating the structural requirements for the collagenase and elastase activity of CatK (73, 207, 232, 320), we investigated whether it would be possible to selectively inhibit either the collagenase or elastase activity with a selected ectosteric inhibitor. We used dihydrotanshinone I (DHT1) as a selective inhibitor binding at ectosteric site 1 (113, 114). DHT1 inhibits both the collagenase and elastase activities because ectosteric site 1 serves both as the protein-protein interaction site required for the oligomerization of CatK and as a secondary elastin binding site. As expected and previously reported, DHT1 blocks both activities at 10 µM concentration (Figure 4-6).

EGCG and ATC were predicted to bind preferentially or exclusively at the C4-S binding site (Table 4-2) on the R-domain of CatK. Binding in this region would thus prevent complex formation and therefore collagenase activity without disrupting elastase activity.
Consequently, we observed a strong inhibition of the collagenase activity by both compounds but only a mild inhibition of the elastase activity at 100 µM (EGCG) or 10 µM (ATC) inhibitor concentration (Figure 4-6). To our knowledge, this is the first time that differential inhibition of the activities of a protease has been achieved by targeting different ectosteric sites.

Figure 4-6 Enzymatic inhibition of CatK by DHT1, ATC, and EGCG on its collagenase and elastase activities. DHT1 (red) at 10 µM concentration was observed to completely inhibit both collagenase and elastase activities, whereas ATC (10 µM) (blue) and EGCG (100 µM) (green) only inhibited the collagenase activity of the enzyme without disrupting its elastase activity.

4.4. Discussion

In this study, we used a high-throughput screening approach to identify novel scaffolds for ectosteric inhibitors of CatK. This cysteine protease has been previously characterized as a promising target for the pharmacological treatment of osteoporosis and multiple drugs have been evaluated in clinical trials (98, 321). However, these compounds have all been active site-directed inhibitors, and have failed due to adverse side effects. The exact mechanisms of these side effects remain unknown but might be caused by the complete inhibition of the activity of CatK including its regulatory functions (112). Using a computational approach, we have recently identified inhibitors targeting ectosteric site 1 to block the collagenase activity of the enzyme without affecting its other proteolytic activities (233). In this study, we used a high-throughput FP screen to identify novel compounds which selectively block collagen degradation by CatK without a bias for a distinct ectosteric site.

Using two drug libraries with a total of 5,071 compounds, we identified 38 compounds (FP hit rate: 0.75%) that were active in disrupting the complex formation of CatK and C4-S without affecting the active site of the protease.
Eight of these compounds were able to prevent collagen degradation by CatK with IC50 values of up to 200 µM. The most potent compounds, suramin, ATC, and sclareol, had IC50 values below 10 µM (Table 4-1). When tested for potential active site-directed inhibition, none of the compounds displayed significant inhibition of the cleavage of Z-FR-MCA or gelatin even at a 20,000-fold molar excess of inhibitor over protease (Table 4-1). These results suggest that the observed inhibition is not due to blockade of the active site but is selective for the collagenase activity of the enzyme through blocking of an ectosteric site.

Subsequently, we investigated the potential binding mode of the four potent compounds with CatK using a computational approach. Sclareol showed the best binding at ectosteric site 1, whereas ATC, EGCG, and suramin were predicted to bind preferentially at the ectosteric C4-S binding site. Both ectosteric site 1, which is located at the protein-protein interaction site, and the glycosaminoglycan binding site are required for the oligomerization of CatK as a requisite for its collagenase activity (Figure 4-5A).

Sclareol displayed the highest calculated affinity to ectosteric site 1 (Figure 4-5B). Previous characterization of this binding site showed a hydrophobic core with hydrogen-bond donating regions around the pocket (233). Sclareol displays a hydrophobic aromatic ring system and functional groups that can engage in hydrogen-bonding interactions with residues surrounding the binding site. Inspection of the best predicted binding pose showed hydrophobic interactions with the residues surrounding the binding site including Pro88, Met97, and Val90 (Figure 4-5B and Figure S4-1A). EGCG showed a markedly lower affinity to ectosteric site 1 when compared to its predicted binding at the C4-S binding site. The best predicted pose also reflects a lack of interactions between the compound and the enzyme.
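The site preferences discussed above can also be read as free-energy differences by applying the standard relation ΔG = RT ln Kd to the predicted Kd values in Table 4-2. The following is a minimal Python sketch under stated assumptions, not the procedure used in this work: the temperature of 298.15 K, the 1 M standard state, and the helper names kd_to_delta_g and site_preference are introduced here purely for illustration.

```python
import math

R_KCAL = 1.987204e-3   # gas constant, kcal/(mol*K)
T_KELVIN = 298.15      # assumed temperature; not stated in the text

# Predicted Kd values in µM, copied from Table 4-2.
# None means that docking returned no pose for that site.
PREDICTED_KD_UM = {
    "ATC":      {"ectosteric site 1": None, "C4-S site": 3.2},
    "EGCG":     {"ectosteric site 1": 350.0, "C4-S site": 54.0},
    "Suramin":  {"ectosteric site 1": None, "C4-S site": 1.9},
    "Sclareol": {"ectosteric site 1": 28.9, "C4-S site": 575.0},
}


def kd_to_delta_g(kd_um, temperature=T_KELVIN):
    """Convert a Kd in µM to a binding free energy in kcal/mol (1 M standard state)."""
    return R_KCAL * temperature * math.log(kd_um * 1e-6)


def site_preference(kd_by_site, temperature=T_KELVIN):
    """Return (preferred_site, ddg).

    ddg is the free-energy gap in kcal/mol between the preferred site and
    the other site, or None when only one site returned a pose.
    """
    scored = {site: kd_to_delta_g(kd, temperature)
              for site, kd in kd_by_site.items() if kd is not None}
    ranked = sorted(scored, key=scored.get)  # most negative ΔG first
    best = ranked[0]
    ddg = scored[ranked[1]] - scored[best] if len(ranked) > 1 else None
    return best, ddg


if __name__ == "__main__":
    for compound, kds in PREDICTED_KD_UM.items():
        site, ddg = site_preference(kds)
        gap = ("only site with a pose" if ddg is None
               else f"favoured by {ddg:.2f} kcal/mol")
        print(f"{compound}: {site} ({gap})")
```

Under these assumptions, the roughly 20-fold difference in predicted Kd for sclareol corresponds to a gap of a little under 2 kcal/mol in favour of ectosteric site 1, which gives a sense of how modest the energetic separation between the two sites is in the docking results.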
Suramin was the most potent collagenase inhibitor identified through the library screen with an IC50 value of 5 µM. As expected, molecular docking of the compound to ectosteric site 1 did not yield any binding poses, likely due to the size of the compound and the relatively small size of the binding site. However, the compound docked highly efficiently to the C4-S binding site with a predicted Kd of 1.9 µM (Figure 4-5D). The enzyme contains numerous positively charged residues near the C4-S binding site which allow for interactions with the negatively charged C4-S. Similar types of interactions were observed for the predicted pose of suramin at this site (Figure S4-2A). One of the sulfates on the compound occupies a position in close proximity to the one occupied by a sulfate residue from C4-S in the glycosaminoglycan/CatK complex structure (PDBID: 3C9E) (Figure S4-2B). The compound also makes significant hydrogen bond interactions with the enzyme, in particular with residues surrounding the C4-S binding site, including residues Arg8, Lys9, Gln172, Asn190, and Tyr193. Binding at this site would likely disrupt interactions with C4-S and perturb the complex formation required for collagen degradation. Despite its potent anti-collagenase activity and its listing as an essential medicine by the WHO for the treatment of African sleeping sickness (322) and river blindness (323), suramin is unlikely to be a suitable drug candidate for the treatment of osteoporosis. The compound has a long list of moderate to severe side effects that will not allow for a long-term treatment regime as required for various musculoskeletal diseases including osteoporosis. Its lack of usefulness as an osteoporosis drug is also corroborated by its weak efficacy in the osteoclast resorption assay.

Of greater therapeutic interest are EGCG and ATC, as both compounds showed potency in the osteoclast resorption assay (Figure 4-4).
Their IC50 values for the prevention of osteoclastic bone resorption were approximately 2 µM, making them approximately 7-8 times less potent than tanshinone IIA sulfonate (240 nM), an ectosteric inhibitor that was highly potent as an anti-resorptive in ovariectomized mice (114). Both compounds were predicted to bind preferentially to the C4-S binding site.

EGCG is the principal catechin in green tea. It has been shown to mitigate bone loss in OVX rats (324, 325), and the proposed mechanism was that EGCG inhibits osteoclast differentiation via the RANKL pathway over a concentration range of 10-100 µM. Our current data indicate that at lower concentrations (<10 µM) it may directly block the collagenase activity of CatK, the main bone-resorbing protease. Interestingly, consumption of green tea correlates with a reduced risk of osteoporosis (326).

There is no information available in the literature supporting an anti-osteoporotic activity of ATC. However, the compound has been described as a powerful inhibitor of protein-nucleic acid interactions and thus inhibits enzymes such as topoisomerase (327) and ribonuclease (328). Based on our docking experiments, the binding mode of ATC may be comparable to those described for various nucleases. We also noted that of the six tested anti-collagenase inhibitors, only ATC and EGCG were effective in the osteoclast-mediated bone resorption assays. This difference in potencies could be due to the pharmacological properties of the other compounds, which may have prevented them from entering the osteoclasts or the resorption lacuna where collagenolytic CatK is active. The membrane permeabilities calculated from physicochemical properties for ATC and EGCG were 2.0 and 2.1 nm/s, respectively. Suramin, the most potent compound in the collagenase assay, had a predicted permeability of only 0.03 nm/s and, as a result, was ineffective in the cell-based assay.
Other compounds which were active in the collagenase assays but ineffective in the cell-based assays (sclareol, AA, SGC) had membrane permeabilities ranging from 0.5-1.2 nm/s, which may also have contributed to their low observed activity.

Our study also demonstrated that it is possible to exploit different ectosteric sites to inhibit distinct ECM-degrading activities of CatK. For example, EGCG and ATC are potent collagenase inhibitors but display only weak or no anti-elastase activity. So far, we have not identified an inhibitor capable of blocking only the elastase activity of CatK. Theoretically, this would be possible if a compound is found that specifically interacts with ectosteric site 2 in CatK, which is needed for the elastase activity but not for the collagenase activity (207). However, it should be noted that in the case of CatK, the dissection of its collagenase and elastase activities might be only of theoretical interest, as for diseases where the enzyme is a pharmaceutical target, the inhibition of both activities is likely to be more beneficial. In bone and joint diseases, the inhibition of collagen degradation is most relevant, and in cardiovascular diseases, the prevention of collagen and elastin degradation is preferred. As osteoporosis patients are normally of older age and thus more prone to cardiovascular problems, the inhibition of both activities would be favoured.

In conclusion, we have demonstrated that the FP library screening method can be successfully exploited for the identification of compounds that disrupt the complex formation of CatK and thus allow for the selective inhibition of the enzyme’s collagenase activity. At least two of the identified drug candidates were effective at inhibiting CatK-mediated bone resorption by osteoclasts without interfering with the viability of the cells.
This is considered highly beneficial for anti-resorptive drugs as it will not interfere with the cross talk between osteoclasts and osteoblasts, which is required for the maintenance of healthy bones (329). We believe that this method can be applied to other pharmaceutical targets that require complex formation for their activities and will be effective at identifying novel ectosteric inhibitors for these targets.

Supplementary Information (Chapter 4)

Figure S4-1 LigPlot diagrams showing the interactions in the best binding pose for sclareol, EGCG, and ATC. Predicted ligand interactions of (A) sclareol at ectosteric site 1, and of (B) EGCG and (C) ATC at the C4-S binding site.

Figure S4-2 LigPlot diagram of suramin binding at the C4-S binding site and overlay of the C4-S bound complex and the predicted suramin sodium binding position. (A) Predicted ligand interactions of suramin at the C4-S binding site. (B) Overlay of the suramin binding position and the C4-S-CatK complex structure (PDBID: 3C9E).

Table S4-1 Compounds identified from FP assays active in disruption of collagenase active complex

Compounds from KD2 Library Compounds from Biomol Library Reactive blue 2 Abscisic acid Suramin sodium salt Hypericin Ellipticine Silymarin Aurin tricarboxylic acid Ryanodine Nf 023 Heteratisine PPNDS tetrasodium Pratol Methylergonovine Norfluorocurarine Sp600125 Sclareol Sepiapterin Acetyl-strophantidin Lysergol Methylergonovine Sanguinarine chloride Biochanin A Remerine HCl Epigallocatechin Gallate  Clotrimazole Aloe-emodine Phenylephrine hydrochloride Methylergometrine maleate Nomifensine maleate Chicago sky blue 6B Chrysarobin Propidium iodide Homidium bromide Pararosaniline Pamoate Zaprinast Terbutaline hemisulfate Caffeic acid phenethyl ester (6R)-5,6,7,8-Tetrahydro-L-biopterin hydrochloride

Note: Compounds above the red line in the shaded box had greater than 80% inhibition in the FP assay. Compounds in bold were tested in in vitro assays.
Compounds shaded in orange are polyanionic compounds and compounds shaded in blue are polyaromatic compounds.

5. Allostery or Ectostery? Substrate-specific inhibition of cathepsin K

Abstract

Cathepsin K (CatK) is the predominant mammalian bone-degrading protease and thus an ideal target for the development of anti-osteoporotic drugs. Numerous active site-directed inhibitors have been tested in clinical trials, but all have failed due to side effects. Inhibitors targeting ectosteric sites can modulate enzyme activity on specific substrates and avoid indiscriminately blocking all catalytic activity. In this study, we have determined the structures of two ectosteric inhibitors in complex with CatK. For the first time, we have identified multiple unique ectosteric binding sites for each compound that play a role in their inhibitory activity. Multiple biochemical approaches have confirmed these ectosteric sites and their effect on substrate cleavage. These findings suggest that individual protease activities can be selectively targeted to avoid the side effects often associated with inhibiting the entire enzyme activity with active site-directed drugs.

5.1. Introduction

Proteases have been traditionally targeted with active site-directed inhibitors (271, 330). Several well characterized examples include the HIV-1 protease, leucocyte elastase, thrombin, the proteasome, several MMPs, and cathepsins (132, 133, 136, 331–333). However, many active site-directed inhibitors are plagued by off-target effects or side effects caused by the inhibition of regulatory activities of the same target in addition to the inhibition of the disease-relevant activity of the protease (122). Blocking the active site of a protease indiscriminately inhibits the hydrolysis of all its substrates.
Therefore, increasing attention has been given to the identification and characterization of exosite inhibitors that promise higher target specificity and selectivity for a distinct activity of the protease.

One class of exosite inhibitors are allosteric inhibitors, which bind remotely from the active site and affect its structure through multiple mechanisms (334). Therefore, an allosteric inhibitor, like an active site-directed inhibitor, is expected to affect the hydrolysis of all substrates of a protease and may not solve the problem of insufficient substrate specificity, which may result in side effects (271). Consequently, it would be desirable to find mechanisms to selectively block only the disease-relevant activity of an individual protease. Investigating substrate-specific exosites of cysteine cathepsins, we identified secondary substrate binding sites needed for the degradation of fibrous elastin by cathepsins V and K, and sites required for the oligomerization of collagenolytically active cathepsin K (CatK) in the presence of glycosaminoglycans (73, 207, 230–232). We proposed that blocking these sites will not alter the active sites of these cathepsins through allosteric mechanisms but will simply either prevent the binding of a specific substrate at sites outside the active site or prevent the formation of protease complexes needed for a specific activity such as the collagenase activity of CatK. To distinguish this type of exosite-dependent inhibition from allosteric inhibition, we introduced the term ectosteric inhibitors.

Here, we describe the binding and inhibitory effects of two different exosite inhibitors, 2-[(2-Carbamoylsulfanylacetyl)amino]benzoic acid (NSC-13345) and tanshinone IIA-sulfonate (T-06), on the activity of human CatK.
We demonstrate that neither inhibitor acts as an allosteric regulator; rather, both act as ectosteric inhibitors by either hindering the binding of certain substrates or interfering with the collagenolytically relevant oligomerization of CatK. We also identify multiple binding sites of these inhibitors on the surface of the protease, some of which might be irrelevant for the activities of CatK.

5.2. Materials and Methods

Molecular docking of NSC-13345 and Tanshinone IIA Sulfonate (T-06) to CatK

Global docking to identify binding sites of NSC-13345 and T-06 on human CatK was performed using the AutoDock Vina algorithm in the PyRx Virtual Screening docking program (Version 0.8) (335). The protein structure (PDBID: 1ATK) was first prepared by removing the ligand and water molecules and adding hydrogens in standard geometry in the program Maestro (Version 10.2, Schrödinger, New York). The NSC-13345 and T-06 ligands were prepared with the standard options in LigPrep using the OPLS3 force field at the possible ionization states at pH 5.5 ± 1. The resulting output was imported into PyRx for docking. A docking grid which completely encompasses the enzyme structure was used for the search. After the poses were generated, they were exported to mol2 format, visualized using PyMOL (Version 1.8), and scored according to the calculated binding energies. Potential binding sites were defined based on the scored affinities.

Crystallization of NSC-13345-CatK and T-06-CatK Complexes

For the crystallization of the NSC-13345-CatK complex, hCatK (10 mg/mL) was first crystallized in 8% polyethylene glycol (PEG) 4000 and 0.1 M sodium acetate using a sitting drop vapor diffusion method. The sitting drop of 2 µL contained a 1:1 mixture of protein and well solution. Suitable crystals formed after three days, and approximately 0.5 mg of powdered NSC-13345 was then added to the formed crystals and allowed to soak for two months.
The crystallization of the T-06-CatK complex was performed identically with a well solution containing 0.04 M potassium phosphate monobasic, 16% (w/v) PEG-8000, and 20% (v/v) glycerol. NSC-13345 was obtained from the National Cancer Institute through the Developmental Therapeutics Program (Rockville, MD) and T-06 from Chemfaces (China). Glycerol (30%) was added to the respective wells before the crystals were flash-frozen in liquid nitrogen prior to data collection on beamline 12-2 at the Stanford Synchrotron Radiation Lightsource facility (Menlo Park, California, USA). All crystals were grown at room temperature.

Structural Determination of the Exosite Inhibitor/CatK Complexes

Diffraction data were collected using a Dectris Pilatus 6M detector at 100 K with an X-ray wavelength of 0.98 Å. All data sets were processed with the program iMosflm version 7.2.1 (259) and the intensities scaled with SCALA to a resolution of 1.41 or 1.85 Å for the NSC-13345 and T-06 structures, respectively (259). Phasing was performed by molecular replacement using a human wild-type structure (PDBID: 4X6H) in the program PHASER (261). The restraints for the complexed NSC-13345 (1XF) and T-06 (J0V) were generated using ELBOW and fitted into the model during refinement (285). Resolution cutoffs were selected based on the merging statistics listed in Table 5-1. All modeled structures were refined by cycles of automated refinement in PHENIX (263) and manual adjustments in COOT. The quality of the final model was evaluated using SFCHECK and PROCHECK in the CCP4 program suite (260, 264). No non-glycine residues were found in the disallowed or unfavored regions of the Ramachandran plot (336). Further validation was performed using the wwPDB validation server. All the structures herein were illustrated using PyMOL software (PyMOL Molecular Graphics System, Version 1.8, Schrödinger, LLC).
Generation of Putative NSC-13345 Allosteric Site Mutant of CatK

The cDNA for hProCatK (OriGene, Burlington, Ontario, Canada) was used to generate PCR inserts with the Phusion Green High-Fidelity DNA polymerase (Thermo-Fisher Scientific™, Waltham, Massachusetts, USA) using the non-mutagenic primers. The PCR program used was: denaturation: 95°C, annealing: 61°C, extension: 72°C; 30 cycles. Inserts were digested with EcoRI and NotI restriction enzymes and cloned into the pPic9k expression vector (Life Technologies, Burlington, Ontario, Canada). The wild-type hProCatK vector was used to generate two point mutations at the putative allosteric site described in (252) by hybridization of the mutagenic primers (K119D Forward 5’-GGGGAATGAGGACGCCCTGAAGGACGCAGTGGCC-3’; K119D Reverse 5’-CCTCTTCAGGGCGTCCTCATTCCCCTCGGGGATCTC-3’; K176D Forward 5’-GAAGGGAAACGACCACTGGATAATTAAAAACAGCTGG-3’; K176D Reverse 5’-GTTTTTAATTATCCAGTGGTCGTTTCCCTTCTGGATTCC-3’) and amplification of the full plasmid using the following PCR conditions: denaturation: 95°C, annealing: 58-63°C, extension: 72°C; 30 cycles. The pPic9k vector containing hProCatK was used for protein expression as previously described [14]. All plasmids were then transformed into DH5α E. coli cells by heat shock and transformants were screened on Luria-Bertani (LB) plates containing 100 µg/mL ampicillin. DNA-sequenced plasmids (1 µg) were linearized with SacI before transformation into competent GS115 P. pastoris cells by electroporation (V = 2 kV, t = 5 ms). Transformants were screened, expressed, and pepsin activated as previously described for wild-type CatK [14, 15]. The activity of the processed enzyme was monitored using Z-FR-MCA, described in greater detail below. Maximum activity was usually reached after 5 days of induction, at which point cells were harvested and spun down at 4,500 x g for 10 minutes, and supernatants were concentrated using an Amicon 10 kDa Ultrafiltration Membrane (EMD Millipore, Billerica, Massachusetts, USA).
The enzyme was purified using previously described protocols for wild-type and mutant CatK. Fractions containing the active enzyme were concentrated to 10 mg/mL using an Amicon 10 kDa Ultra Concentrator (EMD Millipore) and the purified enzyme stock was stored at -80°C until use.

Z-FR-MCA, Abz-HPGGPQ-EDDnp, and Abz-KLR-XXX-EDDnp Substrate Cleavage Assays

Steady-state kinetics were performed using fluorogenic substrates as previously described (50). The enzymatic activity of CatK was followed by measuring the rate of release of the fluorogenic groups: MCA at an excitation wavelength of 380 nm and an emission wavelength of 450 nm, and EDDnp at an excitation wavelength of 320 nm and an emission wavelength of 420 nm, using a Molecular Devices SpectraMax Gemini spectromicrofluorometer. In assays involving inhibition by NSC-13345, the inhibitor was added prior to the measurement of enzyme activity. The kcat and Km values were determined using nonlinear regression analysis (Michaelis-Menten kinetics) in the software GraphPad Prism (Version 5.0 for Windows, GraphPad Software, La Jolla, California, USA). The assays were performed at 25°C using fixed enzyme concentrations (1-5 nM) and variable substrate concentrations (1-50 μM) in 100 mM sodium acetate buffer, pH 5.5, containing 2.5 mM DTT and 2.5 mM EDTA. For the Ki determination of NSC-13345 and T-06 in the presence of the substrate Abz-HPGGPQ-EDDnp, the substrate concentration was varied between 5 and 25 μM and the inhibitor concentration between 0 and 400 μM. The Michaelis-Menten and Dixon plot parameters were calculated and analyzed using the enzyme kinetics toolbox in SigmaPlot.

Multiplex Substrate Profiling by Mass Spectrometry

CatK (5 nM) was preincubated with 0.1% DMSO, 10 µM T-06, 50 µM NSC-13345, or 1 µM E64 and then added to a mixture of 228 synthetic tetradecapeptides (0.5 µM each) in quadruplicate in 20 mM sodium acetate, pH 5.5, 2.5 mM EDTA, 2.5 mM DTT.
A 20 µL aliquot of the reaction mixture was removed after 60 min of incubation, the enzyme activity was quenched by adding 8 M guanidine hydrochloride (GuHCl), and the sample was immediately flash-frozen at -80 °C. A control reaction consisted of CatK treated with 8 M GuHCl prior to the addition of the peptide library. All samples were subsequently thawed, acidified to pH < 3.0 with 1% trifluoroacetic acid and desalted using homemade C18 spin-tips (337). The peptide mixture (0.22 µg), dissolved in 0.1% trifluoroacetic acid, was injected into a Q Exactive Mass Spectrometer. Peak integration and data analysis were performed using Peaks software (Bioinformatics Solutions Inc.). Label-free quantification results were normalized using LOWESS and filtered with a peptide significance of 0.3. Imputation was performed when 3 or more replicates were reported as missing or zero for a peptide. Missing and zero values were imputed with normally distributed random numbers fitted to the smallest 5% of the data. IceLogo software (338) was used for visualization of amino acid frequency surrounding the cleavage sites using the sequences of all cleavages with a π-value >2.6138 (significance level of α = 0.005) (339).

PRM of Inhibition of Selected Peptides

Five peptides (2.5 µM) whose cleavage by cathepsin K was found to be inhibited by NSC-13345 were selected and digested with cathepsin K preincubated with 0.1% DMSO or 50 µM NSC-13345. Reactions were quenched after 0, 5, 15, 30, 60 and 120 min and desalted as described earlier. Samples were analysed by mass spectrometry in parallel reaction monitoring (PRM) mode. Peptides were quantified using Skyline (340).

Liquid Chromatography and Mass Spectrometry

A Q Exactive Mass Spectrometer (Thermo) was equipped with an Ultimate 3000 HPLC.
In the MSP experiment, peptides were separated by reverse-phase chromatography on a C18 column (1.7 µm bead size, 75 µm x 25 cm, heated to 65 °C) at a flow rate of 300 nL/min using a 55-min linear gradient from 5% B to 30% B, with solvent A: 0.1% formic acid in water and solvent B: 0.1% formic acid in acetonitrile. Survey scans were recorded over a 250–1500 m/z range at 70,000 resolution at 200 m/z (AGC target 3×10^6, maximum IT 100 ms). MS/MS was performed in data-dependent acquisition mode with HCD fragmentation (NCE 28) on the 12 most intense precursor ions at 17,500 resolution at 200 m/z (AGC target 1×10^5, maximum IT 50 ms, dynamic exclusion 20 s). For the PRM assay, peptides were separated by reverse-phase chromatography on a C18 column (1.7 µm bead size, 75 µm x 25 cm, heated to 65 °C) at a flow rate of 300 nL/min using a 30-min linear gradient from 5% B to 40% B. The Q Exactive was operated in the PRM mode (positive polarity, r = 17,500 at 200 m/z, AGC target 2×10^5, maximum IT 120 ms, MSX count 1, isolation window 2.0 m/z, NCE 30), with acquisition of targeted peptides.

Collagenase Degradation Assay

Soluble bovine type I collagen (0.6 mg/mL) was incubated with 400 nM recombinant human CatK, in the presence or absence of 200 nM chondroitin 4-sulfate, in 100 mM sodium acetate buffer, pH 5.5, 2.5 mM DTT and 2.5 mM EDTA. Recombinant human CatK was expressed in P. pastoris and purified as previously described (258). NSC-13345 was prepared as a 4 mM stock solution in 100 mM sodium acetate buffer, pH 5.5. T-06 was prepared as a 20 mM stock in 100% DMSO. To minimize the solvent effect on the activity of CatK, all reactions were kept below 1% DMSO, where no solvent effect was observed. In assays involving the inhibitors, various concentrations of NSC-13345 (50 to 2000 μM) or T-06 (0.1 to 10 µM) were added prior to incubation at 28°C for 4 hours. After incubation, 5 μM of E-64 was added to stop the reaction.
Samples were analysed using 10% SDS-PAGE gels and stained with Coomassie. The resulting bands of α-collagen chains were quantified using ImageJ (Version 1.5) and the IC50 graphs were plotted using GraphPad Prism (289).

Analysis of Complex Formation with Fluorescence Polarization Assay

Complex formation with the fluorescence polarization assay was performed as previously described using 20 nM fluorescently labelled C4-S and 40 nM CatK (254). The FP signal for the CatK/C4-S complexes in the presence or absence of NSC-13345 or T-06 was analyzed using 96-well plates (Corning) with an assay volume of 100 µL in a FLUOstar Optima fluorescence polarimeter (BMG LABTECH, Germany).

Azocasein Degradation Assay

Azocasein was used to address the effect of the inhibitors on the CatK-mediated hydrolysis of a non-specific substrate. CatK (200 nM) was incubated with 2 mg/mL azocasein (Sigma-Aldrich) in the presence of NSC-13345 (0-500 μM) or T-06 (0-100 µM). After incubation at 37°C for 30 minutes, reactions were stopped by the addition of 5% TCA. Residual azocasein was removed by centrifugation at 13,000 x g for 10 minutes. NaOH (0.5 M) was added to the supernatant to neutralize the reactions. The amount of azo-dye released by CatK was quantified by measuring the absorbance of the reactions at 440 nm using a Beckman DU-530 spectrophotometer. Samples were blanked with unprocessed substrate and inhibitor, and the experimental values were then analyzed using SigmaPlot or GraphPad Prism.

5.3. Results

Global Docking of NSC-13345 and T-06 to CatK and Identification of their Binding Sites

We first sought to identify potential theoretical binding sites for NSC-13345 on CatK through global molecular docking of the entire enzyme using AutoDock and to compare them with previously described exosites (252).
The resulting poses were sorted according to their calculated binding affinities, yielding a total of eight theoretical sites as shown in Figure 5-1. NSC-13345 had overall weak binding affinities with theoretical Kd values ranging from 370 µM to 5.7 mM (Table S5-1). The previously described exosite determined by crystallography of an incompletely processed CatK was partially confirmed in this study. The new site is shifted slightly away from the N-terminus and had a calculated Kd of 5.7 mM. Four of the binding sites identified for NSC-13345 through molecular docking overlapped with putative allosteric sites recently proposed through statistical coupling analysis (252).

Next, we determined the potential binding sites of tanshinone IIA sulfonate (T-06) using the same approach. We have previously identified T-06 as an ectosteric inhibitor of CatK that interferes with the oligomerization of the protease by binding in a putative ectosteric site and that demonstrated potent antiresorptive activity in osteoclast resorption assays and in ovariectomized mice (114). The resulting top 400 poses were sorted according to their calculated binding affinities and a total of eight binding sites were determined (Figure 5-2). Four binding sites with high affinity (Kd between 0.1-5 μM) and four binding sites with lower affinity (Kd between 5-50 μM) were identified (Figure 5-2). Binding sites 6 and 7, which displayed the lowest calculated binding affinities among the 8 sites, corresponded to ectosteric sites previously implicated in protein oligomerization either by protein-glycosaminoglycan or protein-protein interactions needed for the collagenase activity of the enzyme (73, 232) (Table S5-1).

Figure 5-1 Global docking of NSC-13345 with CatK identified eight potential binding sites. (A) Chemical structure of NSC-13345. (B) Eight potential binding sites for NSC-13345 with a range of theoretical Kd values of 370 µM to 5.7 mM.
The binding sites were sorted into two groups based on their calculated affinities. The binding sites coloured in blue represent high affinity sites (Kd values between 370 and 500 µM) and binding sites coloured in green represent low affinity sites (Kd values between 500 µM and 5 mM). The highest affinity pose of NSC-13345 for each respective binding site is depicted in the stick model. Binding site 2 represents the previously identified allosteric site for the compound and had a Kd value of 5.7 mM.

Figure 5-2 Global docking of T-06 with CatK identified eight potential binding sites. (A) Chemical structure of T-06. (B) Eight potential binding sites for T-06 with a range of theoretical Kd values of 1.1 µM to 43 mM. The binding sites were sorted into two groups based on their calculated affinities. The binding sites coloured in blue represent high affinity sites (Kd values between 1.1 µM and 5 µM) and binding sites coloured in green represent low affinity sites (Kd values above 5 µM). The highest affinity pose of T-06 for each respective binding site is depicted in the stick model. Binding site 7 represents the previously postulated binding site that prevents the oligomerization of CatK (73).

Structural Determination of NSC-13345 and T-06 CatK Complexes

To verify the predicted binding sites for both inhibitors, we crystallized fully processed and active CatK with either NSC-13345 or T-06. The refinement statistics are shown in Table 5-1 and both compounds crystallized with one copy of the molecule in each asymmetric unit. For NSC-13345, we identified three binding locations (PDB ID: 6ASH). Two of these binding sites corresponded to binding sites 1 and 6 identified through global docking (Figure 5-3). Simulated annealing omit maps at 1.5 σ levels showed strong electron density covering each of the three ligands. Examining the binding sites in more detail revealed extensive hydrogen bonding and hydrophobic interactions with the protein (Figure 5-3B-D).
The overall secondary structure of the enzyme was unperturbed by the binding of the ligands and the overall rmsd difference between the NSC-13345-CatK complex and the wild-type enzyme (PDB ID: 5TUN) was minor at 0.193 Å. Slight differences can be seen in the side chain conformations at each of the ligand binding sites to accommodate the binding of the ligand. The three binding sites identified in the crystal structure were spread over the surface of the enzyme (Figure 5-3A). The first binding site was located near theoretical site 1 identified by global docking, in an open area about 10 Å from the catalytic active site residue (Cys25) as part of the S3’ subsite region (Figure 5-3B). NSC-13345 forms a hydrogen bond with the O1 atom on the side chain of Gln143 and coordinates through two H2O molecules with Gln19, Cys22, and Trp184. Additional hydrophobic interactions between the phenyl ring and the Trp188 and Asn18 residues can also be observed.

Binding site 2 is located on the surface of the L-domain near the interface of two loops containing residues 55-59 and 74-79 (Figure 5-3C). In this position, NSC-13345 forms two hydrogen-bond interactions with Ser58. In addition, the inhibitor forms three H2O-coordinated interactions with residues Glu59, Tyr74, and Asn78. Multiple hydrophobic interactions between the phenyl ring and the previously mentioned residues can also be found. Examination of the interactions with the neighbouring protein molecule within the crystal contacts yielded no significant interactions and only one H2O-coordinated bond with Arg127 (Figure S5-1A). This binding site was not among those identified through global docking but was previously identified as a potential allosteric site (252).

Figure 5-3 Binding of NSC-13345 with wild-type CatK. (A) Overall structure of hCatK (green ribbon, transparent grey surface) showing the three binding sites of NSC-13345.
The binding sites identified in molecular docking are coloured by their respective affinities found in Figure 5-1. (B-D) Fit of NSC-13345 drawn with the difference omit map depicted at 1.5 σ into each respective binding site (B: binding site 1; C: binding site 2; D: binding site 3) with the neighbouring side chains of hCatK in stick representation (orange). Hydrogen bond interactions between the inhibitor and side chain residues or water molecules are shown as yellow sticks.

Binding site 3 is located at the bottom surface of the R-domain, directly adjacent to the α-helix forming the bottom of the domain (Figure 5-3D). The compound forms two hydrogen-bonding interactions with the side chain of Arg123 as well as strong hydrophobic interactions with residues Pro114, Ala124, Ile113, and Phe212. The binding site showed significantly fewer contacts and greater interaction distances with the neighbouring protein molecule through crystal contacts, with only one H2O-coordinated interaction with Asn78 (Figure S5-2B). Binding site 3 lies in the vicinity of the previously identified putative allosteric site; the two are approximately 10 Å apart (Figure 5-4A). The current binding site 3 would have been partially disrupted by the remaining propeptide of the incompletely processed CatK molecule used in the study by Novinec et al. (252). Interestingly, binding site 3 also overlaps closely with the C4-S binding site that is required for collagen degradation. Several residues that interact with C4-S, including Lys119, Arg123, and Arg127, are displaced upon the binding of the inhibitor (Figure 5-4B).

Figure 5-4 Structural comparison of binding site 3. (A) Overlay of NSC-13345/active CatK complex (green) with the previously determined non-fully processed enzyme NSC-13345 complex (magenta) shows close proximity and a steric clash with the propeptide region.
(B) Overlay of binding site 3 (green) with C4-S bound CatK (orange) shows local conformational changes in the residues interacting with C4-S.

The crystal structure of CatK-T-06 complexes revealed four distinct binding sites for the tanshinone (Figure 5-5) (PDB ID: 6E9S). Three of these binding sites corresponded to binding sites previously identified through global docking (Global docking binding sites 2, 6, and 8 in Figure 5-2). Simulated annealing omit maps at 1.8 σ levels show strong electron density covering each of the binding sites. Binding site 4 had slightly weaker densities compared to the other three binding sites but the refined occupancies and B-factors were within the expected range (Table S5-2). Examination of each of the binding sites showed strong hydrogen bonding and hydrophobic interactions with the protein. The overall secondary structure of the enzyme was unchanged from the wild-type uninhibited enzyme (PDB ID: 5TUN) with an average rmsd value of 0.174 Å and thus does not support an allosteric mechanism for T-06.

The four binding sites found in the crystal structure were distributed on the surface of the enzyme. Binding site 1 is located above the active site cleft (Figure 5-5A-C) at the interface between the two enzyme domains and was not identified through molecular docking. The sulfate group on T-06 forms hydrogen bond interactions with the side chain N atom on Asn187 (Figure 5-5C). A second hydrogen bond interaction can also be seen between the O atom found in the first ring of the compound and the side chain N atom on Asn18 (Figure 5-5C). Hydrophobic interactions were observed between the aliphatic region of the compound and residues Gly20, Tyr89, Val90, and Trp184.

Binding sites 2 and 3 (Figure 5-5A-E) are in close proximity and located on the L-domain remote to the active site and lie at the base of the α-helix that leads to the active site region.
In the crystal structure, binding sites 2 and 3 are also positioned in close proximity to the binding sites identified by molecular docking. At binding site 2, T-06 hydrogen bonds with the side chain N atoms found on Gln76 and Lys77 and forms hydrophobic interactions with Arg79 and Arg108 (Figure 5-5D). At binding site 3, the compound forms hydrogen bond interactions with the side chain N atoms found in Arg79, Lys103, and Lys106. Strong hydrophobic interactions were seen between T-06 and residues Lys77, Asn78, and Arg79. Binding site 4 lies on the R-domain, on the side opposite the active site, on a three-strand β-sheet (Figure 5-5B). The binding site lies in close proximity to the C4-S binding site previously implicated in collagenase activity and to binding site 6 identified by molecular docking. Hydrogen bonding to the enzyme is coordinated between the sulfate group and the side chain N atoms found on Arg8 and Asn190. Additional hydrophobic interactions can also be observed between the aliphatic region of the compound and residues Lys191 and Tyr193 (Figure 5-5F).

Figure 5-5 Binding of T-06 with wild-type CatK. (A and B) Overall structure of T-06-CatK complex (yellow ribbon, transparent grey surface) showing the four distinct binding sites of T-06. B is a 180° rotation of A. The binding sites identified in molecular docking are coloured by their respective affinities found in Figure 5-1. (C-F) Fit of T-06 drawn with the difference omit map depicted at 1.8 σ into each respective binding site (C: binding site 1; D: binding site 2; E: binding site 3; F: binding site 4) with the neighbouring side chains of hCatK in stick representation (yellow). Hydrogen bond interactions between the inhibitor and side chain residues or water molecules are shown as yellow sticks.
Comparison of the residue-by-residue backbone rmsd differences between the NSC-13345-bound, T-06-bound, and the uninhibited CatK also confirmed that the overall structure and active site are mostly unchanged by the binding of the compounds (Figure 5-6), which indicates that allosteric regulation is not taking place.

Figure 5-6 Rmsd difference maps for NSC-13345 and T-06 bound structures compared to uninhibited structure. Rmsd difference maps for (A) NSC-13345 and (B) T-06 bound hCatK structures compared to the unbound wild-type hCatK (PDB ID: 5TUN). Backbone rmsd differences are colored by residue using a scale of 0 to 1 Å (blue to red). Ligand binding in both structures does not show significant perturbations to the active site region.

Table 5-1 Crystallographic Data Collection and Refinement Statistics for CatK-NSC-13345 and CatK-T-06 complexes

                                        WT Human CatK-NSC-13345 Complex               WT Human CatK-T-06 Complex
Data Collection
Space group                             P 1 21 1                                      P 21 21 21
Unit cell dimensions (Å)                a=39.48, b=60.40, c=44.15; α=γ=90°, β=95.12°  a=31.67, b=67.68, c=92.25; α=β=γ=90°
Number of total reflections             191317                                        109924
Number of unique reflections            37046                                         17518
Mean I/σ                                24.2 (14.9)                                   9.90 (2.90)
Redundancy                              5.2                                           6.3
Merging R factor (%)                    0.042 (0.071)                                 10.3 (54.4)
CC (1/2)                                0.999 (0.993)                                 0.996 (0.873)
Maximum resolution (Å)                  1.42                                          1.85
Refinement
Resolution range (Å)                    39.32 – 1.42                                  46.12 – 1.85
Completeness (%)                        95.5 (97.5)                                   99.1 (99.2)
Number of protein atoms                 1648                                          1648
Number of ligand atoms                  51                                            104
Number of water atoms                   283                                           135
B factors (Å²) (all/protein/solvent)    12.7/10.0/28.0                                24.0/22.6/32.6
R factor (%)                            11.58                                         15.80
R free (%)                              14.63                                         18.25
Rms deviations
Bond lengths (Å)                        0.015                                         0.015
Bond angles (°)                         1.43                                          1.43

Values in parentheses refer to the respective highest resolution shell (1.42-1.51 Å for the CatK/NSC-13345 complex; 1.85-1.95 Å for the CatK/T-06 complex). 5% of reflections were used in the Rfree test set for both data sets.
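The residue-by-residue backbone rmsd comparison summarized in Figure 5-6 can be illustrated with a minimal Python sketch. It assumes that the two structures have already been superimposed and that backbone atoms (N, Cα, C, O) are matched residue by residue; the function names and the toy coordinates are hypothetical and are not taken from the software actually used in this work.

```python
import math

def per_residue_backbone_rmsd(ref, mob):
    """Return {residue number: backbone rmsd in Å} for residues present in both
    structures. `ref` and `mob` map a residue number to a list of (x, y, z)
    backbone coordinates given in the same atom order (an assumption here)."""
    result = {}
    for resid in sorted(set(ref) & set(mob)):
        a, b = ref[resid], mob[resid]
        if len(a) != len(b):
            raise ValueError(f"atom count mismatch at residue {resid}")
        sq = sum((p[i] - q[i]) ** 2 for p, q in zip(a, b) for i in range(3))
        result[resid] = math.sqrt(sq / len(a))
    return result

def overall_backbone_rmsd(ref, mob):
    """Rmsd over all matched backbone atoms of all shared residues."""
    sq, n = 0.0, 0
    for resid in set(ref) & set(mob):
        for p, q in zip(ref[resid], mob[resid]):
            sq += sum((p[i] - q[i]) ** 2 for i in range(3))
            n += 1
    if n == 0:
        raise ValueError("no shared residues")
    return math.sqrt(sq / n)

# Toy example (invented coordinates): residue 25 is shifted by 0.3 Å along x,
# residue 26 is identical in both structures.
unbound = {
    25: [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0), (1.2, 2.3, 0.0)],
    26: [(3.3, 1.6, 0.0), (4.0, 2.9, 0.0), (5.5, 2.7, 0.0), (6.1, 1.6, 0.0)],
}
bound = {
    25: [(x + 0.3, y, z) for x, y, z in unbound[25]],
    26: list(unbound[26]),
}
per_residue = per_residue_backbone_rmsd(unbound, bound)
```

Residues whose per-residue value stays near zero, as reported here for the active site region, would map to the blue end of the 0 to 1 Å scale used in Figure 5-6.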
Comparison of the Inhibitory Activity of NSC-13345 and T-06 on CatK

The effect of the binding of NSC-13345 and T-06 on the exosites was evaluated using the Michaelis-Menten parameters for two fluorogenic peptide substrates, Z-FR-MCA and Abz-HPGGPQ-EDDnp, in the absence or presence of NSC-13345 or T-06 (Table 5-2 and Figure S5-2). We used two inhibitor concentrations: a low concentration of 100 nM (5-fold molar excess over CatK) and a high concentration of 50 µM for NSC-13345 (25,000-fold molar excess over CatK) or 25 µM for T-06 (12,500-fold molar excess over CatK). Z-FR-MCA interacts with the Schechter-Berger subsites between S3 and S1’ whereas Abz-HPGGPQ-EDDnp covers S4-S4’ with the cleavage between the two glycine residues (13, 341). For Z-FR-MCA, the kinetic parameters kcat, Km, and kcat/Km did not change significantly in the presence of either inhibitor at low or high concentrations (Table 5-2).

Table 5-2 Kinetic constants of CatK-mediated hydrolysis of Z-FR-MCA and Abz-HPGGPQ-EDDnp in the presence or absence of NSC-13345 or T-06

                        CatK                CatK + NSC-13345    CatK + NSC-13345    CatK + T-06         CatK + T-06
                                            (100 nM)            (50 µM)             (100 nM)            (25 µM)
Z-FR-MCA
kcat (s⁻¹)              17.0 ± 1.8          16.5 ± 1.5          16.0 ± 1.5          16.4 ± 0.8          15.7 ± 0.7
Km (µM)                 8.1 ± 1.2           6.5 ± 1.3           6.9 ± 1.4           7.4 ± 1.3           7.9 ± 1.0
kcat/Km (10⁴ s⁻¹ M⁻¹)   210 ± 23            253 ± 50            233 ± 50            221 ± 18            199 ± 13
Abz-HPGGPQ-EDDnp
kcat (s⁻¹)              1.80 ± 0.8          1.79 ± 0.2          1.13 ± 0.30         1.88 ± 0.2          1.70 ± 0.2
Km (µM)                 11.6 ± 2.1          9.40 ± 2.2          13.7 ± 3.8          9.4 ± 2.4           9.6 ± 1.7
kcat/Km (10⁴ s⁻¹ M⁻¹)   16 ± 3              16.5 ± 1.5          8.2 ± 3.2           20 ± 3              18 ± 2

A similar picture was seen with Abz-HPGGPQ-EDDnp, with the exception that at high NSC-13345 concentrations an approximately 50-60% reduction of the kcat/Km value was observed due to a reduction in the kcat and an increase in the Km values (Table 5-3). These findings correlate well with previous results showing that NSC-13345 inhibits the degradation of this substrate but not that of Z-FR-MCA (252).
T-06 had no significant inhibitory activity on either substrate at high and low concentrations. This is in line with our previous finding that T-06 does not inhibit the degradation of peptide substrates and non-collagenous proteins (114).

As we have observed multiple binding sites of NSC-13345 in our CatK-inhibitor structure, with one of these sites in the vicinity of the putative allosteric site reported by Novinec et al. (252), we mutated two positively charged residues which have been described either to interact directly with the benzoic acid moiety of NSC-13345 or to be in the vicinity of the compound (Figure 5-4A). One of the key interactions occurring between NSC-13345 and the putative allosteric site was the electrostatic interaction between the ε-amino group of residue Lys119 and the carboxyl group of NSC-13345. Thus, the replacement of this residue with an Asp residue was hypothesized to abolish the binding of the compound. In addition, the ε-amino group of Lys176 was also proposed as an alternate interacting site in a slightly altered conformation of the bound inhibitor. In order to prevent this potential interaction, Lys176 was mutated to Glu. In addition to reversing the charge, Asp and Glu were chosen as replacement residues due to their presence in the structurally highly related cathepsins S and L, respectively.

The mutations had a significant effect on the kcat and thus on the kcat/Km parameters for both substrates when compared with the constants determined for the wild-type enzyme (approximately 4-fold decrease). As expected, there were no differences in the kinetic parameters for the hydrolysis of the Z-FR-MCA substrate between the wild-type and variant enzymes in the presence or absence of the inhibitor at its two concentrations (Table 5-2 and Table 5-3). Surprisingly, there was also no difference for the Abz-HPGGPQ-EDDnp substrate, which would have been expected if the site is truly an allosteric site.
All kinetic parameters were essentially unchanged in the presence or absence of NSC-13345. As the removal of the positively charged residues at the putative allosteric binding site had no effect on the inhibitory activity of NSC-13345, it can be concluded that this site does not regulate enzyme activity.

Table 5-3 Kinetic constants of CatK and allosteric site mutant-mediated hydrolysis of Z-FR-MCA and Abz-HPGGPQ-EDDnp in the presence or absence of NSC-13345.

                        Allosteric site mutant    Allosteric site mutant    Allosteric site mutant
                                                  + NSC-13345 (100 nM)      + NSC-13345 (50 µM)
Z-FR-MCA
kcat (s⁻¹)              4.0 ± 0.5                 4.1 ± 0.3                 4.3 ± 0.3
Km (µM)                 7.2 ± 1.3                 8.0 ± 1.2                 8.0 ± 1.6
kcat/Km (10⁴ s⁻¹ M⁻¹)   68 ± 18                   51 ± 11                   54 ± 20
Abz-HPGGPQ-EDDnp
kcat (s⁻¹)              0.40 ± 0.05               0.41 ± 0.04               0.27 ± 0.04
Km (µM)                 7.5 ± 2.4                 9.5 ± 2.2                 10.8 ± 2.0
kcat/Km (10⁴ s⁻¹ M⁻¹)   5.4 ± 1.8                 4.3 ± 1.0                 3.3 ± 1.1

A Non-Allosteric Inhibitory Mode of NSC-13345

Based on the high affinity location of NSC-13345 in the S3’ area, an alternative model to an allosteric mechanism was proposed to explain the partial inhibition of Abz-HPGGPQ-EDDnp and the lack of inhibition of Z-FR-MCA hydrolysis. The binding of NSC-13345 in the S2’-S3’ region would impair the hydrolysis of any substrate that occupies the same area. Peptide docking with the molecular docking program Glide (Schrödinger, Inc.) showed that the peptide HPGGPQ fits into the S3 to S3’ positions with a very favourable binding energy of -96.07 kcal/mol as calculated using MM-GBSA (Figure 5-7A). When this was repeated in the presence of NSC-13345 bound in the S3’ position, an alternate substrate conformation was adopted in order to fit it into the active site (Figure 5-7B). The P3’ residue is forced into the area just above His162. This weakened the calculated binding energy to -88.35 kcal/mol and pushed the predicted scissile bond between P1 and P1’ approximately 2 Å away from the catalytic residue Cys25.
In contrast, when the shorter substrate, Z-FR-MCA, was docked to the active site of the enzyme, it did not enter the S3’ site occupied by NSC-13345 and was thus not impeded in the presence of the inhibitor (Figure 5-7C).

Figure 5-7 Comparison of the best poses of peptide substrates, HPGGPQ and Z-FR-MCA, docked into the active site. The best pose obtained through peptide docking of the peptide substrate, HPGGPQ, to the active site using Glide in the absence (A) and in the presence of NSC-13345 in the S3’ position (B). The best pose obtained through Glide docking of Z-FR-MCA to the active site (C).

To further understand this mechanism of inhibition, we examined the cleavage of several quenched hexapeptides of the general structure, Abz-KLR-XXX-EDDnp, containing different amino acid residues in the P1’ to P3’ positions. The cleavage site was limited to the LR-X bond. Comparing the kcat/Km values for the hydrolysis of the substrates in the presence and absence of NSC-13345, we found that changes in the P3’ position had the most dramatic effect on the inhibitor activity (Figure 5-8A). Altering the P1’ position in the sequences SSK and FSK showed almost no effect on the cleavage efficiency of the enzyme. Changing the P2’ position from FSK to FFK showed a slightly larger effect on the kcat/Km value; however, this difference was not determined to be significant (Figure 5-8A). In contrast, altering the P3’ position from FSK to FSS, FSA or FSF showed a significant decrease in the catalytic efficiency of the enzyme, suggesting that the type of amino acid in the P3’ position played an important role in the binding and cleavage of the substrates (Figure 5-8A).

Molecular modeling of these substrates to the active site of CatK supported the partial inhibitory effect of NSC-13345 (Figure 5-8B-I). Substrates with serine, alanine, and phenylalanine in the P3’ position overlap with the binding site of the inhibitor and therefore show reduced kcat/Km values.
On the other hand, molecular docking of the peptide KLRFSK revealed that the Lys residue can interact favourably with the region adjacent to the active site and interacts with residues Gln21 and Ser24 when the inhibitor is occupying the S3’ position (Figure 5-8F). Therefore, NSC-13345 does not interfere with the hydrolysis of this peptide.

Effect of NSC-13345 and T-06 on the Hydrolysis of a Multiplex Peptide Library

An in-depth substrate profiling analysis using multiplex substrate profiling by mass spectrometry (MSP-MS) was also performed to characterize the effects of the inhibitors on substrate cleavage. The cleavage of a library of 228 tetradecapeptides was evaluated in the presence and absence of NSC-13345 and T-06. Peptide cleavage was evaluated using LC-MS/MS after 60 min incubation with a final concentration of 2 nM WT CatK. Initial substrate cleavage profiling identified characteristic substrate specificities of CatK with hydrophobic residues (L, M, and P) preferred in S2 and basic residues (K and R) preferentially cleaved in the S1 and S3 positions (Figure S5-3).

Substrate cleavage was then investigated in the presence of the inhibitors (50 µM NSC-13345 or 10 µM T-06). Thirty-two efficiently cleaved substrates, defined as those with greater than 50% of the substrate cleaved after 60 minutes in the presence of CatK (q<0.05), were identified. Cleavage of 14 of these substrates was inhibited by NSC-13345 with a minimum of 25% inhibition (Figure S5-4). Cleavage of none of the 32 substrates was inhibited by T-06 (min. 25% inhibition). A highly potent inhibitor of CatK, E-64, was used as a control and was found to inhibit the cleavage of all 32 substrates with a minimum of 90% inhibition at 10 µM concentration. We then selected five peptides for detailed cleavage progress analysis. The cleavage progress curves show that NSC-13345 was able to significantly inhibit each respective cleavage (Figure 5-9).
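The two-step substrate selection described above (efficient cleavage first, then a minimum inhibition threshold) can be sketched in a few lines of Python. This is only an illustration: the peptide names and cleavage fractions are invented, the q-value filter is omitted, and computing percent inhibition as the relative loss of cleaved substrate at 60 minutes is an assumption rather than the documented MSP-MS analysis.

```python
def percent_inhibition(cleaved_without, cleaved_with):
    """Relative loss of cleavage (in %) caused by the inhibitor.
    Both arguments are fractions of substrate cleaved at the end point."""
    if cleaved_without <= 0:
        raise ValueError("substrate must be cleaved in the absence of inhibitor")
    return 100.0 * (1.0 - cleaved_with / cleaved_without)

def select_inhibited(substrates, cleave_cutoff=0.5, inhibition_cutoff=25.0):
    """Return (efficiently cleaved substrates, inhibited substrates).

    `substrates` maps a peptide name to a pair
    (fraction cleaved without inhibitor, fraction cleaved with inhibitor).
    A substrate is efficient if more than `cleave_cutoff` is cleaved without
    inhibitor, and inhibited if inhibition reaches `inhibition_cutoff` percent.
    """
    efficient = {
        name: pair for name, pair in substrates.items() if pair[0] > cleave_cutoff
    }
    inhibited = sorted(
        name
        for name, (without, with_inh) in efficient.items()
        if percent_inhibition(without, with_inh) >= inhibition_cutoff
    )
    return sorted(efficient), inhibited

# Hypothetical end-point data for three peptides.
toy_library = {
    "peptide_A": (0.90, 0.40),  # efficiently cleaved, strongly inhibited
    "peptide_B": (0.80, 0.70),  # efficiently cleaved, weakly inhibited
    "peptide_C": (0.30, 0.05),  # not efficiently cleaved, excluded
}
efficient_set, inhibited_set = select_inhibited(toy_library)
```

With the cutoffs used in this chapter (more than 50% cleaved, at least 25% inhibition), only the first hypothetical peptide would be counted as inhibited.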
Figure 5-8 Specificity constants for the hydrolysis of Abz-KLR-XXX-EDDnp peptides by CatK in the presence of NSC-13345 and molecular docking of the substrates in the absence and presence of the inhibitor. (A) The catalytic efficiency (kcat/Km) values significantly decreased for substrates with sequence FSS (p<0.002), FFK (p<0.05), FSF (p<0.025), and FSA (p<0.006) in the presence of 250 µM inhibitor. The greatest decrease was seen for sequences FSS, FSA, and FSF. (B-E) Molecular docking of four peptide substrates without NSC-13345 occupying binding site 1. All substrates spanned the characteristic subsites (S2-S3’) found in CatK. (F-I) Substrates with P3’ residues containing either phenylalanine, alanine or serine were sterically hindered by NSC-13345 whereas LRFSK with lysine in P3’ can interact with the region adjacent to the active site. Docking was performed with the pentapeptide sequences LR-FSX.

Figure 5-9 Inhibition of cleavage of selected substrates by NSC-13345. (A-E) Substrate cleavage progress curves determined through MSP-MS show significant inhibition by NSC-13345 on the cleavage of multiple substrates. The slash above each graph denotes the cleavage event in each respective peptide and the detected peptide is shown in bold.

Effects of NSC-13345 and T-06 on the Hydrolysis of Macromolecular Substrates

CatK is well characterized as a multifunctional protease capable of cleaving many protein substrates, its most relevant substrate being triple helical collagen. Here, we investigated the effect of NSC-13345 and T-06 on the CatK-mediated hydrolysis of two macromolecular substrates, azocasein and type I collagen. Inhibition of azocasein degradation by NSC-13345 followed a hyperbolic curve, reaching a maximum of 40% inhibition at about 200 µM inhibitor with no further change up to 500 µM inhibitor.
This corroborates previous research indicating that the inhibitor behaved in a hyperbolic fashion for the inhibition of azocasein degradation (252). There was also no significant difference in the inhibition for the wild-type and the putative allosteric site mutant enzyme, indicating that NSC-13345 was able to prevent the degradation of azocasein by the mutant enzyme, despite the removal of the previously characterized allosteric site (Figure 5-10A).

On the other hand, T-06 again had no inhibitory activity on the hydrolysis at inhibitor concentrations of up to 100 µM, which reflects a 500-fold molar excess over the protease concentration. These findings overlap well with the results discussed for peptide substrates spanning into the S3 subsite. Azocasein, a non-specific protease substrate, is not expected to require specific exosite interactions outside of the classical Schechter-Berger binding sites, which also explains the lack of inhibitor activity of T-06.

In contrast, type I collagen degradation was very effectively inhibited by T-06 with an IC50 value of 2.1 ± 0.2 µM (1:6 molar CatK to inhibitor ratio) as previously reported (114). NSC-13345 demonstrated an IC50 value of 393 ± 30 µM for the wild-type protease and 291 ± 37 µM for the mutant CatK variant (Figure 5-10B-D). However, in contrast to the partial inhibitory effect on azocasein, type I collagen degradation was completely inhibited by NSC-13345 at concentrations higher than 1 mM for wild-type and mutant CatK (1:2,500 molar CatK to inhibitor ratio). This indicated that the inhibitory activity of NSC-13345 on collagen degradation must be caused by an effect other than, or in addition to, its S3’ site binding. While the partial inhibition of azocasein hydrolysis can be explained by the binding of NSC-13345 in the S3’ binding area, the complete inhibition of the collagen degradation is likely caused by interference with the glycosaminoglycan binding site on CatK at inhibitor binding site 3.
We have previously demonstrated that glycosaminoglycans are required for the formation of collagenolytically active CatK oligomers (73, 232). Using the fluorescence polarization assay, we could demonstrate that both NSC-13345 and T-06 prevent the oligomerization of CatK complexes in the presence of glycosaminoglycans (Figure 5-10E) and the respective IC50 values were determined to be 10 ± 0.8 µM (T-06) and 1.6 ± 0.2 mM (NSC-13345). NSC-13345 may thus affect collagen degradation by a partial inhibition due to its binding in the S3’ substrate binding site and by disturbing the glycosaminoglycan-CatK interaction site, leading to a disruption or perturbation of the collagenolytically active CatK complex.

Figure 5-10 Inhibition of the degradation of macromolecular substrates, azocasein and collagen and C4-S complex formation by NSC-13345. (A) Inhibition of azocasein degradation followed a hyperbolic curve for both the wild-type and mutant CatK enzymes, reaching a maximum of 40% inhibition at 200 µM inhibitor. (B) Quantification of collagenase inhibition by NSC-13345 and the IC50 determination by the intensity of the α-bands in three separate experiments using 400 nM CatK. The respective IC50 values were determined to be 393 ± 30 µM (wild-type) and 291 ± 37 µM (putative allosteric site mutant), which represent a nearly 1000-fold molar excess over the enzyme. (C and D) Representative SDS-PAGE gels for collagen degradation inhibition by NSC-13345 in the presence of C4-S for wild-type (C) and allosteric site mutant CatK (D). C1 and C2 represent collagen and CatK alone, respectively. Complete degradation of collagen was only observed in the presence of C4-S (C3) with no solvent effect observed (C4). (Lanes 1-8 represent NSC-13345 concentrations: 2 mM, 1.5 mM, 1 mM, 500 µM, 250 µM, 125 µM, 62.5 µM, and 31.25 µM) (E) Both NSC-13345 and T-06 inhibited C4-S complex formation. The respective IC50 values were determined to be 10 ± 0.8 µM (T-06) and 1.6 ± 0.2 mM (NSC-13345).

5.4. Discussion

CatK was first identified as an enzyme in osteoclasts responsible for the degradation of collagen in the bone matrix and is an antiresorptive drug target (342). Studies have shown that overexpression of the enzyme can lead to an increase in bone degradation, a key phenotype of osteoporosis (343). Highly potent and specific active site-directed inhibitors have been synthesized and evaluated in several clinical osteoporosis trials (83, 112). These inhibitors proved effective in increasing bone mineral density and reducing fracture rates but were not approved by regulatory authorities. The main reasons were skin and cardiovascular side effects, including an increased risk of stroke in CatK inhibitor-treated patients (179, 271). In addition to off-target side effects, the main shortcoming of these inhibitors may have been the blocking of the regulatory activities of CatK (114). Thus, a substrate-selective inhibition of CatK and other multifunctional drug targets would be beneficial in reducing the potential side effects that afflict many drug development programs. We believe that this can be achieved with ectosteric inhibitors which block the hydrolysis of individual, disease-relevant substrates (such as collagen or elastin) without preventing the normal turnover of other substrates by the given enzyme.

In this study, we examined the inhibitory properties and binding sites of two inhibitors of CatK, NSC-13345 and T-06. Structural and kinetic studies revealed distinct inhibitory effects on the substrate specificity of the protease. NSC-13345 was able to partially inhibit the cleavage of longer peptides and a non-specific protein substrate such as azocasein as well as to completely inhibit the degradation of type I collagen. In contrast, T-06 was exclusively inhibitory against the degradation of collagen. This suggests that a substrate-specific inhibition of proteases is possible by targeting distinct ectosteric sites.
NSC-13345 was previously described as an allosteric modifier of CatK (252) based on a crystal structure using an incompletely processed enzyme. In that structure, the partially remaining propeptide region was in close vicinity to the proposed allosteric site and has likely altered the binding site (Figure 5-4A). Our structure of the inhibitor/CatK complex with the fully processed mature CatK protein lacked the binding of NSC-13345 at this site. Moreover, the kinetic analysis of a CatK variant where the putative allosteric site was mutated did not reveal any difference to the wild-type protease in the inhibitory activity of the compound. However, the compound was found at three other binding sites on the molecule. This also marks the first time that a small molecule inhibitor was structurally determined to bind at multiple sites on CatK.

As seen in the structural analysis, each binding site had multiple hydrophobic and hydrogen-bond interactions, supporting that the inhibitor can favourably bind at each of the sites (Figure 5-3 and Figure 5-5). Two of the binding sites found in the crystal structure may provide insight into a non-allosteric mechanism for the inhibitor activity of NSC-13345. Binding site 1 lies in the S3’ subsite region of the enzyme above its catalytic site and binding site 3 is in close proximity to a C4-S binding site required for collagenase activity. Overlay of the C4-S-bound CatK structure with our NSC-13345 complexed structure reveals that the inhibitor displaces three basic residues, Lys119, Arg123, and Arg127, that directly interact with C4-S and may therefore prevent these interactions (Figure 5-4B). Furthermore, binding site 2 also lies in close proximity to the previously determined allosteric site. Overlay of the two structures shows that the propeptide region from the non-fully processed enzyme occupies the same region as binding site 3 (Figure 5-4A). This may in part explain the difference in the inhibitor binding observed between the two structures.
In addition, the results from global docking suggest that binding sites 1 and 3 had better calculated affinities than the previous allosteric site (Table S5-1). We hypothesize that the inhibitory activity of NSC-13345 can be attributed to its binding near the S3’ subsite (binding site 1 in Figure 5-3). This is supported by the molecular modeling of the two substrates which were affected by the inhibitor. Peptide docking using Glide with the peptide HPGGPQ predicted efficient binding to the enzyme (binding energy: -96.07 kcal/mol) with the peptide P3 to P3’ positions occupying the well-established S3 to S3’ positions on the enzyme (Figure 5-7A). However, in the presence of NSC-13345, steric hindrance forced the peptide to bend in the P2’ and P3’ positions to occupy the space just above the catalytic residue His162 (Figure 5-7B). This slightly weakened the predicted binding energy (-88.35 kcal/mol) and moved the scissile bond 2 Å away from the catalytic triad of the enzyme, which would explain the approximately 50% drop in the kcat/Km value for this substrate. The binding of the inhibitor at the S3’ position also supports a non-competitive mechanism of inhibition as the binding site is now altered by the presence of NSC-13345. The predicted affinity of the substrate for the enzyme is relatively unchanged as calculated by the binding energies, but the substrate is now forced into a kinetically unfavourable position, inhibiting its cleavage by the enzyme. This is reflected by the kinetic parameters determined in the presence of 50 µM NSC-13345; a significant decrease was observed only in the kcat.

This mechanism was further corroborated by the characterization of the cleavage of peptide substrates of the general sequence, Abz-KLR-P1’P2’P3’-EDDnp. Our data indicated here as well that changes in the P3’ position had the greatest effect on the catalytic efficiency, kcat/Km, at 250 µM NSC-13345 concentration.
Peptides containing a Ser, Ala, or Phe in the P3’ position showed a decrease of at least 40% in kcat/Km at 250 µM inhibitor concentration. Lys in the P3’ position had a less pronounced effect on the turnover of the hexapeptides. This was further investigated using molecular peptide docking of these substrates in the active site of CatK. When we compared the P3’ side chain positions in the enzyme, the orientations differed depending on the amino acid in that position. Smaller side chains such as Ser or Ala were restricted to the cleft just above the active site region in the vicinity of binding site 1 of NSC-13345. If binding site 1 is occupied by NSC-13345, the binding of these substrates would be restricted, which would explain the measured inhibition kinetics. The partial inhibition of the azocasein hydrolysis in the presence of NSC-13345 can be explained in a similar manner (Figure 5-10A). In contrast, for the shorter substrate Z-FR-MCA, which does not occupy the S2’-S3’ area, the kinetic parameters for hydrolysis in the presence and absence of the inhibitor were highly similar. As T-06 did not bind in the Schechter-Berger subsite areas, no inhibitory activity against any of the tested peptides or the non-specific azocasein substrate was observed.

Finally, analysis using multiplex substrate profiling of 228 peptide substrates corroborated the notion of the S3’ effect of NSC-13345. Out of the 32 substrates that were hydrolyzed by at least 50% within one hour by CatK in the absence of NSC-13345, fourteen peptides showed greater than 25% inhibition by the inhibitor, of which seven were inhibited by over 50%. All the kinetic analysis data indicate that the hydrolysis of any of these substrates is only partially inhibited by the S3’-bound NSC-13345. Conversely, none of these cleavages were affected by T-06 since T-06 did not bind in the active site region.
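The non-competitive behaviour discussed above (unchanged Km, reduced kcat) and the Dixon-plot estimate of Ki reported in Figure S5-2 can be illustrated with a minimal sketch. It assumes the textbook pure non-competitive rate law, v = kcat[E][S] / ((Km + [S])(1 + [I]/Ki)). The function names and all numerical values below are hypothetical and chosen only for illustration; this is not the fitting procedure used in this thesis.

```python
def noncompetitive_rate(s, i, kcat, km, ki, e_total):
    """Initial rate under pure non-competitive inhibition (textbook model)."""
    return kcat * e_total * s / ((km + s) * (1.0 + i / ki))


def apparent_parameters(i, kcat, km, ki):
    """Apparent kcat, Km and kcat/Km at inhibitor concentration i.

    In this model only kcat is scaled by 1 / (1 + [I]/Ki); Km is unchanged.
    """
    kcat_app = kcat / (1.0 + i / ki)
    return kcat_app, km, kcat_app / km


def dixon_ki(inhibitor_concs, rates):
    """Estimate Ki from a Dixon plot (1/v against [I]) by least squares.

    For the model above the plot is a straight line whose x-intercept
    is -Ki, so Ki = intercept / slope.
    """
    xs = list(inhibitor_concs)
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept / slope


# Hypothetical parameters (arbitrary but consistent concentration units).
KCAT, KM, KI, E_TOTAL, S = 1.0, 10.0, 500.0, 0.002, 20.0
concs = [0.0, 50.0, 100.0, 250.0, 500.0]
rates = [noncompetitive_rate(S, i, KCAT, KM, KI, E_TOTAL) for i in concs]
ki_estimate = dixon_ki(concs, rates)
```

Because the simulated rates are noise-free, the Dixon fit recovers the input Ki; experimental data would scatter around the line, and a partial (incomplete) inhibition such as that described above would require a modified rate law.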
But how can we explain the complete inhibition of type I collagen degradation by both inhibitors? A sole S3’ effect of NSC-13345 should only lead to a partial inhibition of collagen degradation by CatK. The solution to this might lie in the two additional binding sites for this inhibitor. Whereas binding site 2 does not provide a rational explanation for the complete inhibition, binding site 3, which is located at the C4-S binding site, can explain our observation. We have previously shown that glycosaminoglycans such as C4-S are required for the efficient cleavage of collagen (73, 232). Residues Lys119, Arg123, and Arg127 that interact with C4-S are displaced in the presence of NSC-13345 and may thus prevent the glycosaminoglycan-mediated formation of collagenolytically active CatK complexes (73, 232, 320).

The CatK/T-06 crystal structure revealed four tanshinone binding sites. None of these were found to interfere with the Schechter-Berger subsites, which explains the lack of inhibitory activity towards non-collagenous substrates. However, the identified binding sites also do not explain the anti-collagenase activity of T-06. We have previously postulated that an ectosteric site formed by a loop (residues Asp85-Ser95) in the L-domain is responsible for the protein-protein interaction in CatK complexes and is likely the binding site for tanshinone inhibitors (113, 114, 232, 284, 320). This site also scored highest in our molecular docking calculations among the eight potential sites. However, this site was occupied by a symmetry-related molecule in the CatK crystal, and a T-06 molecule is unlikely to fit into the resulting cleft (Figure S5-4). In addition, the fluorescence polarization results suggested that the compound is able to disrupt the formation of the CatK oligomers. Therefore, it is reasonable to assume that this site is a binding site for the inhibitor, which can then explain its anti-collagenase property.
Taking the kinetic and structural data of NSC-13345 and T-06 together, we propose that the compounds act as ectosteric inhibitors of CatK with multiple demonstrated binding locations on the enzyme. The bound compounds prevent the optimal binding of peptide and protein substrates either by sterically blocking the S3’ binding area or by interfering with glycosaminoglycan or protein-protein interaction sites necessary for the formation of collagenolytically active CatK oligomers. In addition, these molecules can also bind at sites that are irrelevant to catalysis and inhibition. This needs to be considered when ectosteric or allosteric inhibitors are designed by molecular docking experiments, as even predicted binding sites with high affinity may not be therapeutically relevant.

Supplementary Information (Chapter 5)

Table S5-1: Calculated binding affinities for binding sites through global docking

Binding Site      1     2     3     4     5     6     7     8
NSC-13345 (mM)    0.43  5.7   0.49  0.96  0.40  1.50  0.37  3.7
T-06 (µM)         43    19.4  16.0  5.4   27.5  3.8   1.1   7.6

Table S5-2: Refined occupancies and average B-factors for the respective binding sites (NSC-13345 and T-06)

Binding Site for NSC-13345   Refined Occupancy   Average B-factor
1                            1.00                14.08
2                            1.00                15.63
3                            1.00                15.26

Binding Site for T-06        Refined Occupancy   Average B-factor
1                            0.90                31.68
2                            0.90                35.96
3                            0.85                28.61
4                            0.78                40.88

Figure S5-1 Ligplot diagrams for binding sites 2 and 3 at symmetry related positions
The Ligplot diagrams for binding sites 2 and 3 at their symmetry related positions show minimal interactions with the enzyme, and these positions are not likely to be effective binding sites.

Figure S5-2 Inhibition by NSC-13345 of the cleavage of Z-FR-MCA and Abz-HPGGPQ-EDDnp peptidic substrates.
(A and B) No significant inhibition by NSC-13345 was observed on the hydrolysis (kcat, Km, kcat/Km) of Z-FR-MCA at low inhibitor to enzyme ratios (100 nM or 50 times excess) or high inhibitor to enzyme ratios (50 µM or 25,000 times excess) for both the wild-type and putative allosteric site mutant enzymes. (C and D) For the cleavage of Abz-HPGGPQ-EDDnp, at low inhibitor to enzyme ratios (100 nM NSC-13345 or 50 times excess), no significant effect was observed on the kinetic parameters Km, kcat and kcat/Km for both wild-type and putative allosteric site mutant enzymes compared to the uninhibited enzyme. In the presence of 50 µM NSC-13345, no significant difference was observed in the Km as compared with the uninhibited wild-type enzyme, but a significant difference was seen in the kcat and kcat/Km. (E and F) The Ki for the cleavage of Abz-HPGGPQ-EDDnp was determined using a non-competitive inhibition model with a Dixon plot to be 820 ± 90 µM and 380 ± 30 µM for the wild-type CatK and the putative allosteric site mutant of CatK, respectively.

Figure S5-3 Substrate cleavage specificity of CatK by MSP-MS
Substrate cleavage analysis of CatK by MSP-MS showed the characteristic substrate specificities for CatK. Cleavage occurs between positions 4 (S1) and 5 (S1’), and amino acids in pink were not found in that position.

Figure S5-4 Inhibition of CatK substrate cleavage by NSC-13345 and T-06 by MSP-MS
Cleavage of 14 of 32 efficiently cleaved substrates (>50% cleavage after 60 min) was inhibited by NSC-13345 at 50 µM but not by T-06. Cleavage locations are listed within the respective parentheses. The peptide substrate analysis was performed by MSP-MS.

Figure S5-4 Symmetry related elements of T-06-CatK complex
The symmetry related element in the T-06-CatK complex shows that the protein oligomerization ectosteric site (red) is blocked.
Molecular docking (blue) of T-06 to this site shows significant steric hindrance with the neighboring molecule, and the compound is unlikely to be accommodated. The yellow T-06 molecules are from the CatK-T-06 complex structure.

6. Conclusions and Suggestions for Future Work

CatK is a lysosomal cysteine protease highly expressed in osteoclasts and has been implicated in numerous physiological processes. One of its most distinctive features is its ability to degrade the collagen found in bone, which underlies its role in bone remodelling. Deficiency in CatK produces a unique disease phenotype, pycnodysostosis, that is characterized by distinctive musculoskeletal pathologies (89). Implication of CatK in osteoporosis highlights CatK as an important target for drug development. Numerous CatK inhibitors have been assessed in clinical trials but, despite high treatment efficacy, all have failed due to complicating side effects (179, 271). Therefore, studies into the enzymatic and inhibitory mechanisms of the enzyme are highly relevant in the development of novel therapeutics. This thesis provided insights into the different types of CatK inhibitors through structural, mutational and kinetic studies. A novel type of enzyme modulation, based on ligands binding at ectosteric sites, was introduced. Such ligands selectively modulate individual enzymatic activities and, in the case of CatK, can specifically modulate its collagenase activity without disrupting the active site activity.

In chapter 2, the selectivity of odanacatib for mouse CatK over human CatK was investigated using X-ray crystallography and mutagenesis approaches. Despite sharing over 85% sequence homology, mouse and human CatK differ significantly in the active site cleft region. This complicates the use of rodent models for the study of active site-directed inhibitors in preclinical trials.
Odanacatib is an active site-directed inhibitor of CatK, which was in development for the treatment of osteoporosis for over a decade. However, it was recently abandoned after Phase III clinical trials due to cardiovascular-related side effects (179). The inhibitor is highly selective for the human enzyme, against which it is over 500 times more potent than against the mouse enzyme (256). From the determined X-ray crystal structures, two features were identified in the mouse enzyme which define odanacatib’s selectivity. A mouse CatK variant in which these residues were replaced by their human counterparts displayed a human-like phenotype for the inhibitor but otherwise maintained mouse CatK kinetic properties.

In chapter 3, a composite docking method using multiple docking algorithms was developed for identifying non-active site inhibitors of CatK by library screening of the NCI Database containing 280,000 compounds (233). Molecular docking approaches have been demonstrated to aid in identifying novel chemical scaffolds for therapeutic targets and have now become an integral technique in modern drug design. Targeting of the protein oligomerization site required for the collagenase activity of CatK allowed for inhibition without disrupting its active site activity. By combining the results from three distinct docking methods, the hit rate and potency of the best compounds identified increased by three- to five-fold when compared with the individual docking methods. In total, nine potent compounds were identified with IC50 values below 50 µM. Two compounds were subsequently tested in cell-based bone resorption assays, with the most potent compound showing an IC50 value of approximately 300 nM.

In chapter 4, a high-throughput assay targeting the formation of oligomeric complexes involving CatK and chondroitin sulfate (C4-S) was developed for screening novel collagenase inhibitors (254).
Inhibitors that disrupt complex formation are selective for the collagenase activity of CatK and do not interfere with other proteolytic activities of the enzyme. A fluorescence polarization (FP) assay was used to screen two chemical libraries containing a total of 5,017 compounds. Thirty-eight compounds were identified in the FP assay without displaying active site inhibition and were subsequently analysed for their ability to inhibit the degradation of collagen. Eight compounds were selective collagenase inhibitors with IC50 values below 200 µM, and three of them revealed IC50 values below 5 µM. Six of the most potent compounds were analyzed for their inhibitory potential in cell-based bone resorption assays, where two compounds, EGCG and ATC, revealed anti-resorptive activity. Molecular modeling suggests that these compounds bind effectively at ectosteric sites previously identified as required for collagenase activity. Since none of these compounds displayed active site inhibition, they were predicted to only disrupt the formation of collagenolytic oligomers.

In chapter 5, the binding mode of NSC-13345, a previously proposed allosteric inhibitor of CatK, was characterized in detail by molecular modeling, crystallography and site-directed mutagenesis studies. A non-allosteric binding mode of the compound was demonstrated, which is consistent with the fact that no inhibitor-induced structural changes were observed in the active site region of the enzyme. The binding of the compound in the S3’ subsite and at the C4-S binding site of CatK supported the idea of an ectosteric mechanism to explain the modulation of the turnover of selected substrates. T-06, a potent collagenase inhibitor of CatK, was also characterized to bind at several distinct locations on the enzyme, which explained the modulating effect of the inhibitors on CatK activity.
In conclusion, the work presented in this dissertation provides structural and kinetic insights into the inhibitory and enzymatic mechanisms of CatK. It also introduces the novel concept of ectosteric sites and ectosteric inhibitors as a way of specifically regulating enzyme activity. Owing to the development of active site-directed CatK inhibitors and the enzyme’s unique collagenolytic properties regulated by the presence of ectosteric sites, CatK represents an excellent model protease to study different inhibitory mechanisms. Significant advancements have been made in the understanding of the specificity found for the promising active site-directed inhibitor through the determination of the structures of the hCatK-ODN complex and the wild-type mouse enzyme. This information has allowed for the development of a transgenic mouse model that will be used for the study of side effects associated with active site-directed inhibitors of CatK. In addition, this thesis covers the development of inhibitors targeting ectosteric sites located on the enzyme as a novel approach to target the therapeutically relevant collagenase activity of CatK. Using both computational and high-throughput methods, several of these inhibitors were identified and characterized. Moreover, further insight into an inhibitor previously thought to be an allosteric regulator of CatK suggests that it behaves as a substrate-selective ectosteric inhibitor of CatK with unique binding sites that regulate the activity of the enzyme.

Suggestions for Future Work

This work focused on elucidating enzymatic and inhibitory mechanisms found in CatK. Additional application-based research building on the insights gained in this thesis would further the development of novel therapeutic inhibitors of CatK and other cathepsins.
The thesis formed the basis of further work:

Chapter 2: A transgenic mouse model based on the Tyr61 and the S2 subsite structural elements discovered through this research can be used to study the side effects of active site-directed inhibitors such as odanacatib. Rodent models are widely used to study osteoporosis, but due to the difference in drug potencies between mouse and human CatK, they have been unsuitable for evaluating such inhibitors. The availability of a humanized murine model will help to elucidate the causes of the side effects of inhibitors such as odanacatib, which led to the late-stage termination of clinical trials, and to develop new treatment strategies, such as drug delivery systems, which can alleviate those effects and guide future therapeutics for the treatment of osteoporosis.

Chapter 3: Using solely computational methods, several potent ectosteric inhibitors were identified and characterized for their inhibition of the collagenase activity of CatK. Their potency in cell-based assays warrants further development and study through X-ray crystallography or other binding studies to determine properties such as Ki. In addition, the molecular docking identified several families of compounds which were active in inhibiting CatK. With the determination of the exact binding modes of some of these compounds, future structure-activity relationship (SAR) studies may lead to the development of more specific and potent compounds. Our composite docking method performed markedly better than the individual methods and can be applied to other drug targets. For example, targeting the ectosteric sites identified in other enzymes may lead to the identification of novel therapeutics that selectively modulate their disease-relevant activities without disrupting their physiological activities.
This method demonstrated noticeably higher hit rates and inhibitor potencies compared to the individual docking methods and could facilitate the discovery of novel therapeutics for other hard-to-drug targets.

Chapter 4: Several potent ectosteric collagenase inhibitors for CatK were also identified using a high-throughput fluorescence polarization screening assay. Further studies such as X-ray crystallography can be used to determine the binding modes of the compounds and the specific interactions that the compounds make with the enzyme. This information can then be used to develop more specific and potent inhibitors. Alternatively, the screening method can be used to evaluate other compound libraries to identify a greater number of potential inhibitors, which will subsequently provide new scaffolds for future SAR studies.

Chapter 5: Crystallization of NSC-13345 and T-06 with the fully active and mature CatK enzyme provided greater insight into their inhibitory mechanisms. The respective crystal structures also showed, for the first time, multiple binding sites for a CatK inhibitor with distinctive inhibitory effects. This study opened the possibility of designing inhibitors which selectively block the hydrolysis of several individual substrates. This is of therapeutic interest but is also suitable for basic research to study individual activities of a target enzyme in cellular and animal models.

References

1. López-Otín, C., and Bond, J. S. (2008) Proteases: Multifunctional enzymes in life and disease. J. Biol. Chem. 283, 30433–30437
2. Rawlings, N. D., Barrett, A. J., and Finn, R. (2016) Twenty years of the MEROPS database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 44, D343–D350
3. Keller, P. J., Cohen, E., and Neurath, H. (1958) The Proteins of Bovine Pancreatic Juice. J. Biol. Chem. 233, 344–349
4. Rawlings, N. D. (2013) Protease families, evolution and mechanism of action. in Proteases: Structure and Function, pp.
1–36, Springer Vienna, Vienna, 10.1007/978-3-7091-0885-7_1
5. Rawlings, N. D., and Barrett, A. J. (1999) MEROPS: The peptidase database. Nucleic Acids Res. 27, 325–331
6. Powers, J. C., Asgian, J. L., Ekici, Ö. D., and James, K. E. (2002) Irreversible inhibitors of serine, cysteine, and threonine proteases. Chem. Rev. 102, 4639–4750
7. Osmulski, P. A., Hochstrasser, M., and Gaczynska, M. (2009) A Tetrahedral Transition State at the Active Sites of the 20S Proteasome Is Coupled to Opening of the α-Ring Channel. Structure. 17, 1137–1147
8. Tulinsky, A., and Blevins, R. A. (1987) Structure of a tetrahedral transition state complex of alpha-chymotrypsin dimer at 1.8-A resolution. J. Biol. Chem. 262, 7737–7743
9. Jimenez-Morales, D., Liang, J., and Eisenberg, B. (2012) Ionizable side chains at catalytic active sites of enzymes. Eur. Biophys. J. 41, 449–460
10. Eriksson, M., Uhlin, U., Ramaswamy, S., Ekberg, M., et al. (1997) Binding of allosteric effectors to ribonucleotide reductase protein R1: Reduction of active-site cysteines promotes substrate binding. Structure. 5, 1077–1092
11. Song, W., Nadeau, P., Yuan, M., Yang, X., et al. (1999) Proteolytic release and nuclear translocation of Notch-1 are induced by presenilin-1 and impaired by pathogenic presenilin-1 mutations. Proc. Natl. Acad. Sci. U. S. A. 96, 6959–6963
12. Gutteridge, A., and Thornton, J. (2004) Conformational change in substrate binding, catalysis and product release: An open and shut case? FEBS Lett. 567, 67–73
13. Harper, E., and Berger, A. (1972) On the size of the active site in proteases: Pronase. Biochem. Biophys. Res. Commun. 46, 1956–1960
14. Evnin, L. B., Vasquez, J. R., and Craik, C. S. (1990) Substrate specificity of trypsin investigated by using a genetic selection. Proc. Natl. Acad. Sci. 87, 6659–6663
15. Vanaman, T. C., and Bradshaw, R. A. (1999) Proteases in cellular regulation minireview series. J. Biol. Chem. 274, 20047
16. Antalis, T. M., Shea-Donohue, T., Vogel, S.
N., Sears, C., et al. (2007) Mechanisms of disease: Protease functions in intestinal mucosal pathobiology. Nat. Clin. Pract. Gastroenterol. Hepatol. 4, 393–402
17. Agbowuro, A. A., Huston, W. M., Gamble, A. B., and Tyndall, J. D. A. (2018) Proteases and protease inhibitors in infectious diseases. Med. Res. Rev. 38, 1295–1331
18. Sugumaran, M., Saul, S. J., and Ramesh, N. (1985) Endogenous protease inhibitors prevent undesired activation of prophenolase in insect hemolymph. Biochem. Biophys. Res. Commun. 132, 1124–1129
19. Grossman, M., Tworowski, D., Dym, O., Lee, M. H., et al. (2010) The intrinsic protein flexibility of endogenous protease inhibitor TIMP-1 controls its binding interface and affects its function. Biochemistry. 49, 6184–6192
20. Saul, S. J., and Sugumaran, M. (1986) Protease inhibitor controls prophenoloxidase activation in Manduca sexta. FEBS Lett. 208, 113–116
21. Bruckner-Tuderman, L., and Bruckner, P. (1998) Genetic diseases of the extracellular matrix: more than just connective tissue disorders. J. Mol. Med. (Berl). 76, 226–237
22. Frantz, C., Stewart, K. M., and Weaver, V. M. (2010) The extracellular matrix at a glance. J. Cell Sci. 123, 4195–4200
23. OpenStax CNX OpenStax, Biology. [online] https://cnx.org/contents/GFy_h8cu@11.1:8Uypx7vu@7/Connections-between-Cells-and- (Accessed May 31, 2018)
24. Alford, A. I., Kozloff, K. M., and Hankenson, K. D. (2015) Extracellular matrix networks in bone remodeling. Int. J. Biochem. Cell Biol. 65, 20–31
25. Kresse, H., and Schönherr, E. (2001) Proteoglycans of the extracellular matrix and growth control. J. Cell. Physiol. 189, 266–274
26. Somaiah, C., Kumar, A., Mawrie, D., Sharma, A., et al. (2015) Collagen promotes higher adhesion, survival and proliferation of mesenchymal stem cells. PLoS One. 10, e0145068
27. Kubow, K. E., Vukmirovic, R., Zhe, L., Klotzsch, E., et al. (2015) Mechanical forces regulate the interactions of fibronectin and collagen I in extracellular matrix. Nat. Commun.
6, 8026
28. Wise, S. G., and Weiss, A. S. (2009) Tropoelastin. Int. J. Biochem. Cell Biol. 41, 494–497
29. Mosher, D. F. (1993) Assembly of fibronectin into extracellular matrix. Curr. Opin. Struct. Biol. 3, 214–222
30. Kular, J. K., Basu, S., and Sharma, R. I. (2014) The extracellular matrix: Structure, composition, age-related differences, tools for analysis and applications for tissue engineering. J. Tissue Eng. 5, 204173141455711
31. Lu, P., Takai, K., Weaver, V. M., and Werb, Z. (2011) Extracellular Matrix degradation and remodeling in development and disease. Cold Spring Harb. Perspect. Biol. 10.1101/cshperspect.a005058
32. Saunders, W. B. (2005) MMP-1 activation by serine proteases and MMP-10 induces human capillary tubular network collapse and regression in 3D collagen matrices. J. Cell Sci. 118, 2325–2340
33. Nagase, H. (1997) Activation mechanisms of matrix metalloproteinases. J. Biol. Chem. 378, 151–160
34. Lutgens, S. P. M., Cleutjens, K. B. J. M., Daemen, M. J. A. P., and Heeneman, S. (2007) Cathepsin cysteine proteases in cardiovascular disease. FASEB J. 21, 3029–3041
35. Bonnans, C., Chou, J., and Werb, Z. (2014) Remodelling the extracellular matrix in development and disease. Nat. Rev. Mol. Cell Biol. 15, 786–801
36. Lu, P., Weaver, V. M., and Werb, Z. (2012) The extracellular matrix: A dynamic niche in cancer progression. J. Cell Biol. 196, 395–406
37. Johnson, J. L., Jackson, C. L., Angelini, G. D., and George, S. J. (1998) Activation of matrix-degrading metalloproteinases by mast cell proteases in atherosclerotic plaques. Arterioscler. Thromb. Vasc. Biol. 18, 1707–1715
38. Orbe, J., Fernandez, L., Rodríguez, J. A., Rábago, G., et al. (2003) Different expression of MMPs/TIMP-1 in human atherosclerotic lesions. Relation to plaque features and vascular bed. Atherosclerosis. 170, 269–276
39. Zavašnik-Bergant, T., and Turk, B. (2006) Cysteine cathepsins in the immune response. Tissue Antigens. 67, 349–355
40. Barrett, A.
J., and Rawlings, N. D. (1996) Families and clans of cysteine peptidases. Perspect. Drug Discov. Des. 6, 1–11
41. Otto, H.-H., and Schirmeister, T. (1997) Cysteine Proteases and Their Inhibitors. Chem. Rev. 97, 133–172
42. Brömme, D. (2000) Papain-like Cysteine Proteases. in Current Protocols in Protein Science, p. 21.2.1-21.2.14, John Wiley & Sons, Inc., Hoboken, NJ, USA, 21, 21.2.1-21.2.14
43. Smith, E. L. (1957) Active site and structure of crystalline papain. Fed. Proc. 16, 801–9
44. Turk, V., Stoka, V., Vasiljeva, O., Renko, M., et al. (2012) Cysteine cathepsins: From structure, function and regulation to new frontiers. Biochim. Biophys. Acta - Proteins Proteomics. 1824, 68–88
45. Linnevers, C., Smeekens, S. P., and Brömme, D. (1997) Human cathepsin W, a putative cysteine protease predominantly expressed in CD8+T-lymphocytes. FEBS Lett. 405, 253–259
46. Jokimaa, V., Oksjoki, S., Kujari, H., Vuorio, E., et al. (2001) Expression patterns of cathepsins B, H, K, L and S in the human endometrium. Mol. Hum. Reprod. 7, 73–78
47. Garnero, P., Borel, O., Byrjalsen, I., Ferreras, M., et al. (1999) The collagenolytic activity of cathepsin K is unique among mammalian proteinases. J. Biol. Chem. 273, 32347–32352
48. Benes, P., Vetvicka, V., and Fusek, M. (2008) Cathepsin D-Many functions of one aspartic protease. Crit. Rev. Oncol. Hematol. 68, 12–28
49. Burster, T., Macmillan, H., Hou, T., Boehm, B. O., et al. (2010) Cathepsin G: Roles in antigen presentation and beyond. Mol. Immunol. 47, 658–665
50. Brömme, D., Okamoto, K., Wang, B. B., and Biroc, S. (1996) Human cathepsin O2, a matrix protein-degrading cysteine protease expressed in osteoclasts: Functional expression of human cathepsin O2 in Spodoptera frugiperda and characterization of the enzyme. J. Biol. Chem. 271, 2126–2132
51. Drake, M. T., Clarke, B. L., Oursler, M. J., and Khosla, S. (2017) Cathepsin K Inhibitors for Osteoporosis: Biology, Potential Clinical Utility, and Lessons Learned. Endocr.
Rev. 38, 325–350
52. Guay, J., Falgueyret, J. P., Ducret, A., Percival, M. D., et al. (2000) Potency and selectivity of inhibition of cathepsin K, L and S by their respective propeptides. Eur. J. Biochem. 267, 6311–6318
53. Maubach, G., Schilling, K., Rommerskirch, W., Wenz, I., et al. (1997) The inhibition of cathepsin S by its propeptide. Specificity and mechanism of action. Eur. J. Biochem. 250, 745–750
54. Fox, T., Storer, A. C., de Miguel, E., and Mort, J. S. (1992) Potent Slow-Binding Inhibition of Cathepsin B by Its Propeptide. Biochemistry. 31, 12571–12576
55. Carmona, E., Dufour, É., Plouffe, C., Takebe, S., et al. (1996) Potency and selectivity of the cathepsin L propeptide as an inhibitor of cysteine proteases. Biochemistry. 35, 8149–8157
56. McQueney, M. S., Amegadzie, B. Y., D’Alessio, K., Hanning, C. R., et al. (1997) Autocatalytic activation of human cathepsin K. J. Biol. Chem. 272, 13955–13960
57. Kamphuis, I. G., Kalk, K. H., Swarte, M. B. A., and Drenth, J. (1984) Structure of papain refined at 1.65 Å resolution. J. Mol. Biol. 179, 233–256
58. Majer, P., Collins, J. R., Gulnik, S. V., and Erickson, J. W. (1997) Structure-based subsite specificity mapping of human cathepsin D using statine-based inhibitors. Protein Sci. 6, 1458–1466
59. Taralp, A., Kaplan, H., Sytwu, I. I., Vlattas, I., et al. (1995) Characterization of the S3 subsite specificity of cathepsin B. J. Biol. Chem. 270, 18036–43
60. Alves, M. F. M., Puzer, L., Cotrin, S. S., Juliano, M. A., et al. (2003) S3 to S3’ subsite specificity of recombinant human cathepsin K and development of selective internally quenched fluorescent substrates. Biochem. J. 373, 981–6
61. Lecaille, F., Chowdhury, S., Purisima, E., Brömme, D., et al. (2007) The S2 subsites of cathepsins K and L and their contribution to collagen degradation. Protein Sci. 16, 662–70
62. McGrath, M. E., Klaus, J. L., Barnes, M. G., and Brömme, D.
(1997) Crystal structure of human cathepsin K complexed with a potent inhibitor. Nat. Struct. Biol. 4, 105–109
63. Gelb, B. D., Shi, G. P., Heller, M., Weremowicz, S., et al. (1997) Structure and chromosomal assignment of the human cathepsin K gene. Genomics. 41, 258–262
64. Boyce, B. F., and Xing, L. (2008) Functions of RANKL/RANK/OPG in bone modeling and remodeling. Arch. Biochem. Biophys. 473, 139–146
65. Troen, B. R. (2006) The regulation of cathepsin K gene expression. in Annals of the New York Academy of Sciences, pp. 165–172, Wiley/Blackwell (10.1111), 1068, 165–172
66. Matsumoto, M., Kogawa, M., Wada, S., Takayanagi, H., et al. (2004) Essential role of p38 mitogen-activated protein kinase in cathepsin K gene expression during osteoclastogenesis through association of NFATc1 and PU.1. J. Biol. Chem. 279, 45969–45979
67. Pang, M., Martinez, A. F., Fernandez, I., Balkan, W., et al. (2007) AP-1 stimulates the cathepsin K promoter in RAW 264.7 cells. Gene. 403, 151–158
68. Keegan, P. M., Wilder, C. L., and Platt, M. O. (2012) Tumor necrosis factor alpha stimulates cathepsin K and V activity via juxtacrine monocyte-endothelial cell signaling and JNK activation. Mol. Cell. Biochem. 367, 65–72
69. Kim, R. Y., Yang, H. J., Song, Y. M., Kim, I. S., et al. (2015) Estrogen Modulates Bone Morphogenetic Protein-Induced Sclerostin Expression Through the Wnt Signaling Pathway. Tissue Eng. Part A. 21, 2076–2088
70. Brubaker, K. D., Vessella, R. L., True, L. D., Thomas, R., et al. (2003) Cathepsin K mRNA and protein expression in prostate cancer progression. J. Bone Miner. Res. 18, 222–230
71. Brömme, D., and Okamoto, K. (1995) Human Cathepsin O2, a Novel Cysteine Protease Highly Expressed in Osteoclastomas and Ovary Molecular Cloning, Sequencing and Tissue Distribution. Biol. Chem. Hoppe. Seyler. 376, 379–384
72. Zhao, B., Janson, C. A., Amegadzie, B. Y., D’Alessio, K., et al. (1997) Crystal structure of human osteoclast cathepsin K complex with E-64. Nat.
Struct. Biol. 4, 109–111
73. Li, Z., Kienetz, M., Cherney, M. M., James, M. N. G., et al. (2008) The Crystal and Molecular Structures of a Cathepsin K:Chondroitin Sulfate Complex. J. Mol. Biol. 383, 78–91
74. Choe, Y., Leonetti, F., Greenbaum, D. C., Lecaille, F., et al. (2006) Substrate profiling of cysteine proteases using a combinatorial peptide library identifies functionally unique specificities. J. Biol. Chem. 281, 12824–12832
75. Ricard-Blum, S., Ruggiero, F., and Van Der Rest, M. (2005) The collagen superfamily. Top. Curr. Chem. 247, 35–84
76. Engel, J., and Bächinger, H. P. (2005) Structure, stability and folding of the collagen triple helix. Top. Curr. Chem. 247, 7–33
77. Perumal, S., Antipova, O., and Orgel, J. P. R. O. (2008) Collagen fibril architecture, domain organization, and triple-helical conformation govern its proteolysis. Proc. Natl. Acad. Sci. 105, 2824–2829
78. Shoulders, M. D., and Raines, R. T. (2009) Collagen Structure and Stability. Annu. Rev. Biochem. 78, 929–958
79. Ramshaw, J. A. M., Shah, N. K., and Brodsky, B. (1998) Gly-X-Y tripeptide frequencies in collagen: A context for host-guest triple-helical peptides. J. Struct. Biol. 122, 86–91
80. Brodsky, B., Thiagarajan, G., Madhan, B., and Kar, K. (2008) Triple-helical peptides: An approach to collagen conformation, stability, and self-association. Biopolymers. 89, 345–353
81. Kafienah, W., Buttle, D. J., Burnett, D., and Hollander, A. P. (1998) Cleavage of native type I collagen by human neutrophil elastase. Biochem. J. 330 (Pt 2), 897–902
82. Nagase, H., and Visse, R. (2011) Triple Helicase Activity and the Structural Basis of Collagenolysis. in Extracellular Matrix Degradation, pp. 95–122, Springer Berlin Heidelberg, Berlin, Heidelberg, 10.1007/978-3-642-16861-1_5
83. Costa, A. G., Cusano, N. E., Silva, B. C., Cremers, S., et al. (2011) Cathepsin K: Its skeletal actions and role as a therapeutic target in osteoporosis. Nat. Rev. Rheumatol. 7, 447–456
84. Raisz, L. G.
(1999) Physiology and pathophysiology of bone remodeling. Clin. Chem. 45, 1353–1358 85.  Hou, W. S., Li, Z., Büttner, F. H., Bartnik, E., et al. (2003) Cleavage site specificity of cathepsin K toward cartilage proteoglycans and protease complex formation. Biol. Chem. 384, 891–897 86.  Vinardell, T., Dejica, V., Poole, A. R., Mort, J. S., et al. (2009) Evidence to suggest that cathepsin K degrades articular cartilage in naturally occurring equine osteoarthritis. Osteoarthr. Cartil. 17, 375–383 87.  van den Brûle, S., Misson, P., Bühling, F., Lison, D., et al. (2005) Overexpression of cathepsin K during silica-induced lung fibrosis and control by TGF-β. Respir. Res. 6, 84 88.  Bühling, F., Röcken, C., Brasch, F., Hartig, R., et al. (2004) Pivotal role of cathepsin K in lung fibrosis. Am. J. Pathol. 164, 2203–2216 89.  Gelb, B. D., Shi, G. P., Chapman, H. A., and Desnick, R. J. (1996) Pycnodysostosis, a lysosomal disease caused by cathepsin K deficiency. Science. 273, 1236–1238 90.  Maroteaux, P., and Lamy, M. (1962) “[Pyknodysostosis.].” Presse Med. 70, 999–1002 91.  Hou, W. S., Brömme, D., Zhao, Y., Mehler, E., et al. (1999) Characterization of novel cathepsin K mutations in the pro and mature polypeptide regions causing pycnodysostosis. J. Clin. Invest. 103, 731–738 92.  Hadjidakis, D. J., and Androulakis, I. I. (2006) Bone remodeling. Ann. N. Y. Acad. Sci. 1092, 385–396 93.  Ohno, H., Kubo, K., Murooka, H., Kobayashi, Y., et al. (2006) A c-fms tyrosine kinase inhibitor, Ki20227, suppresses osteoclast differentiation and osteolytic bone destruction in a bone metastasis model. Mol. Cancer Ther. 5, 2634–2643 94.  Narayanan, P. (2013) Denosumab: A comprehensive review. South Asian J. Cancer. 2, 272 95.  Karsdal, M. A., Henriksen, K., Sørensen, M. G., Gram, J., et al. (2005) Acidification of the osteoclastic resorption compartment provides insight into the coupling of bone formation to bone resorption. Am. J. Pathol. 166, 467–476 96.  Teitelbaum, S. L. 
(2000) Bone resorption by osteoclasts. Science. 289, 1504–1508 97.  Bi, H., Chen, X., Gao, S., Yu, X., et al. (2017) Key Triggers of Osteoclast-Related Diseases and Available Strategies for Targeted Therapies: A Review. Front. Med. 4, 234 98.  Das, S., and Crockett, J. C. (2013) Osteoporosis - a current view of pharmacological prevention and treatment. Drug Des. Devel. Ther. 7, 435–448 99.  Gennari, L., Rotatori, S., Bianciardi, S., Nuti, R., et al. (2016) Treatment needs and current options for postmenopausal osteoporosis. Expert Opin. Pharmacother. 17, 1141–1152 100.  Troen, B. R. (2006) The regulation of cathepsin K gene expression. Ann. N. Y. Acad. Sci. 1068, 165–172 101.  Furuyama, N., and Fujisawa, Y. (2000) Regulation of collagenolytic cysteine protease synthesis by estrogen in osteoclasts. Steroids. 65, 371–8 102.  Okman-Kilic, T. (2015) Estrogen Deficiency and Osteoporosis. in Advances in Osteoporosis, pp. 7–18, InTech, 10.5772/59407 103.  Roforth, M. M., Fujita, K., McGregor, U. I., Kirmani, S., et al. (2014) Effects of age on bone mRNA levels of sclerostin and other genes relevant to bone metabolism in humans. Bone. 59, 1–6 104.  Gao, Y., Huang, E., Zhang, H., Wang, J., et al. (2013) Crosstalk between Wnt/β-catenin and estrogen receptor signaling synergistically promotes osteogenic differentiation of mesenchymal progenitor cells. PLoS One. 8, e82436 105.  Becker, D. J., Kilgore, M. L., and Morrisey, M. A. (2010) The societal burden of osteoporosis. Curr. Rheumatol. Rep. 12, 186–191 106.  Tarride, J. E., Guo, N., Hopkins, R., Leslie, W. D., et al. (2012) The burden of illness of osteoporosis in Canadian men. J. Bone Miner. Res. 27, 1830–1838 107.  Rossini, M., Adami, G., Adami, S., Viapiana, O., et al. (2016) Safety issues and adverse reactions with osteoporosis management. Expert Opin. Drug Saf. 15, 321–332 108.  Watts, N. B., and Diab, D. L. (2010) Long-term use of bisphosphonates in osteoporosis. J. Clin. Endocrinol. Metab. 
95, 1555–1565 109.  Kennel, K. A., and Drake, M. T. (2009) Adverse effects of bisphosphonates: Implications for osteoporosis management. Mayo Clin. Proc. 84, 632–638 110.  Subbiah, V., Madsen, V. S., Raymond, A. K., Benjamin, R. S., et al. (2010) Of mice and men: Divergent risks of teriparatide-induced osteosarcoma. Osteoporos. Int. 21, 1041–1045 111.  Watanabe, A., Yoneyama, S., Nakajima, M., Sato, N., et al. (2012) Osteosarcoma in Sprague-Dawley rats after long-term treatment with teriparatide (human parathyroid hormone (1-34)). J Toxicol Sci. 37, 617–629 112.  Brömme, D., Panwar, P., and Turan, S. (2016) Cathepsin K osteoporosis trials, pycnodysostosis and mouse deficiency models: Commonalities and differences. Expert Opin. Drug Discov. 11, 457–472 113.  Panwar, P., Law, S., Jamroz, A., Azizi, P., et al. (2018) Tanshinones that selectively block the collagenase activity of cathepsin K provide a novel class of ectosteric antiresorptive agents for bone. Br. J. Pharmacol. 175, 902–923 114.  Panwar, P., Xue, L., Søe, K., Srivastava, K., et al. (2017) An Ectosteric Inhibitor of Cathepsin K Inhibits Bone Resorption in Ovariectomized Mice. J. Bone Miner. Res. 32, 2415–2430 115.  Guo, Y., Li, Y., Xue, L., Severino, R. P., et al. (2014) Salvia miltiorrhiza: An ancient Chinese herbal medicine as a source for anti-osteoporotic drugs. J. Ethnopharmacol. 155, 1401–1416 116.  Venkatasamy, R., and Spina, D. (2007) Protease inhibitors in respiratory disease: focus on asthma and chronic obstructive pulmonary disease. Expert Rev. Clin. Immunol. 3, 365–381 117.  Vergnolle, N. (2016) Protease inhibition as new therapeutic strategy for GI diseases. Gut. 65, 1215–1224 118.  DeClerck, Y. A., and Imren, S. (1994) Protease inhibitors: Role and potential therapeutic use in human cancer. Eur. J. Cancer. 30, 2170–2180 119.  Deu, E., Verdoes, M., and Bogyo, M. (2012) New approaches for dissecting protease functions to improve probe development and drug discovery. Nat. Struct. Mol. 
Biol. 19, 9–16 120.  Kasperkiewicz, P., Poreba, M., Groborz, K., and Drag, M. (2017) Emerging challenges in the design of selective substrates, inhibitors and activity-based probes for indistinguishable proteases. FEBS J. 284, 1518–1539 121.  Turk, B. (2006) Targeting proteases: Successes, failures and future prospects. Nat. Rev. Drug Discov. 5, 785–799 122.  Mehmood, S., Marcoux, J., Gault, J., Quigley, A., et al. (2016) Mass spectrometry captures off-target drug binding and provides mechanistic insights into the human metalloprotease ZMPSTE24. Nat. Chem. 8, 1152–1158 123.  Salvesen, G. S., and Riedl, S. J. (2007) Caspase Inhibition, Specifically. Structure. 15, 513–514 124.  Thompson, R. C. (1977) [19] Peptide Aldehydes: Potent Inhibitors of Serine and Cysteine Proteases. Methods Enzymol. 46, 220–225 125.  McCauley, J. A., and Rudd, M. T. (2016) Hepatitis C virus NS3/4a protease inhibitors. Curr. Opin. Pharmacol. 30, 84–92 126.  Flexner, C. (2007) HIV drug development: The next 25 years. Nat. Rev. Drug Discov. 6, 959–966 127.  Colwell, N. S., Blinder, M. A., Tsiang, M., Gibbs, C. S., et al. (1998) Allosteric effects of a monoclonal antibody against thrombin exosite II. Biochemistry. 37, 15057–15065 128.  Perni, R. B., Pitlik, J., Britt, S. D., Court, J. J., et al. (2004) Inhibitors of hepatitis C virus NS3·4A protease 2. Warhead SAR and optimization. Bioorganic Med. Chem. Lett. 14, 1441–1446 129.  Yin, Z., Patel, S. J., Wang, W. L., Wang, G., et al. (2006) Peptide inhibitors of dengue virus NS3 protease. Part 1: Warhead. Bioorganic Med. Chem. Lett. 16, 36–39 130.  Yaginuma, S., Asahi, A., Morishita, A., Hayashi, M., et al. (1989) Isolation and characterization of new thiol protease inhibitors estatins A and B. J. Antibiot. (Tokyo). 42, 1362–1369 131.  Powers, J. C., Kam, C.-M., Narasimhan, L., Oleksyszyn, J., et al. (1989) Mechanism‐based isocoumarin inhibitors for serine proteases: Use of active site structure and substrate specificity in inhibitor design. 
J. Cell. Biochem. 39, 33–46 132.  Thompson, S. K., Halbert, S. M., Bossard, M. J., Tomaszek, T. A., et al. (1997) Design of potent and selective human cathepsin K inhibitors that span the active site. Proc. Natl. Acad. Sci. U. S. A. 94, 14249–14254 133.  Richardson, J. L., Kröger, B., Hoeffken, W., Sadler, J. E., et al. (2000) Crystal structure of the human alpha-thrombin-haemadin complex: an exosite II-binding inhibitor. EMBO J. 19, 5650–60 134.  Powers, J. C. (1977) [16] Reaction of Serine Proteases with Halomethyl Ketones. Methods Enzymol. 46, 197–208 135.  Imperiali, B., and Abeles, R. H. (1986) Inhibition of Serine Proteases by Peptidyl Fluoromethyl Ketones. Biochemistry. 25, 3760–3767 136.  Barrett, A. J., Kembhavi, A. A., Brown, M. A., Kirschke, H., et al. (1982) L-trans-Epoxysuccinyl-leucylamido(4-guanidino)butane (E-64) and its analogues as inhibitors of cysteine proteinases including cathepsins B, H and L. Biochem. J. 201, 189–198 137.  Yu, Z., Caldera, P., McPhee, F., De Voss, J. J., et al. (1996) Irreversible inhibition of the HIV-1 protease: Targeting alkylating agents to the catalytic aspartate groups. J. Am. Chem. Soc. 118, 5846–5856 138.  Strelow, J. M. (2017) A perspective on the kinetics of covalent and irreversible inhibition. SLAS Discov. 22, 3–20 139.  Muhaxhiri, Z., Deng, L., Shanker, S., Sankaran, B., et al. (2013) Structural Basis of Substrate Specificity and Protease Inhibition in Norwalk Virus. J. Virol. 87, 4281–4292 140.  Tichá, A., Stanchev, S., Vinothkumar, K. R., Mikles, D. C., et al. (2017) General and Modular Strategy for Designing Potent, Selective, and Pharmacologically Compliant Inhibitors of Rhomboid Proteases. Cell Chem. Biol. 24, 1523–1536.e4 141.  Grammer, T. C., and Blenis, J. (1996) The serine protease inhibitors, tosylphenylalanine chloromethyl ketone and tosyllysine chloromethyl ketone, potently inhibit pp70(s6k) activation. J. Biol. Chem. 271, 23650–23652 142.  Ermer, A., Baumann, H., Steude, G., Peters, K., et al. 
(1990) Peptide diazomethyl ketones are inhibitors of subtilisin-type serine proteases. J. Enzyme Inhib. Med. Chem. 4, 35–42 143.  Porter, S. B., Hildebrandt, E. R., Breevoort, S. R., Mokry, D. Z., et al. (2007) Inhibition of the CaaX proteases Rce1p and Ste24p by peptidyl (acyloxy)methyl ketones. Biochim. Biophys. Acta - Mol. Cell Res. 1773, 853–862 144.  Matsumoto, K., Mizoue, K., Kitamura, K., Tse, W. C., et al. (1999) Structural basis of inhibition of cysteine proteases by E-64 and its derivatives. Biopolym. - Pept. Sci. Sect. 51, 99–107 145.  Woo, J. T., Ono, H., and Tsuji, T. (1995) Cathestatins, new cysteine protease inhibitors produced by Penicillium citrinum. Biosci. Biotechnol. Biochem. 59, 350–352 146.  Jacobsen, W., Christians, U., and Benet, L. Z. (2000) In vitro evaluation of the disposition of a novel cysteine protease inhibitor. Drug Metab. Dispos. 28, 1343–1351 147.  Palmer, J. T., Rasnick, D., Klaus, J. L., and Brömme, D. (1995) Vinyl Sulfones as Mechanism-Based Cysteine Protease Inhibitors. J. Med. Chem. 38, 3193–3196 148.  Powers, J. C., and Frank Gupton, B. (1977) [17] Reaction of Serine Proteases with Aza-Amino Acid and Aza-Peptide Derivatives. Methods Enzymol. 46, 208–216 149.  Cognetta, A. B., Niphakis, M. J., Lee, H. C., Martini, M. L., et al. (2015) Selective N-Hydroxyhydantoin Carbamate Inhibitors of Mammalian Serine Hydrolases. Chem. Biol. 22, 928–937 150.  Alexander, J. P., and Cravatt, B. F. (2005) Mechanism of carbamate inactivation of FAAH: Implications for the design of covalent inhibitors and in vivo functional probes for enzymes. Chem. Biol. 12, 1179–1187 151.  Marshall, W. F., and Blair, J. E. (1999) The cephalosporins. Mayo Clin. Proc. 74, 187–195 152.  Lundqvist, H., and Dahlgren, C. (1995) The serine protease inhibitor diisopropylfluorophosphate inhibits neutrophil NADPH-oxidase activity induced by the calcium ionophore ionomycin and serum opsonised yeast particles. Inflamm. Res. 44, 510–517 153.  Fahrney, D. 
E., and Gold, A. M. (1963) Sulfonyl Fluorides as Inhibitors of Esterases. I. Rates of Reaction with Acetylcholinesterase, α-Chymotrypsin, and Trypsin. J. Am. Chem. Soc. 85, 997–1000 154.  Gauthier, J. Y., Chauret, N., Cromlish, W., Desmarais, S., et al. (2008) The discovery of odanacatib (MK-0822), a selective inhibitor of cathepsin K. Bioorganic Med. Chem. Lett. 18, 923–928 155.  Munoz, B., Giam, C. Z., and Wong, C. H. (1994) α-Ketoamide Phe-Pro isostere as a new core structure for the inhibition of HIV protease. Bioorganic Med. Chem. 2, 1085–1090 156.  Zimmerman, M., Morman, H., Mulvey, D., Jones, H., et al. (1980) Inhibition of elastase and other serine proteases by heterocyclic acylating agents. J. Biol. Chem. 255, 9848–9851 157.  Thompson, R. C. (1973) Use of Peptide Aldehydes to Generate Transition-State Analogs of Elastase. Biochemistry. 12, 47–51 158.  Kuramochi, H., Nakata, H., and Ishii, S. I. (1979) Mechanism of association of a specific aldehyde inhibitor, leupeptin, with bovine trypsin. J. Biochem. 86, 1403–1410 159.  Frase, H., and Lee, I. (2007) Peptidyl boronates inhibit Salmonella enterica serovar typhimurium lon protease by a competitive ATP-dependent mechanism. Biochemistry. 46, 6647–6657 160.  Poulos, T. L., Alden, R. A., Freer, S. T., Birktoft, J. J., et al. (1976) Polypeptide halomethyl ketones bind to serine proteases as analogs of the tetrahedral intermediate. X ray crystallographic comparison of lysine and phenylalanine polypeptide chloromethyl ketone inhibited subtilisin. J. Biol. Chem. 251, 1097–1103 161.  Thornberry, N. A., Peterson, E. P., Zhao, J. J., Howard, A. D., et al. (1994) Inactivation of Interleukin-1β Converting Enzyme by Peptide (Acyloxy)methyl Ketones. Biochemistry. 33, 3934–3940 162.  Smith, R. A., Copp, L. J., Coles, P. J., Robinson, V. J., et al. (1988) New inhibitors of cysteine proteinases. Peptidyl acyloxymethyl ketones and the quiescent nucleofuge strategy. J. Am. Chem. Soc. 110, 4429–4431 163.  
Nowak, N., Lotter, H., Tannich, E., and Bruchhaus, I. (2004) Resistance of Entamoeba histolytica to the cysteine proteinase inhibitor E64 is associated with secretion of pro-enzymes and reduced pathogenicity. J. Biol. Chem. 279, 38260–38266 164.  Dalton, J. P., Clough, K. A., Jones, M. K., and Brindley, P. J. (1997) The cysteine proteinases of Schistosoma mansoni cercariae. Parasitology. 114, 105–112 165.  Brady, K. D. (1998) Bimodal inhibition of caspase-1 by aryloxymethyl and acyloxymethyl ketones. Biochemistry. 37, 8508–8515 166.  Siklos, M., BenAissa, M., and Thatcher, G. R. J. (2015) Cysteine proteases as therapeutic targets: Does selectivity matter? A systematic review of calpain and cathepsin inhibitors. Acta Pharm. Sin. B. 5, 506–519 167.  Ring, B., Wrighton, S. A., and Mohutsky, M. (2014) Reversible mechanisms of enzyme inhibition and resulting clinical significance. in Methods in Molecular Biology, vol. 1113, pp. 37–56 168.  Morrison, J. F. (1969) Kinetics of the reversible inhibition of enzyme-catalysed reactions by tight-binding inhibitors. BBA - Enzymol. 185, 269–286 169.  Bauer, R. A. (2015) Covalent inhibitors in drug discovery: From accidental discoveries to avoided liabilities and designed therapies. Drug Discov. Today. 20, 1061–1073 170.  Lee, C. U., and Grossmann, T. N. (2012) Reversible covalent inhibition of a protein target. Angew. Chemie - Int. Ed. 51, 8699–8700 171.  Moon, J. B., Coleman, R. S., and Hanzlik, R. P. (1986) Reversible Covalent Inhibition of Papain by a Peptide Nitrile. 13C NMR Evidence for a Thioimidate Ester Adduct. J. Am. Chem. Soc. 108, 1350–1351 172.  Dufour, E., Storer, A. C., and Menard, R. (1995) Peptide Aldehydes and Nitriles as Transition State Analog Inhibitors of Cysteine Proteases. Biochemistry. 34, 9136–9143 173.  Wolfenden, R. (1976) Transition State Analog Inhibitors and Enzyme Catalysis. Annu. Rev. Biophys. Bioeng. 5, 271–306 174.  Quesne, M. G., Ward, R. A., and de Visser, S. P. 
(2013) Cysteine protease inhibition by nitrile-based inhibitors: a computational study. Front. Chem. 1, 1–10 175.  Brömme, D., and Lecaille, F. (2009) Cathepsin K inhibitors for osteoporosis and potential off-target effects. Expert Opin. Investig. Drugs. 18, 585–600 176.  Black, W. C. (2010) Peptidomimetic Inhibitors of Cathepsin K. Curr. Top. Med. Chem. 10, 745–751 177.  Black, W. C., and Percival, M. D. (2006) The consequences of lysosomotropism on the design of selective cathepsin K inhibitors. ChemBioChem. 7, 1525–1535 178.  Jerome, C., Missbach, M., and Gamse, R. (2012) Balicatib, a cathepsin K inhibitor, stimulates periosteal bone formation in monkeys. Osteoporos. Int. 23, 339–349 179.  Mullard, A. (2016) Merck & Co. drops osteoporosis drug odanacatib. Nat. Rev. Drug Discov. 15, 669 180.  Schultz, T. C., Valenzano, J. P., Verzella, J. L., and Umland, E. M. (2015) Odanacatib: An emerging novel treatment alternative for postmenopausal osteoporosis. Women’s Heal. 11, 805–814 181.  Eastell, R., Nagase, S., Small, M., Boonen, S., et al. (2014) Effect of ONO-5334 on bone mineral density and biochemical markers of bone turnover in postmenopausal osteoporosis: 2-Year results from the OCEAN study. J. Bone Miner. Res. 29, 458–466 182.  Tanaka, M., Hashimoto, Y., Hasegawa, C., Deacon, S., et al. (2017) Antiresorptive effect of a cathepsin K inhibitor ONO-5334 and its relationship to BMD increase in a phase II trial for postmenopausal osteoporosis. BMC Musculoskelet. Disord. 18, 267 183.  Lindström, E., Rizoska, B., Tunblad, K., Edenius, C., et al. (2018) The selective cathepsin K inhibitor MIV-711 attenuates joint pathology in experimental animal models of osteoarthritis. J. Transl. Med. 16, 56 184.  Sophocleous, A., and Idris, A. I. (2014) Rodent models of osteoporosis. Bonekey Rep. 3, 614 185.  Kumar, S., Dare, L., Vasko-Moser, J. A., James, I. E., et al. 
(2007) A highly potent inhibitor of cathepsin K (relacatib) reduces biomarkers of bone resorption both in vitro and in an acute model of elevated bone turnover in vivo in monkeys. Bone. 40, 122–131 186.  Visser, A. W., de Mutsert, R., Loef, M., le Cessie, S., et al. (2014) The role of fat mass and skeletal muscle mass in knee osteoarthritis is different for men and women: The NEO study. Osteoarthr. Cartil. 22, 197–202 187.  Rezaie, A. R. (2003) Exosite-dependent regulation of the protein C anticoagulant pathway. Trends Cardiovasc. Med. 13, 8–15 188.  Rzychon, M., Chmiel, D., and Stec-Niemczyk, J. (2004) Modes of inhibition of cysteine proteases. Acta Biochim. Pol. 51, 861–873 189.  Gettins, P. G. W., and Olson, S. T. (2009) Exosite determinants of serpin specificity. J. Biol. Chem. 284, 20441–20445 190.  Bing, D. H., Cory, M., and Fenton, J. W., 2nd (1977) Exosite affinity labeling of human thrombins similar labeling on the a chain and b chain fragments of alpha and beta gamma thrombins. J. Biol. Chem. 252, 8027–8034 191.  Di Cera, E., Dang, Q. D., and Ayala, Y. M. (1997) Molecular mechanisms of thrombin function. Cell. Mol. Life Sci. 53, 701–730 192.  Verhamme, I. M., Olson, S. T., Tollefsen, D. M., and Bock, P. E. (2002) Binding of exosite ligands to human thrombin. Re-evaluation of allosteric linkage between thrombin exosites I and II. J. Biol. Chem. 277, 6788–6798 193.  Bode, W., Turk, D., and Karshikov, A. (1992) The refined 1.9‐Å X‐ray crystal structure of d‐Phe‐Pro‐Arg chloromethylketone‐inhibited human α‐thrombin: Structure analysis, overall structure, electrostatic properties, detailed active‐site geometry, and structure‐function relationships. Protein Sci. 1, 426–471 194.  Stone, S. R., and Hofsteenge, J. (1986) Kinetics of the Inhibition of Thrombin by Hirudin. Biochemistry. 25, 4622–4628 195.  Naski, M. C., Fenton, J. W., Maraganore, J. M., Olson, S. T., et al. (1990) The COOH-terminal domain of hirudin. 
An exosite-directed competitive inhibitor of the action of α-thrombin on fibrinogen. J. Biol. Chem. 265, 13484–13489 196.  Mohammed, S. F., Whitworth, C., Chuang, H. Y., Lundblad, R. L., et al. (1976) Multiple active forms of thrombin: binding to platelets and effects on platelet function. Proc. Natl. Acad. Sci. U. S. A. 73, 1660–1663 197.  Fenton, J. W., Villanueva, G. B., Ofosu, F. A., and Maraganore, J. M. (1991) Thrombin inhibition by hirudin: How hirudin inhibits thrombin. Pathophysiol. Haemost. Thromb. 21, 27–31 198.  Sheehan, J. P., and Sadler, J. E. (1994) Molecular mapping of the heparin-binding exosite of thrombin (antithrombin III/serine proteases). Biochemistry. 91, 5518–5522 199.  Chahal, G., Thorpe, M., and Hellman, L. (2015) The importance of exosite interactions for substrate cleavage by human thrombin. PLoS One. 10, e0129511 200.  Segers, K., Dahlbäck, B., Bock, P. E., Tans, G., et al. (2007) The role of thrombin exosites I and II in the activation of human coagulation factor V. J. Biol. Chem. 282, 33915–33924 201.  Van Doren, S. R. (2015) Matrix metalloproteinase interactions with collagen and elastin. Matrix Biol. 44–46, 224–231 202.  Arnold, L. H., Butt, L. E., Prior, S. H., Read, C. M., et al. (2011) The interface between catalytic and hemopexin domains in matrix metalloproteinase-1 conceals a collagen binding exosite. J. Biol. Chem. 286, 45073–45082 203.  Manka, S. W., Carafoli, F., Visse, R., Bihan, D., et al. (2012) Structural insights into triple-helical collagen cleavage by matrix metalloproteinase 1. Proc. Natl. Acad. Sci. 109, 12461–12466 204.  Fields, G. B. (2015) New strategies for targeting matrix metalloproteinases. Matrix Biol. 44–46, 239–246 205.  Overall, C. M., and Kleifeld, O. (2006) Towards third generation matrix metalloproteinase inhibitors for cancer therapy. Br. J. Cancer. 94, 941–946 206.  Šilhár, P., Čapková, K., Salzameda, N. T., Barbieri, J. T., et al. 
(2010) Botulinum neurotoxin A protease: Discovery of natural product exosite inhibitors. J. Am. Chem. Soc. 132, 2868–2869 207.  Sharma, V., Panwar, P., O’Donoghue, A. J., Cui, H., et al. (2014) Structural requirements for the collagenase and elastase activity of cathepsin K and its selective inhibition by an exosite inhibitor. Biochem. J. 465, 163–173 208.  Johansson, R., Jonna, V. R., Kumar, R., Nayeri, N., et al. (2016) Erratum: Structural Mechanism of Allosteric Activity Regulation in a Ribonucleotide Reductase with Double ATP Cones (Structure (2016) 24(6) (906–917)). Structure. 24, 1432–1434 209.  Shen, A. (2010) Allosteric regulation of protease activity by small molecules. Mol. Biosyst. 6, 1431–1443 210.  Li, H., Lim, K. S., Kim, H., Hinds, T. R., et al. (2016) Allosteric Activation of Ubiquitin-Specific Proteases by β-Propeller Proteins UAF1 and WDR20. Mol. Cell. 63, 249–260 211.  Koshland, D. E. (1995) The Key–Lock Theory and the Induced Fit Theory. Angew. Chemie Int. Ed. English. 33, 2375–2378 212.  Baldwin, A. J., and Kay, L. E. (2009) NMR spectroscopy brings invisible protein states into focus. Nat. Chem. Biol. 5, 808–814 213.  Goodey, N. M., and Benkovic, S. J. (2008) Allosteric regulation and catalysis emerge via a common route. Nat. Chem. Biol. 4, 474–482 214.  Kar, G., Keskin, O., Gursoy, A., and Nussinov, R. (2010) Allostery and population shift in drug discovery. Curr. Opin. Pharmacol. 10, 715–722 215.  del Sol, A., Tsai, C. J., Ma, B., and Nussinov, R. (2009) The Origin of Allosteric Functional Modulation: Multiple Pre-existing Pathways. Structure. 17, 1042–1050 216.  McIlwain, D. R., Berger, T., and Mak, T. W. (2013) Caspase functions in cell death and disease. Cold Spring Harb. Perspect. Biol. 5, 1–28 217.  Shiozaki, E. N., Chai, J., Rigotti, D. J., Riedl, S. J., et al. (2003) Mechanism of XIAP-mediated inhibition of caspase-9. Mol. Cell. 11, 519–527 218.  Li, P., Zhou, L., Zhao, T., Liu, X., et al. 
(2017) Caspase-9: structure, mechanisms and clinical application. Oncotarget. 8, 23996–24008 219.  Erlanson, D. A., Braisted, A. C., Raphael, D. R., Randal, M., et al. (2000) Site-directed ligand discovery. Proc. Natl. Acad. Sci. 97, 9367–9372 220.  Scheer, J. M., Romanowski, M. J., and Wells, J. A. (2006) A common allosteric site and mechanism in caspases. Proc. Natl. Acad. Sci. U. S. A. 103, 7595–7600 221.  Riedl, S. J., Fuentes-Prior, P., Renatus, M., Kairies, N., et al. (2001) Structural basis for the activation of human procaspase-7. Proc. Natl. Acad. Sci. U. S. A. 98, 14790–5 222.  Hanakahi, L. A., Bartlet-Jones, M., Chappell, C., Pappin, D., et al. (2000) Binding of inositol phosphate to DNA-PK and stimulation of double-strand break repair. Cell. 102, 721–729 223.  Macbeth, M. R., Schubert, H. L., VanDemark, A. F., Lingam, A. T., et al. (2005) Structural biology: Inositol hexakisphosphate is bound in the ADAR2 core and required for RNA editing. Science. 309, 1534–1539 224.  Jank, T., and Aktories, K. (2008) Structure and mode of action of clostridial glucosylating toxins: the ABCD model. Trends Microbiol. 16, 222–229 225.  Fullner Satchell, K. J. (2007) MARTX, multifunctional autoprocessing repeats-in-toxin toxins. Infect. Immun. 75, 5079–5084 226.  Rupnik, M., Wilcox, M. H., and Gerding, D. N. (2009) Clostridium difficile infection: New developments in epidemiology and pathogenesis. Nat. Rev. Microbiol. 7, 526–536 227.  Olivier, V., Haines, G. K., Tan, Y., and Fullner Satchell, K. J. (2007) Hemolysin and the multifunctional autoprocessing RTX toxin are virulence factors during intestinal infection of mice with Vibrio cholerae El Tor O1 strains. Infect. Immun. 75, 5035–5042 228.  Lupardus, P. J., Shen, A., Bogyo, M., and Garcia, K. C. (2008) Small molecule-induced allosteric activation of the Vibrio cholerae RTX cysteine protease domain. Science. 322, 265–268 229.  Prochazkova, K., Shuvalova, L. A., Minasov, G., Voburka, Z., et al. 
(2009) Structural and molecular mechanism for autoprocessing of MARTX toxin of Vibrio cholerae at multiple sites. J. Biol. Chem. 284, 26557–26568 230.  Du, X., Chen, N. L. H., Wong, A., Craik, C. S., et al. (2013) Elastin degradation by cathepsin V requires two exosites. J. Biol. Chem. 288, 34871–34881 231.  Li, Z., Hou, W. S., Escalante-Torres, C. R., Gelb, B. D., et al. (2002) Collagenase activity of cathepsin K depends on complex formation with chondroitin sulfate. J. Biol. Chem. 277, 28669–28676 232.  Aguda, A. H., Panwar, P., Du, X., Nguyen, N. T., et al. (2014) Structural basis of collagen fiber degradation by cathepsin K. Proc. Natl. Acad. Sci. 111, 17474–17479 233.  Law, S., Panwar, P., Li, J., Aguda, A. H., et al. (2017) A composite docking approach for the identification and characterization of ectosteric inhibitors of cathepsin K. PLoS One. 12, e0186869 234.  Wang, L., Ma, R., Liu, C., Liu, H., et al. (2017) Salvia miltiorrhiza: A Potential Red Light to the Development of Cardiovascular Diseases. Curr. Pharm. Des. 23, 1077–1097 235.  Cui, Y., Bhandary, B., Marahatta, A., Lee, G. H., et al. (2011) Characterization of Salvia Miltiorrhiza ethanol extract as an anti-osteoporotic agent. BMC Complement. Altern. Med. 11, 120 236.  Janzen, W. P. (2014) Screening technologies for small molecule discovery: The state of the art. Chem. Biol. 21, 1162–1170 237.  Wood, W. J. L., Huang, L., and Ellman, J. A. (2003) Synthesis of a Diverse Library of Mechanism-Based Cysteine Protease Inhibitors. J. Comb. Chem. 5, 869–880 238.  Blanchard, J. E., Elowe, N. H., Huitema, C., Fortin, P. D., et al. (2004) High-throughput screening identifies inhibitors of the SARS coronavirus main proteinase. Chem. Biol. 11, 1445–1453 239.  Lee, H., Zhu, T., Patel, K., Zhang, Y. Y., et al. (2013) High-Throughput Screening (HTS) and Hit Validation to Identify Small Molecule Inhibitors with Activity against NS3/4A proteases from Multiple Hepatitis C Virus Genotypes. PLoS One. 8, e75144 240. 
 Balasubramanian, A., Manzano, M., Teramoto, T., Pilankatta, R., et al. (2016) High-throughput screening for the identification of small-molecule inhibitors of the flaviviral protease. Antiviral Res. 134, 6–16 241.  Rossi, A. M., and Taylor, C. W. (2011) Analysis of protein-ligand interactions by fluorescence polarization. Nat. Protoc. 6, 365–387 242.  Hall, M. D., Yasgar, A., Peryea, T., Braisted, J. C., et al. (2016) Fluorescence polarization assays in high-throughput screening and drug discovery: A review. Methods Appl. Fluoresc. 4, 022001 243.  Lea, W. A., and Simeonov, A. (2011) Fluorescence polarization assays in small molecule screening. Expert Opin. Drug Discov. 6, 17–32 244.  Ou-Yang, S., Lu, J., Kong, X., Liang, Z., Luo, C., and Jiang, H. (2012) Review - Computational drug discovery. Acta Pharmacol. Sin. 33, 1131–1140 245.  Valasani, K. R., Vangavaragu, J. R., Day, V. W., and Yan, S. S. (2014) Structure based design, synthesis, pharmacophore modeling, virtual screening, and molecular docking studies for identification of novel cyclophilin D inhibitors. J. Chem. Inf. Model. 54, 902–912 246.  Sliwoski, G., Kothiwale, S., Meiler, J., and Lowe, E. W. (2013) Computational Methods in Drug Discovery. Pharmacol. Rev. 66, 334–395 247.  Meng, X.-Y., Zhang, H.-X., Mezei, M., and Cui, M. (2011) Molecular docking: a powerful approach for structure-based drug discovery. Curr. Comput. Aided. Drug Des. 7, 146–57 248.  Hu, X., Legler, P. M., Southall, N., Maloney, D. J., et al. (2014) Structural insight into exosite binding and discovery of novel exosite inhibitors of botulinum neurotoxin serotype A through in silico screening. J. Comput. Aided. Mol. Des. 28, 765–778 249.  Montecucco, C., and Schiavo, G. (1995) Structure and Function of Tetanus and Botulinum Neurotoxins. Q. Rev. Biophys. 28, 423–472 250.  Schiavo, G., Matteoli, M., and Montecucco, C. (2000) Neurotoxins affecting neuroexocytosis. Physiol. Rev. 80, 717–766 251.  Breidenbach, M. A., and Brunger, A. T. 
(2004) Substrate recognition strategy for botulinum neurotoxin serotype A. Nature. 432, 925–929 252.  Novinec, M., Korenč, M., Caflisch, A., Ranganathan, R., et al. (2014) A novel allosteric mechanism in the cysteine peptidase cathepsin K discovered by computational methods. Nat. Commun. 5, 3287 253.  Law, S., Andrault, P.-M., Aguda, A. H., Nguyen, N. T., et al. (2017) Identification of mouse cathepsin K structural elements that regulate the potency of odanacatib. Biochem. J. 474, 851–864 254.  Selent, J., Kaleta, J., Li, Z., Lalmanach, G., et al. (2007) Selective inhibition of the collagenase activity of cathepsin K. J. Biol. Chem. 282, 16492–16501 255.  Komori, T. (2015) Animal models for osteoporosis. Eur. J. Pharmacol. 759, 287–294 256.  Desmarais, S., Massé, F., and Percival, M. D. (2009) Pharmacological inhibitors to identify roles of cathepsin K in cell-based studies: A comparison of available tools. Biol. Chem. 390, 941–948 257.  Desmarais, S., Black, W. C., Oballa, R., Lamontagne, S., et al. (2007) Effect of Cathepsin K Inhibitor Basicity on in Vivo Off-Target Activities. Mol. Pharmacol. 73, 147–156 258.  Linnevers, C. J., McGrath, M. E., Armstrong, R., Mistry, F. R., et al. (1997) Expression of human cathepsin K in Pichia pastoris and preliminary crystallographic studies of an inhibitor complex. Protein Sci. 6, 919–21 259.  Battye, T. G. G., Kontogiannis, L., Johnson, O., Powell, H. R., et al. (2011) iMOSFLM: A new graphical interface for diffraction-image processing with MOSFLM. Acta Crystallogr. Sect. D Biol. Crystallogr. 67, 271–281 260.  Winn, M. D., Ballard, C. C., Cowtan, K. D., Dodson, E. J., et al. (2011) Overview of the CCP4 suite and current developments. Acta Crystallogr. Sect. D Biol. Crystallogr. 67, 235–242 261.  McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., et al. (2007) Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674 262.  Moriarty, N. W., Grosse-Kunstleve, R. W., and Adams, P. D. 
(2009) Electronic ligand builder and optimization workbench (eLBOW): A tool for ligand coordinate and restraint generation. Acta Crystallogr. Sect. D Biol. Crystallogr. 65, 1074–1080 263.  Adams, P. D., Afonine, P. V., Bunkóczi, G., Chen, V. B., et al. (2010) PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. Sect. D Biol. Crystallogr. 66, 213–221 264.  Vaguine, A. A., Richelle, J., and Wodak, S. J. (1999) SFCHECK: A unified set of procedures for evaluating the quality of macromolecular structure-factor data and their agreement with the atomic model. Acta Crystallogr. Sect. D Biol. Crystallogr. 55, 191–205 265.  Collaborative Computational Project, Number 4 (1994) The CCP4 suite: Programs for protein crystallography. Acta Crystallogr. Sect. D Biol. Crystallogr. 50, 760–763 266.  Dixon, M. (1953) The determination of enzyme inhibitor constants. Biochem. J. 55, 170–171 267.  Friesner, R. A., Banks, J. L., Murphy, R. B., Halgren, T. A., et al. (2004) Glide: A New Approach for Rapid, Accurate Docking and Scoring. 1. Method and Assessment of Docking Accuracy. J. Med. Chem. 47, 1739–1749 268.  Li, C. S., Deschenes, D., Desmarais, S., Falgueyret, J. P., et al. (2006) Identification of a potent and selective non-basic cathepsin K inhibitor. Bioorganic Med. Chem. Lett. 16, 1985–1989 269.  Krissinel, E., and Henrick, K. (2007) Inference of Macromolecular Assemblies from Crystalline State. J. Mol. Biol. 372, 774–797 270.  Kawabata, T. (2010) Detection of multiscale pockets on protein surfaces using mathematical morphology. Proteins Struct. Funct. Bioinforma. 78, 1195–1211 271.  Rünger, T. M., Adami, S., Benhamou, C. L., Czerwiński, E., et al. (2012) Morphea-like skin reactions in patients treated with the cathepsin K inhibitor balicatib. J. Am. Acad. Dermatol. 66, e89-96 272.  Eastell, R., Nagase, S., Ohyama, M., Small, M., et al. 
(2011) Safety and efficacy of the cathepsin K inhibitor ONO-5334 in postmenopausal osteoporosis: The OCEAN study. J. Bone Miner. Res. 26, 1303–1312 273.  Bhatia, S., Daschkey, S., Lang, F., Borkhardt, A., et al. (2016) Mouse models for pre-clinical drug testing in leukemia. Expert Opin. Drug Discov. 11, 1081–1091 274.  Li, F., Lu, J., Cheng, J., Wang, L., et al. (2013) Human PXR modulates hepatotoxicity 236  associated with rifampicin and isoniazid co-therapy. Nat. Med. 19, 418–420 275.  Xu, D., Nishimura, T., Nishimura, S., Zhang, H., et al. (2014) Fialuridine Induces Acute Liver Failure in Chimeric TK-NOG Mice: A Model for Detecting Hepatic Drug Toxicity Prior to Human Testing. PLoS Med. 11, e1001628 276.  McKenzie, R., Fried, M. W., Sallie, R., Conjeevaram, H., et al. (1995) Hepatic Failure and Lactic Acidosis Due to Fialuridine (FIAU), an Investigational Nucleoside Analogue for Chronic Hepatitis B. N. Engl. J. Med. 333, 1099–1105 277.  Dossetter, A. G., Beeley, H., Bowyer, J., Cook, C. R., et al. (2012) (1 R,2 R)-N-(1-cyanocyclopropyl)-2-(6-methoxy-1,3,4,5-tetrahydropyrido[4,3- b]indole-2-carbonyl)cyclohexanecarboxamide (AZD4996): A potent and highly selective cathepsin k inhibitor for the treatment of osteoarthritis. J. Med. Chem. 55, 6363–6374 278.  Lecaille, F., Kaleta, J., and Brömme, D. (2002) Human and parasitic Papain-like cysteine proteases: Their role in physiology and pathology and recent developments in inhibitor design. Chem. Rev. 102, 4459–4488 279.  Kafienah,  el, Bro, D., Buttle, D. J., Croucher, L. J., et al. (1998) Human cathepsin K cleaves native type I and II collagens at the N-terminal end of the triple helix. Biochem. J. 331, 727–732 280.  Yasuda, Y., Li, Z., Greenbaum, D., Bogyo, M., et al. (2004) Cathepsin V, a novel and potent elastolytic activity expressed in activated macrophages. J. Biol. Chem. 279, 36761–36770 281.  Brömme, D., and Kaleta, J. 
(2002) Thiol-dependent cathepsins: pathophysiological implications and recent advances in inhibitor design. Curr. Pharm. Des. 8, 1639–1658 282.  Brömme, D. (2011) Cysteine cathepsins and the skeleton. Clin. Rev. Bone Miner. Metab. 9, 83–93 283.  Pennypacker, B. L., Chen, C. M., Zheng, H., Shih, M. S., et al. (2014) Inhibition of cathepsin K increases modeling-based bone formation, and improves cortical dimension and strength in adult ovariectomized monkeys. J. Bone Miner. Res. 29, 1847–1858 237  284.  Panwar, P., Søe, K., Guido, R. V., Bueno, R. V. C., et al. (2016) A novel approach to inhibit bone resorption: Exosite inhibitors against cathepsin K. Br. J. Pharmacol. 173, 396–410 285.  Madhavi Sastry, G., Adzhigirey, M., Day, T., Annabhimoju, R., et al. (2013) Protein and ligand preparation: Parameters, protocols, and influence on virtual screening enrichments. J. Comput. Aided. Mol. Des. 27, 221–234 286.  Spitzer, R., and Jain, A. N. (2012) Surflex-Dock: Docking benchmarks and real-world application. J. Comput. Aided. Mol. Des. 26, 687–699 287.  Jones, G., Willett, P., Glen, R. C., Leach, A. R., et al. (1997) Development and validation of a genetic algorithm for flexible docking. J. Mol. Biol. 267, 727–748 288.  Panwar, P., Butler, G. S., Jamroz, A., Azizi, P., et al. (2018) Aging-associated modifications of collagen affect its degradation by matrix metalloproteinases. Matrix Biol. 65, 30–44 289.  Schneider, C. A., Rasband, W. S., and Eliceiri, K. W. (2012) NIH Image to ImageJ: 25 years of image analysis. Nat. Methods. 9, 671–675 290.  Søe, K., and Delaissé, J. M. (2010) Glucocorticoids maintain human osteoclasts in the active mode of their resorption cycle. J. Bone Miner. Res. 25, 2184–2192 291.  Halgren, T. A. (2009) Identifying and characterizing binding sites and assessing druggability. J. Chem. Inf. Model. 49, 377–389 292.  Pan, Y., Huang, N., Cho, S., and MacKerell, A. D. 
(2003) Consideration of molecular weight during compound selection in virtual target-based database screening. J. Chem. Inf. Comput. Sci. 43, 267–272 293.  Ghose, A. K., Viswanadhan, V. N., and Wendoloski, J. J. (1999) A knowledge-based approach in designing combinatorial or medicinal chemistry libraries for drug discovery. 1. A qualitative and quantitative characterization of known drug databases. J. Comb. Chem. 1, 55–68 294.  Houston, D. R., Yen, L. H., Pettit, S., and Walkinshaw, M. D. (2015) Structure- and 238  ligand-based virtual screening identifies new scaffolds for inhibitors of the oncoprotein MDM2. PLoS One. 10, e0121424 295.  Domínguez, J. L., Fernández-Nieto, F., Castro, M., Catto, M., et al. (2015) Computer-aided structure-based design of multitarget leads for Alzheimer’s disease. J. Chem. Inf. Model. 55, 135–148 296.  Alonso, H., Bliznyuk, A. A., and Gready, J. E. (2006) Combining docking and molecular dynamic simulations in drug design. Med. Res. Rev. 26, 531–568 297.  Sullivan, K., Cramer-Morales, K., McElroy, D. L., Ostrov, D. A., et al. (2016) Identification of a Small Molecule Inhibitor of RAD52 by Structure-Based Selection. PLoS One. 11, e0147230 298.  Corsino, P., Horenstein, N., Ostrov, D., Rowe, T., et al. (2009) A novel class of cyclin-dependent kinase inhibitors identified by molecular docking act through a unique mechanism. J. Biol. Chem. 284, 29945–29955 299.  Rastelli, G., Ferrari, A. M., Costantino, L., and Gamberini, M. C. (2002) Discovery of new inhibitors of aldose reductase from molecular docking and database screening. Bioorganic Med. Chem. 10, 1437–1450 300.  Grinter, S. Z., and Zou, X. (2014) Challenges, applications, and recent advances of protein-ligand docking in structure-based drug design. Molecules. 19, 10150–10176 301.  Carta, G., Knox, A. J. S., and Lloyd, D. G. (2007) Unbiasing scoring functions: A new normalization and rescoring strategy. J. Chem. Inf. Model. 47, 1564–1571 302.  Ferrara, P., Gohlke, H., Price, D. 
J., Klebe, G., et al. (2004) Assessing scoring functions for protein-ligand interactions. J. Med. Chem. 47, 3032–3047 303.  Ntie-Kang, F. (2013) An in silico evaluation of the ADMET profile of the StreptomeDB database. Springerplus. 2, 1–11 304.  Feng, B. Y., Simeonov, A., Jadhav, A., Babaoglu, K., et al. (2007) A high-throughput screen for aggregation-based inhibition in a large compound library. J. Med. Chem. 50, 2385–2390 239  305.  Feng, B. Y., and Shoichet, B. K. (2006) A detergent-based assay for the detection of promiscuous inhibitors. Nat. Protoc. 1, 550–553 306.  Dunstan, M. S., Barnes, J., Humphries, M., Whitehead, R. C., et al. (2011) Novel inhibitors of NRH:Quinone oxidoreductase 2 (NQO2): Crystal structures, biochemical activity, and intracellular effects of imidazoacridin-6-ones. J. Med. Chem. 54, 6597–6611 307.  Martis, E. A., Radhakrishnan, R., and Badve, R. R. (2011) High-throughput screening: The hits and leads of drug discovery-An overview. J. Appl. Pharm. Sci. 1, 2–10 308.  Smilkstein, M., Sriwilaijaroen, N., Kelly, J. X., Wilairat, P., et al. (2004) Simple and Inexpensive Fluorescence-Based Technique for High-Throughput Antimalarial Drug Screening. Antimicrob. Agents Chemother. 48, 1803–1806 309.  Kim, H. Y. H., Korade, Z., Tallman, K. A., Liu, W., et al. (2016) Inhibitors of 7-Dehydrocholesterol Reductase: Screening of a Collection of Pharmacologically Active Compounds in Neuro2a Cells. Chem. Res. Toxicol. 29, 892–900 310.  Siles, S. A., Srinivasan, A., Pierce, C. G., Lopez-Ribot, J. L., et al. (2013) High-throughput screening of a collection of known pharmacologically active small compounds for identification of candida albicans biofilm inhibitors. Antimicrob. Agents Chemother. 57, 3681–3687 311.  Watamoto, T., Egusa, H., Sawase, T., and Yatani, H. (2015) Screening of pharmacologically active small molecule compounds identifies antifungal agents against Candida biofilms. Front. Microbiol. 6, 1453 312.  
Annang, F., Pérez-Moreno, G., García-Hernández, R., Cordon-Obras, C., et al. (2015) High-throughput screening platform for natural product-based drug discovery against 3 neglected tropical diseases: Human African trypanosomiasis, leishmaniasis, and chagas disease. J. Biomol. Screen. 20, 82–91 313.  Gu, J., Gui, Y., Chen, L., Yuan, G., et al. (2013) Use of Natural Products as Chemical Library for Drug Discovery and Network Pharmacology. PLoS One. 8, e62839 314.  Bugni, T. S., Richards, B., Bhoite, L., Cimbora, D., et al. (2008) Marine natural product libraries for high-throughput screening and rapid drug discovery. J. Nat. Prod. 71, 1095–240  1098 315.  Dean, R. A., Fam, H. K., An, J., Choi, K., et al. (2014) Identification of a putative tdp1 inhibitor (CD00509) by in vitro and cell-based assays. J. Biomol. Screen. 19, 1372–1382 316.  Hashemi, P., Barreto, K., Bernhard, W., Lomness, A., et al. (2017) Compounds producing an effective combinatorial regimen for disruption of HIV‐1 latency. EMBO Mol. Med. 10, e201708193 317.  Balgi, A. D., Fonseca, B. D., Donohue, E., Tsang, T. C. F., et al. (2009) Screen for chemical modulators of autophagy reveals novel therapeutic inhibitors of mTORC1 signaling. PLoS One. 4, e7124 318.  Garnero, P., Ferreras, M., Karsdal, M., Nicamhlaoibh, R., et al. (2003) The Type I Collagen Fragments ICTP and CTX Reveal Distinct Enzymatic Pathways of Bone Collagen Degradation. J. Bone Miner. Res. 18, 859–867 319.  Nallaseth, F. S., Lecaille, F., Li, Z., and Brömme, D. (2013) The role of basic amino acid surface clusters on the collagenase activity of cathepsin K. Biochemistry. 52, 7742–7752 320.  Cherney, M. M., Lecaille, F., Kienitz, M., Nallaseth, F. S., et al. (2011) Structure-activity analysis of cathepsin K/chondroitin 4-sulfate interactions. J. Biol. Chem. 286, 8988–8998 321.  Yasuda, Y., Kaleta, J., and Brömme, D. (2005) The role of cathepsins in osteoporosis and arthritis: Rationale for the design of new therapeutics. Adv. Drug Deliv. Rev. 
57, 973–993 322.  Bisaggio, D. F. R., Adade, C. M., and Souto-Padrón, T. (2008) In vitro effects of suramin on Trypanosoma cruzi. Int. J. Antimicrob. Agents. 31, 282–286 323.  Schulz-Key, H., Karam, M., and Prost, A. (1985) Suramin in the treatment of onchocerciasis: the efficacy of low doses on the parasite in an area with vector control. Trop. Med. Parasitol. 36, 244–8 324.  Chen, C.-H. C.-H., Kang, L., Lin, R.-W., Fu, Y.-C., et al. (2013) (−)-Epigallocatechin-3-gallate improves bone microarchitecture in ovariectomized rats. Menopause. 20, 687–694 325.  Song, D., Gan, M., Zou, J., Zhu, X., et al. (2014) Effect of (-)-epigallocatechin-3-gallate in preventing bone loss in ovariectomized rats and possible mechanisms. Int. J. Clin. Exp. 241  Med. 7, 4183–90 326.  Sun, K., Wang, L., Ma, Q., Cui, Q., et al. (2017) Association between tea consumption and osteoporosis: A meta-analysis. Medicine (Baltimore). 96, e9034 327.  Catchpoole, D. R., and Stewart, B. W. Inhibition of topoisomerase II by aurintricarboxylic acid: implications for mechanisms of apoptosis. Anticancer Res. 14, 853–6 328.  Gonzalez, R. G., Haxo, R. S., and Schleich, T. (1980) Mechanism of action of polymeric aurintricarboxylic acid, a potent inhibitor of protein-nucleic acid interactions. Biochemistry. 19, 4299–4303 329.  Troen, B. R. (2003) Molecular mechanisms underlying osteoclast formation and activation. Exp. Gerontol. 38, 605–614 330.  Finley, J. B., Atigadda, V. R., Duarte, F., Zhao, J. J., et al. (1999) Novel aromatic inhibitors of influenza virus neuraminidase make selective interactions with conserved residues and water molecules in the active site. J. Mol. Biol. 293, 1107–1119 331.  Bogyo, M., McMaster, J. S., Gaczynska, M., Tortorella, D., et al. (1997) Covalent modification of the active site threonine of proteasomal β subunits and the Escherichia coli homolog HsIV by a new class of inhibitors. Proc. Natl. Acad. Sci. U. S. A. 94, 6629–6634 332.  Wiedow, O., Schröder, J. 
M., Gregory, H., Young, J. a, et al. (1990) Elafin: an elastase-specific inhibitor of human skin. Purification, characterization, and complete amino acid sequence. J. Biol. Chem. 265, 14791–14795 333.  Overall, C. M., and Kleifeld, O. (2006) Towards third generation matrix metalloproteinase inhibitors for cancer therapy. Br. J. Cancer. 94, 941–946 334.  Kornev, A. P. (2018) Self-organization, entropy and allostery. Biochem. Soc. Trans. 46, 587–597 335.  Dallakyan, S., and Olson, A. J. (2015) Small-molecule library screening by docking with PyRx. Methods Mol. Biol. 1263, 243–250 336.  Emsley, P., Lohkamp, B., Scott, W. G., and Cowtan, K. (2010) Features and development 242  of Coot. Acta Crystallogr. Sect. D Biol. Crystallogr. 66, 486–501 337.  Rappsilber, J., Mann, M., and Ishihama, Y. (2007) Protocol for micro-purification, enrichment, pre-fractionation and storage of peptides for proteomics using StageTips. Nat. Protoc. 2, 1896–906 338.  Colaert, N., Helsens, K., Martens, L., Vandekerckhove, J., et al. (2009) Improved visualization of protein consensus sequences by iceLogo. Nat. Methods. 6, 786–7 339.  Xiao, Y., Hsiao, T.-H., Suresh, U., Chen, H.-I. H., et al. (2014) A novel significance score for gene selection and ranking. Bioinformatics. 30, 801–7 340.  MacLean, B., Tomazela, D. M., Shulman, N., Chambers, M., et al. (2010) Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics. 26, 966–8 341.  Lecaille, F., Weidauer, E., Juliano, M. a, Brömme, D., et al. (2003) Probing cathepsin K activity with a selective substrate spanning its active site. Biochem. J. 375, 307–312 342.  Helali, A. M., Iti, F. M., and Mohamed, I. N. (2013) Cathepsin K inhibitors: a novel target but promising approach in the treatment of osteoporosis. Curr. Drug Targets. 14, 1591–600 343.  Morko, J., Kiviranta, R., Mulari, M. T. K., Ivaska, K. K., et al. 
(2009) Overexpression of cathepsin K accelerates the resorption cycle and osteoblast differentiation in vitro. Bone. 44, 717–728    243  Appendices List of Other Co-Authored Publications  A list of all my other co-authored publications and my contributions each are listed following with their respective articles. (All articles are reproduced with permission.)  Lavallee, V., Aguda, A., Cheng, P., Bott, T., Meimetis, L., Law, S., Nguyen, N., Williams, D., Davies, J., Andersen, R., Brayer, G. and Bromme, D. Affinity Crystallography: A New Approach to Extracting High-Affinity Enzyme Inhibitors from Natural Extracts. Journal of Natural Products, 79(8), 1962–1970, Aug., 2016. I was responsible for the characterization of the inhibitory kinetics of the synthesized lichostatinal described in this study.  Kruglyak, N., Williams, D., Chen, H., Law, S., Kaleta, J., Villanueva, I., Davies, J., Andersen, R., Brömme, D. Leupeptazin, a highly modified tripeptide isolated from cultures of a Streptomyces sp. inhibits cathepsin K. Bioorganic and Medicinal Chemistry Letters. 27(6), 1397-1400. Mar., 2017. I was responsible for the molecular docking studies for leupeptazin and its related derivatives covered in this manuscript.  Panwar, P., Xue, L., Søe, K., Srivastava, K., Law, S., Delaisse, J., Brömme, D. An Ectosteric Inhibitor of Cathepsin K Inhibits Bone Resorption in Ovariectomized Mice. Journal of Bone Mineral Research. 32(12):2415-2430. doi: 10.1002/ jbmr.3227. Dec., 2017. I was responsible for the molecular docking studies of tanshinone II-A sulfonate which was characterized in this paper.  244  Panwar, P., Law, S., Jamroz, A., Azizi, P., Zhang, D., Ciufolini, M., Brömme, D. Tanshinones: A novel class of ectosteric anti-resorptives. British Journal of Pharmacology. 175(6):902-923, doi: 10.1111/bph.14133. Mar., 2018. 
I was responsible for the characterization of the collagenase activity of the 31 tanshinones described in this paper as well as the quantitative structure-activity relationship studies performed using molecular docking methods.

Affinity Crystallography: A New Approach to Extracting High-Affinity Enzyme Inhibitors from Natural Extracts

Adeleke H. Aguda,†,‡ Vincent Lavallee,‡ Ping Cheng,§ Tina M. Bott,§ Labros G. Meimetis,§ Simon Law,§ Nham T. Nguyen,‡ David E. Williams,§ Jadwiga Kaleta,† Ivan Villanueva,⊥ Julian Davies,⊥ Raymond J. Andersen,§ Gary D. Brayer,‡ and Dieter Brömme*,†,‡,∥

†Department of Oral Biological and Medical Sciences, Faculty of Dentistry, ‡Department of Biochemistry and Molecular Biology, Faculty of Medicine, §Department of Chemistry and Earth, Ocean & Atmospheric Sciences, Faculty of Science, ⊥Department of Microbiology, Faculty of Science, and ∥Centre for Blood Research, University of British Columbia, Vancouver, BC, Canada, V6T 1Z3

Supporting Information

ABSTRACT: Natural products are an important source of novel drug scaffolds. The highly variable and unpredictable timelines associated with isolating novel compounds and elucidating their structures have led to the demise of exploring natural product extract libraries in drug discovery programs. Here we introduce affinity crystallography as a new methodology that significantly shortens the time of the hit to active structure cycle in bioactive natural product discovery research. This affinity crystallography approach is illustrated by using semipure fractions of an actinomycetes culture extract to isolate and identify a cathepsin K inhibitor and to compare the outcome with the traditional assay-guided purification/structural analysis approach.
The traditional approach resulted in the identification of the known inhibitor antipain (1) and its new but lower potency dehydration product 2, while the affinity crystallography approach led to the identification of a new high-affinity inhibitor named lichostatinal (3). The structure and potency of lichostatinal (3) was verified by total synthesis and kinetic characterization. To the best of our knowledge, this is the first example of isolating and characterizing a potent enzyme inhibitor from a partially purified crude natural product extract using a protein crystallographic approach.

The discovery of bioactive natural products traditionally starts with screening crude or partially fractionated extracts in cell-based or pure molecular target assays. This is aimed at identifying potential hits that are then resolved by iterative cycles of assay-guided fractionation to generate pure active natural products, whose structures are elucidated by some combination of spectroscopic, X-ray diffraction, and chemical transformation analyses. Miniaturization of bioassays and advances in chromatography, mass spectroscopic, and NMR methodologies have continued to make this process more efficient. Nevertheless, the timelines for completing the cycle of hit identification to the chemical structure of a pure active natural product can still vary widely from days to months and in extreme cases to years.
In rare situations the bioactivity in a hit may never be ascribed to a single pure natural product. Determination of the absolute configurations of novel natural products remains one of the most challenging and time-consuming aspects of bioactive natural product discovery.

High-throughput screening (HTS) campaigns of diverse chemical compound libraries are the main tool used by pharmaceutical companies to identify potential drug candidates. More than half of the drugs in use are natural products or synthetic compounds inspired by natural product lead compounds.1 Despite the unquestioned utility of natural products, most pharmaceutical companies have unfortunately abandoned their efforts to access natural product chemical diversity as part of their discovery programs.2 This is due to several factors including difficulties in obtaining sufficient raw materials, the interference of natural product extract components with bioassays, and timeline metrics. It has been argued that the failure to incorporate natural products into pharmaceutical industry drug screening programs has been a major contributor to the declining rate of clinical approval of new chemical entities.3

Enzymes are able to distinguish single pure substrates from the complex chemical milieu found in cells. Once bound to the enzyme active site, the ligands find themselves in a chiral environment with an absolute configuration that is defined by the configuration of the protein. We envisioned combining the remarkable compound recognition properties of enzymes with the ability to elucidate the structure of tight binding ligands in the chiral protein environment via a process we have named affinity crystallography.
This would short-circuit the iterative assay-guided fractionation and absolute configuration assignment steps in classical bioactive natural product discovery, thereby reducing the timelines and increasing the chances of the successful structural resolution of a hit.

Received: March 9, 2016. Published: August 6, 2016. J. Nat. Prod. 2016, 79, 1962−1970. DOI: 10.1021/acs.jnatprod.6b00215. pubs.acs.org/jnp. © 2016 American Chemical Society and American Society of Pharmacognosy. This is an open access article published under an ACS AuthorChoice License, which permits copying and redistribution of the article or any adaptations for non-commercial purposes.

As a test of this concept, we selected the cysteine protease cathepsin K (CatK) as a model drug target. Interest in inhibitors of CatK comes from their potential utility as treatments for osteoporosis.4 A library of crude extracts prepared from cultures of Streptomyces species isolated from lichens and soils collected in British Columbia were screened for inhibition of CatK. Fractionation by classical solvent partition and chromatography methods of the most potent crude extract, L91-3, resulted in the isolation of antipain (1), a known potent CatK inhibitor, along with the identification of V2 (2), a new dehydrated analogue of antipain, which was a considerably less potent CatK inhibitor.
In contrast, the affinity crystallography approach identified the new potent CatK inhibitor lichostatinal (3) from the same L91-3 crude extract. Affinity crystallography isolated the most tightly binding CatK inhibitor from a partially purified mixture of closely related inhibitors, enabling the elucidation of its constitution and absolute configuration without resorting to spectroscopic analysis or chemical transformation, and rigorously defined the binding mode to the target protein. The affinity crystallography methodology can dramatically shorten the timeline of the hit to inhibitor structure cycle and, when combined with pre-fractionated crude extracts, has the potential to make natural product libraries compatible with HTS campaigns against crystallizable protein targets. Notably herein we have used a cocrystalline approach rather than crystal soaking into apo-enzyme crystals, due to the rapid loss of activity associated with the apo-form of CatK. For other more stable apo-enzymes it is likely that a crystal soaking approach could also be useful in affinity discrimination between potential inhibitors in crude mixtures. Details of the isolation and structure elucidation of lichostatinal (3) by “affinity crystallography”, along with its active site binding topology, total synthesis, and CatK inhibition properties, are presented below.

■ RESULTS AND DISCUSSION

CatK Inhibitor Extraction. Considering the richness of secondary metabolite production in actinobacteria,5 we screened the culture supernatants from a library of 350 soil- and lichen-associated bacterial strains collected in the rain forests of British Columbia for CatK inhibitors. Twenty-two out of 350 samples revealed an inhibition of >80% of the CatK activity toward the fluorogenic substrate, Z-FR-MCA (carbobenzoxy-phenylalanyl-arginyl-methyl-7-aminocoumarin amide) (6% hit rate). Eleven of these samples retained their CatK inhibitory activity after a 3 kDa filtration step.
Strain L91-3 was selected for further analysis due to its high potency (100% CatK inhibition at 0.5 μL of the original culture media in a 200 μL assay).

We produced 7 L of cell-free, filtered L91-3 culture supernatant, accounting for 84 g of lyophilized material. Maximum cell growth and secretion of inhibitory activity, as measured by IC50 determination of the crude extracts, occurred after 39 h of incubation. The purification procedure is summarized in Figure 1. Briefly, Amberlite XAD4 resin eliminated 90% of the inactive material (75 g). A further 7.3 g of inactive material was removed by extraction with ethyl acetate. Treatment with Dowex Marathon A (anionic exchange resin) removed highly viscous material and most of the pigments present in the preparation. The active sample was then extracted with n-butanol to remove a further 0.3 g of contaminants without loss of inhibitory activity. An aliquot of the resulting material was saved for crystallization experiments with CatK. Subsequently, the sample was fractioned using a sequential application of reversed-phase HPLC, weak cation exchange resin, and Sephadex LH-20 chromatography, followed by reversed-phase HPLC, to give 150 mg of active material (see Supporting Information, Figure S1A).

Figure 1. Purification flow chart for the isolation of CatK inhibitors from the supernatant of an L91-3 streptomycetes culture. The structures of isolated compounds V2 (2) and V4 (1) are depicted.
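The screening and purification figures quoted above can be tallied in a few lines. The following is a minimal bookkeeping sketch, not part of the published methods: the variable names are illustrative, only the numbers are taken from the text, and because the mass removed by the Dowex Marathon A step is not reported, the mass remaining before chromatography is only an upper bound.

```python
# Bookkeeping sketch for the L91-3 screening and purification figures.
# Variable names are hypothetical; only the numeric values come from the text.

screened = 350        # culture supernatants tested
hits = 22             # samples with >80% CatK inhibition
filtered_hits = 11    # hits still active after the 3 kDa filtration step

hit_rate_pct = 100 * hits / screened
retained_pct = 100 * filtered_hits / hits

# 0.5 uL of original culture medium in a 200 uL assay
dilution_factor = 200 / 0.5

start_g = 84.0        # lyophilized material from 7 L of supernatant
removed_g = {
    "Amberlite XAD4": 75.0,
    "ethyl acetate extraction": 7.3,
    "n-butanol extraction": 0.3,
}

# The Dowex Marathon A step has no reported mass, so this is an upper bound.
remaining_upper_g = start_g - sum(removed_g.values())

amberlite_pct = 100 * removed_g["Amberlite XAD4"] / start_g
final_active_g = 0.150  # active material after the chromatography sequence
overall_yield_pct = 100 * final_active_g / start_g

print(f"hit rate: {hit_rate_pct:.1f}%")
print(f"hits retained after filtration: {retained_pct:.0f}%")
print(f"assay dilution of culture medium: {dilution_factor:.0f}-fold")
print(f"removed by Amberlite XAD4: {amberlite_pct:.1f}% of starting mass")
print(f"mass entering chromatography: <= {remaining_upper_g:.1f} g")
print(f"final active material: {overall_yield_pct:.2f}% of starting mass")
```

Under these assumptions the tally gives a hit rate of about 6.3% (reported as 6%), an Amberlite step that removes about 89% of the starting mass (reported as 90%), and at most 1.4 g of material entering the chromatography sequence that yields the 150 mg active fraction.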
A portion of this fraction was again saved for crystallography (5 mg), while the remaining material (145 mg) was further fractioned via reversed-phase HPLC to give three major active peaks, V1 (1) (5.8 mg), V2 (2) (4.4 mg), and V4 (1) (6.4 mg), and one minor peak, V3 (1) (see Supporting Information, Figure S1B).

HRESIMS analysis of the isolated compounds gave [M + H]+ ions at m/z 605.3511 for V4 (1), appropriate for a molecular formula of C27H44N10O6 (calcd C27H45N10O6, 605.3524), and 587.3403 for V2 (2), appropriate for a molecular formula of C27H42N10O5 (calcd C27H43N10O5, 587.3418). NMR analysis identified compound V4 as the cycloarginal tautomer of antipain (1) (Figure 1, V4; see Supporting Information, Table S1), whereas compound V2 (2) differed from antipain (1) by the loss of H2O from dehydration of the terminal cycloarginal residue to generate the Δ1,2 olefin (Figure 1, V2 (2); see Supporting Information, Table S1). Although V1, V3, and V4 (1) were isolated as separate peaks by HPLC, the 1H NMR spectra of each were essentially the same: hence V1, V3, and V4 are assumed to be the open-chain aldehyde and cyclic hemiaminal tautomers of antipain (1).6,7 The V2 (2) compound is a hitherto undescribed antipain derivative [see Supporting Information for detailed structural elucidation and Table S1 for the NMR assignments of V2 (2)]. The Ki values were determined for V2 (2) and antipain (1) in three independent experiments using Dixon plot analysis for human recombinant CatK, which gave the following values: V1 (1), 163 nM; V2 (2), 393 nM; and V4 (1), 105 nM (Table 1; see Supporting Information, Figure S2). Dehydration of the cyclo-argininal ring in antipain (1) decreased the inhibitory potency by a factor of 2 (Table 1). Molecular docking of V2 (2) and V4 (V1) (1) into the active site of CatK revealed a good fit in the S1−S3 subsites with calculated Ki(app) values of 66 nM for V2 (2) and 120 nM for V4 (1) (Supporting Information, Figure S3; Table 1).

X-ray Structure Analysis.
X-ray methods have traditionally been used in drug discovery to determine the binding site of a lead compound in order to facilitate its further optimization or to solve the absolute configuration of a pure natural product lead compound whose constitution was already known from spectroscopic analyses.8 Fragment screening crystallography has been used to identify smaller weak-binding fragments in known mixed compound libraries.9,10 Very recently, a crystallographic method has been introduced to determine the configuration of novel low molecular weight compounds by soaking them in a crystalline metal−organic framework.11,12 However, none of these methods are suitable for the isolation and characterization of novel and high-potency compounds for a selected drug target directly from crude or semipure natural product extracts. Our approach uses crystallography as an affinity platform to select the most potent component from a complex and unknown natural product mixture. Specifically, we evaluated whether partially purified inhibitor preparations can be exploited for co-crystallization experiments using CatK as a bait for active site-directed inhibitors. First, we tested a rather crude fraction resulting from Amberlite and Dowex batch treatment of the crude L91-3 extract followed by n-butanol extraction (sample 1) and, second, a semipurified material that had been subjected to additional reversed-phase HPLC, weak anion exchange, and Sephadex LH-20 chromatography (sample 2) (Figure 1). Both samples 1 and 2 yielded crystals that, upon diffraction analysis, showed regions of contiguous Fo − Fc difference electron density accounting for a bound ligand in the active site. However, only sample 2 allowed complete structural elucidation of the ligand, whereas sample 1 was present at lower occupancy. An overlap of these electron densities can also be seen in Figure 2A (see Supporting Information, Table S2). The structure of the ligand bound to the active site,