UBC Theses and Dissertations


Modeling of cell signaling pathways in macrophages by semantic networks Hsing, Michael 2005

Full Text

Modeling of cell signaling pathways in macrophages by semantic networks

by Michael Hsing

B.Sc., Department of Molecular Biology and Biochemistry, Simon Fraser University, 2002

THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (CIHR/MSFHR Strategic Training Program in Bioinformatics and UBC Genetics Graduate Program)

THE UNIVERSITY OF BRITISH COLUMBIA

February 2005

© Michael Hsing 2005

ABSTRACT

Macrophages are essential components of the human immune system that engulf and digest pathogens using the molecular mechanisms of phagocytosis and phagosome maturation. These processes are regulated by an essential enzyme, phosphoinositide-3-kinase (PI3K), a key initiator of signalling cascades in many cellular processes. Importantly, experimental studies demonstrate that some pathogenic bacteria, such as Mycobacterium tuberculosis (MTB), can interfere with PI3K pathways in order to survive within host macrophages. Based on the diverse roles of PI3Ks, it is reasonable to hypothesize that MTB effects upon PI3K signaling could impact macrophages in numerous ways, more than are currently studied. It is anticipated that a greater understanding of PI3K signaling mechanisms in macrophages and of bacterial interference could provide insights for developing effective strategies against MTB.

The complexity of PI3K pathways makes the analysis of MTB-macrophage interactions a challenging task. Although a vast amount of knowledge on the pathways has accumulated in the literature and databases, the information is encoded in static diagrams that are difficult to study. While it is necessary to analyze complex systems computationally, the tools for modeling pathways are inadequate. To address current limitations in pathway manipulation, we applied an artificial intelligence method called Semantic Networks (SN) to model MTB interference with PI3K signalling pathways in macrophages.
The advantage of SN is in its capacity to represent abstract concepts in machine-friendly formats termed "semantic agents" and "relationships". In SN, the behaviour of agents is not fixed, but instead emerges from their relationships. This characteristic makes SN well suited for modeling biological systems. Using the SN methods, a model has been created to describe PI3K participation in macrophage signaling. The model encompasses a large amount of information extracted from the scientific literature and pertains to such complex micro-events as formation of protein complexes, chemical modifications of proteins, allosteric regulation, and changes in intracellular localization by the agents. The data integration in the SN environment allowed us to reconstruct the molecular mechanisms of pathogenic invasion of macrophages, and the model predicted previously unobserved macrophage responses. The results will be used to guide and interpret upcoming gene and protein expression studies.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
Chapter 1 INTRODUCTION
1.1 The current state of pathway representation and modeling
1.2 Bacterial infections in macrophages through PI3K and related pathway interference
Chapter 2 METHODS
2.1 Theory of semantic networks
2.2 Implementation of semantic networks by Visual Knowledge and BioCAD software
Chapter 3 RESULTS
3.1 A semantic model development for cell signalling pathways
3.1.1 Biological structures as semantic agents
3.1.2 Localizations and translocations as semantic agents
3.1.3 Non-covalent interactions as semantic agents
3.1.4 Covalent interactions as semantic agents
3.1.5 Allosteric regulations as semantic agents
3.1.6 Cellular responses as semantic agents
3.2 Reconstruction of macrophage pathways by semantic modeling
3.2.1 Data sources and pathway reconstruction
3.2.2 SN modeling of known MTB interference mechanisms
3.2.2.1 MTB promotes actin polymerization and rearrangement in macrophage
3.2.2.2 MTB promotes membrane delivery to plasma membrane in macrophage
3.2.2.3 MTB inhibits phagosome-lysosome fusion in macrophage
3.2.2.4 MTB inhibits recruitment of oxidase complex to phagosome in macrophage
3.3 Cause-effect SN simulation of macrophage pathways during infection
Chapter 4 DISCUSSION
4.1 Use of SN modeling for predicting unknown macrophage responses to infection
4.1.1.1 MTB increases intracellular glucose uptake in macrophage
4.1.1.2 MTB increases the rate of protein synthesis in macrophage
4.1.1.3 MTB promotes cell division in macrophage
4.1.1.4 MTB promotes survival of macrophage
4.2 Advantages of using semantic networks for pathway modeling
4.2.1 Specify the spatial organization of molecules
4.2.2 Model proteins as logical, integrating and adaptive devices
4.2.3 Reduce the need for labels and descriptions
4.2.4 Provide a direct communication from models to simulations
4.3 Future directions
4.3.1 A collaborative pathway modeling environment
4.3.2 A potential tool for in silico drug discovery
Chapter 5 CONCLUSION
References
Appendices
Appendix A - Definitions of the icons
Appendix B - Semantic Network Environment for Cell-modeling (SNEC)
Utilize current information to customize biological structures
Define the behavior of molecules by creating different types of events
Analyze and traverse interactions upstream and downstream
SNEC - screenshots
B1 - Protein search page
B2 - Protein's detail page
B3 - Protein's localization page
B4 - Protein's domain/site page
B5 - Allosteric regulation summary page
B6 - Allosteric regulation's detail - <condition> page
B7 - Allosteric regulation's detail - <response> page
B8 - Interaction summary page
B9 - Non-covalent interaction's detail page
B10 - Covalent interaction's detail page
B11 - Cellular response detail page
Appendix C - List of molecules and events in the macrophage pathway model
C1 - Molecules in the macrophage model
C2 - Non-covalent interactions in the macrophage model
C3 - Covalent interactions in the macrophage model
C4 - Allosteric regulations in the macrophage model
C5 - Cellular responses and their conditions in the macrophage model

LIST OF TABLES

Table 1. Resources incorporated in the BioCAD environment
Table 2. Classification of biological structures in six prototypes in the semantic model
Table 3. Classification of biological events in six prototypes in the semantic model
Table 4. Two types of states in the semantic model
Table 5. Data sources used in macrophage pathway reconstruction
Table 6. Biological structure and event prototypes modeled in the macrophage pathways
Table 7. Macrophage responses known to be affected by MTB interference
Table 8. Non-covalent interaction events in the simulation
Table 9. Covalent interaction events in the simulation
Table 10. Allosteric regulation events in the simulation
Table 11. Translocation events in the simulation
Table 12. Unknown macrophage responses affected by MTB interference

LIST OF FIGURES

Figure 1. A typical pathway representation
Figure 2. Phagocytosis of bacteria in macrophages
Figure 3. General mechanisms for PI3K activation through cell receptors
Figure 4. An example of a semantic network
Figure 5. The basic classes of semantic agents in VK
Figure 6. Information flow between experimental sources and biological databases implemented through semantic models
Figure 7. Spatial organization of intracellular structures in the semantic model
Figure 8. Localization event of the semantic model
Figure 9. Translocation event of the semantic model
Figure 10. Non-covalent interaction event of the semantic model
Figure 11. Covalent interaction event of the semantic model
Figure 12. A visualization of allosteric regulations and interactions between Ras and PI3K-p110
Figure 13. Allosteric regulation event of the semantic model
Figure 14. Cellular response of the semantic model
Figure 15. PI3K interaction map part one
Figure 16. PI3K interaction map part two
Figure 17. PI3K interaction map part three
Figure 18. Interactions between Fc-gamma receptor and Lyn kinase
Figure 19. A SN-based simulator, before a simulation run (time=0)
Figure 20. A SN-based simulator, at the end of a simulation run (time=6)
Figure 21. The sequence of simulation steps

ACKNOWLEDGEMENTS

I would like to acknowledge my scientific supervisor, Artem Cherkasov, and my thesis committee, Wyeth Wasserman and Leah Keshet, for their advice and support. I thank Artem Cherkasov, Joel Bellenson, Conor Shankey, Kyle Recsky and Shawn Anderson for their help with the model development and implementation. I also thank Upstream Biosciences, Inc. and Visual Knowledge, Inc. for providing the semantic software environment and database support. I acknowledge Zakaria Hmama, Neil E. Reiner and Jimmy Lee for their advice on the bacterial invasion process in macrophages. I thank Fiona Brinkman, David Baillie, Francis Ouellette and Steven Jones for their advice and encouragement during my training in the CIHR/MSFHR Strategic Training Program in Bioinformatics. I would like to acknowledge the training program for providing the training and funding. I thank the Michael Smith Foundation for Health Research and NSERC for additional financial support.

CHAPTER 1 INTRODUCTION

Physical interactions among genes, proteins and ligands regulate all cellular processes and are typically studied as networks, in which molecules are visualized as nodes and interactions are represented by edges. The intracellular networks are commonly investigated in the context of signalling, metabolic and gene regulatory pathways (Ideker and Lauffenburger 2003).
Although it is conventional to study molecular interactions in each of these three contexts separately, most cellular processes involve components from all pathway types. Therefore, it is important to integrate all known intracellular interactions into a unified model. Moreover, adequate insight into molecular networks requires the dynamic behaviors of the participating entities. Unfortunately, most existing network representation and manipulation methods do not capture important features of interaction networks, such as allosteric regulation of proteins, domain organization within proteins and changes in their intracellular localization. A pathway representation customary in current databases such as the Signal Transduction Knowledge Environment - STKE (Gough 2002) and the Kyoto Encyclopaedia of Genes and Genomes - KEGG (Kanehisa and Goto 2000) is illustrated in Figure 1, where an arrow with a plus sign indicates an activating or promoting relationship, and a line with a minus sign and a short bar at the end indicates a deactivating or inhibitory relationship.

Figure 1. A typical pathway representation. This pathway drawing represents the relationships among 5 proteins A, B, C, D, E, and a cellular response, cell survival. An arrow with a plus sign indicates an activating or promoting relationship. For example, protein A activates protein C, and protein E promotes cell survival. A line with a minus sign and a short bar at the end indicates a deactivating or inhibitory relationship. For instance, protein B inhibits protein C.

The lines on such diagrams designate physical interactions between molecules such as proteins, or they represent associations between molecules and cellular processes. The information represented in this way is ambiguous because a non-covalent binding and a chemical reaction are not clearly distinguished.
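The convention just described can be written down as a signed edge list, which makes the ambiguity concrete. The following is only an illustrative sketch: the names `FIGURE_1_EDGES`, `describe` and `regulators_of` are hypothetical, and only the three relationships stated explicitly in the Figure 1 caption are encoded (the links from protein C to proteins D and E are left out because their signs are not given in the text).

```python
# Signed edges stated in the Figure 1 caption: (source, target, sign).
# +1 = activating/promoting, -1 = deactivating/inhibitory.
# Hypothetical encoding; the C-to-D and C-to-E links are omitted because
# the text does not state their signs.
FIGURE_1_EDGES = [
    ("A", "C", +1),
    ("B", "C", -1),
    ("E", "cell survival", +1),
]


def describe(edge):
    """Render one signed edge in the wording used by the figure caption."""
    source, target, sign = edge
    verb = "activates or promotes" if sign > 0 else "deactivates or inhibits"
    return f"{source} {verb} {target}"


def regulators_of(target, edges=FIGURE_1_EDGES):
    """Return {source: sign} for every edge pointing at `target`."""
    return {s: sign for s, t, sign in edges if t == target}


for edge in FIGURE_1_EDGES:
    print(describe(edge))
```

Each edge carries nothing but a sign. There is no field saying whether "A activates C" is a non-covalent binding or a chemical reaction, and that is the ambiguity at issue.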
In addition, it is difficult to determine the true cause-effect relationships from proteins A and B to proteins D and E through protein C. For instance, the diagram does not indicate whether the activation of protein C by A promotes the activation of D, promotes the deactivation of E, or causes both. Even though the limitations and drawbacks of such simplified graphic representation are obvious, the majority of the current pathway and interaction databases have been developed with similar representations.

1.1 The current state of pathway representation and modeling

The current tools for pathway representation and modeling can be classified into two broad groups: databases that store information on molecular interactions, and programs that utilize the information for simulation. Pathway databases such as KEGG (Kanehisa and Goto 2000), MetaCyc (Krieger et al. 2004) and MPW (Selkov et al. 1998) contain information on metabolic pathways for a great number of studied species. The STKE (Gough 2002), BioCarta (BioCarta 2004) and TRANSPATH (Krull et al. 2003) databases focus on pathways related to cell signaling and gene regulation. For instance, the Connection Maps database in STKE contains about 60 pathway diagrams that were created by pathway authorities and cover major signaling processes such as the MAPK, G-protein, insulin and PI3K pathways (Gough 2002). In addition, the aMAZE project offers an object-oriented environment that integrates biological entities and interactions from metabolic, cell-signaling and gene regulatory pathways (Lemer et al. 2004). Although these databases contain valuable knowledge on biological pathways, the information is represented in static, non-linked diagrams that are difficult to analyze computationally.

Protein interaction databases such as BIND (Bader, Betel, and Hogue 2003), IntAct (Hermjakob et al. 2004b), DIP (Xenarios et al. 2002), HPRD (Peri et al. 2003), GRID (Breitkreutz, Stark, and Tyers 2003a) and MINT (Zanzoni et al. 2002) put more emphasis on storage and retrieval of individual protein-protein interactions that are experimentally verified. In particular, BIND is a valuable source of data that currently contains about 140,000 interactions from 1,050 organisms, including human, mouse, Drosophila and yeast (Bader, Betel, and Hogue 2003). Interactions in BIND are carefully curated with experimental evidence including co-immunoprecipitation, affinity chromatography, and yeast two-hybrid tests. The data model of BIND includes three main types of objects: interactions, molecular complexes and pathways (Bader and Hogue 2000). An interaction object describes an interaction between proteins, DNA or RNA, encompassing information on binding sites, chemical actions, kinetics, and chemical states. A molecular-complex or pathway object defines a collection of interaction objects that form a complex or pathway, respectively.

In addition, there are several computational approaches that predict protein-protein interactions and complement the limited experimental data. For example, STRING predicts protein-protein interactions, which include physical and functional associations, from genomic context, co-mentioning of gene names in PubMed abstracts and co-regulation of genes in microarrays (von Mering et al. 2005). STRING developed a scoring system that combines different types of evidence (including both predictions and high-throughput experimental data) and assigns each protein-protein interaction a confidence score. Other methods predict protein-protein interactions based on interacting protein domains. InterDom utilized a collection of annotated protein domains from Pfam (Bateman et al. 2004) and derived domain-domain interactions from sources including protein complexes at PDB (Bhat et al. 2001), experimentally verified protein interactions in BIND and DIP, and gene fusion (Ng, Zhang, and Tan 2003).
Scansite contains a collection of interaction rules between short sequence motifs and domains (Obenauer, Cantley, and Yaffe 2003). These interaction databases provide a good collection of experimentally determined or predicted protein interactions. However, unlike the pathway databases, most of the interactions are not associated with each other in the context of cellular processes. In addition, the current representation for interactions cannot capture the conformational and functional changes of their participating proteins.

Because these databases do not provide dynamic insight into the interactions, a number of computational approaches simulate cell processes dynamically using the static data collections described above. Programs such as E-cell (Tomita et al. 1999), Gepasi 3 (Mendes 1997), Virtual Cell (Loew and Schaff 2001) and BioSPICE (Garvey et al. 2003) use differential equations to represent molecular interactions quantitatively (Neves and Iyengar 2002). In particular, E-cell developed a whole-cell simulation of Mycoplasma genitalium, based on a minimal set of 127 genes (Tomita et al. 1999), and has attempted to simulate metabolic pathways in the human erythrocyte (Tomita 2001). However, many cellular processes are sensitive to the stochastic behavior of a small number of cell components, which compromises the suitability of differential-equation methods (Le Novere and Shimizu 2001). Because differential equations treat each molecular species as a single variable, a molecular event cannot be tracked individually in a simulation. Several studies have attempted to address the stochastic character of cellular processes. Vasudeva and Bhalla proposed a hybrid simulation method that combined both deterministic and stochastic calculations (Vasudeva and Bhalla 2004). A stochastic simulator, StochSim, represented molecules as individual software objects that interact according to probabilities (Le Novere and Shimizu 2001).
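The contrast between a single continuous variable and individually tracked molecular events can be illustrated with a toy conversion A -> B. This is a minimal sketch under stated assumptions, not the algorithm of StochSim or of any program cited above: it samples exponential waiting times in the style of Gillespie's direct method, and the function names, rate constant and molecule count are all hypothetical.

```python
import math
import random


def deterministic_count(n0, k, t):
    """ODE view: dN/dt = -k*N, so the species is one continuous variable."""
    return n0 * math.exp(-k * t)


def stochastic_events(n0, k, t_end, rng):
    """Event view: sample each individual A -> B conversion.

    Every molecule converts independently with rate k, so with n molecules
    left the waiting time to the next event is exponential with rate k*n.
    Returns a list of (time, molecules_remaining) pairs.
    """
    t, n = 0.0, n0
    events = [(t, n)]
    while n > 0:
        t += rng.expovariate(k * n)
        if t > t_end:
            break
        n -= 1
        events.append((t, n))
    return events


rng = random.Random(0)  # fixed seed so that runs are repeatable
for run in range(3):
    remaining = stochastic_events(n0=10, k=0.5, t_end=2.0, rng=rng)[-1][1]
    print(f"stochastic run {run}: {remaining} molecules left")
print(f"deterministic: {deterministic_count(10, 0.5, 2.0):.2f} molecules left")
```

With only ten molecules, the stochastic runs can differ from one another and from the deterministic value, and each conversion has its own time stamp. The differential-equation description provides neither.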
Although these two approaches demonstrated the utility of stochastic simulation on individual molecules, the programs are limited in modeling intracellular translocations and functional roles of protein domains.

In addition to the pathway/interaction databases and the simulation programs, there exist several languages that facilitate the exchange of interaction data and pathway models. An XML-based format, called PSI-MI, has been developed for exchanging and retrieving protein-protein interaction data from databases such as BIND, DIP and IntAct (Hermjakob et al. 2004a). The BioPAX project develops a common format for sharing and exchange of biological pathway data (BioPAX 2005). The Systems Biology Markup Language (SBML) has been developed for representing biochemical reaction networks and for communicating models used by various simulation programs (Hucka et al. 2003).

Tools such as Cytoscape (Shannon et al. 2003) and Osprey (Breitkreutz, Stark, and Tyers 2003b) visualize molecular interaction data in various graph layouts, composed of nodes (molecules) and edges (interactions). In particular, Cytoscape can assign different attributes to nodes and edges, and can overlay data such as gene expression on top of protein-protein interaction networks. Cytoscape has several "plug-in" modules that support network analysis and data import in PSI-MI or SBML format (Shannon et al. 2003). In addition, there are several pathway environments such as PATIKA (Demir et al. 2002) and the Pathway Tools software (Karp, Paley, and Romero 2002) that enable automatic pathway creation from annotated genomes and manual pathway reconstruction by experts. They provide tools for manipulating and analyzing pathways, but they currently lack simulation capability.

In our research, we attempt to address the shortcomings in pathway manipulation through semantic modeling of biological pathways.
In collaboration with a biotechnology company, Upstream Biosciences, Inc., we have utilized an artificial intelligence method, Semantic Networks (SN), to develop and implement a model that represents and integrates complicated information on protein complex formations, chemical modifications, allosteric regulations, intracellular localizations and cause-effect relationships in pathways. We have considered the cell signalling events involving the phosphoinositide-3-kinase (PI3K)-related pathways that are affected by Mycobacterium tuberculosis (MTB) during macrophage internalization. By doing so, we not only brought a well-developed SN platform to the field of biological data integration, but also used the advantages of semantic networks to identify previously unappreciated cellular responses occurring during and upon infection.

1.2 Bacterial infections in macrophages through PI3K and related pathway interference

Macrophages express a variety of cell-surface receptors that can bind to bacterial surface molecules such as lipopolysaccharide and peptidoglycan, or to the Fc portion of antibodies and the C3b complement components associated with pathogenesis (Ernst 1998). Upon ligand binding, the macrophage cell-surface receptors become activated and trigger the phagocytosis of bacteria (Tjelle, Lovdal, and Berg 2000). Figure 2 shows the phagocytosis process in macrophages, which involves actin polymerization and rearrangement at the site of bacterial contact. As new membrane is delivered by intracellular vesicles, the macrophage plasma membrane is extended to surround a bacterium, forming a cup structure called a pseudopod. The pseudopod is further extended until the whole pathogen is engulfed inside a newly formed organelle called a phagosome.

Figure 2. Phagocytosis of bacteria in macrophages. The picture shows macrophages ingesting green fluorescent mycobacteria (indicated by arrows).
The host cell membrane was stained by the red fluorochrome PKH to define the limit of the cell. (The picture was provided by Zakaria Hmama, Division of Infectious Diseases, Dept. of Medicine, University of British Columbia)

The phagosome goes through a maturation process, fusing with lysosomes to form a phagolysosome, which contains lysozymes and acid hydrolases that can degrade bacterial cell walls and proteins (Tjelle, Lovdal, and Berg 2000). In addition, the NADPH oxidase complex is assembled on the phagosomal membrane to catalyze the production of toxic oxygen-derived compounds such as hydrogen peroxide, superoxide, hypochlorite, nitric oxide and hydroxyl radicals (Stephens, Ellson, and Hawkins 2002).

The pathogen is normally killed during phagosome maturation. However, it has been observed that some pathogens sustain their infections by surviving within host macrophages (Meresse et al. 1999; Tjelle, Lovdal, and Berg 2000). The eukaryotic parasite Trypanosoma cruzi and bacteria including Shigella flexneri and Listeria monocytogenes can lyse the phagosomal membrane and escape into the host cytosol. The eukaryotic pathogen Leishmania mexicana and the bacterium Coxiella burnetii have also developed mechanisms to survive in the harsh environment inside the phagolysosome. Moreover, some bacteria such as Salmonella typhimurium and Mycobacterium tuberculosis manage to inhibit phagosome maturation and reside inside immature phagosomes (Finlay and Falkow 1997; Russell 2001).

A notable example of a major pathogen capable of surviving inside macrophages is Mycobacterium tuberculosis (MTB), which causes serious lung infection. It is estimated that 1.7 to 2.0 billion people world-wide are currently infected by MTB, and 3 million deaths a year are attributable to tuberculosis (Health & Development Initiative 2004). In healthy individuals, infected macrophages are confined in a lesion called a tubercle.
When the immune system of the infected individual is weakened by drugs or other diseases, the MTB infection can be reactivated and spread in the lungs and to other organs.

Both phagocytosis and phagosome maturation are regulated by complicated intracellular pathways (Stephens, Ellson, and Hawkins 2002). It has been hypothesized that MTB targets and modifies components in these pathways to ensure its intracellular survival (Fratti et al. 2001). Many studies have been done to identify the individual molecular interactions involved in the pathways (Stephens, Ellson, and Hawkins 2002; Gu et al. 2003). In particular, a family of enzymes, the phosphoinositide 3-kinases (PI3Ks), has been shown to play critical roles in regulating both phagocytosis and phagosome maturation (Vieira et al. 2001). The class I PI3K is required for phagocytosis, while the class III PI3K is responsible for phagosome maturation (Vieira et al. 2001).

The class I PI3Ks are composed of a p85 regulatory subunit and a catalytic subunit, p110 (Vanhaesebroeck and Waterfield 1999). The p110 subunit is an allosteric enzyme that is activated when it binds the small G-protein Ras-GTP or when the p85 subunit is bound to a phosphotyrosine site through an SH2 domain. Activation of class I PI3K induces macrophage phagocytosis, and it has been established that some pathogens, including MTB, initiate this process via interactions with cell receptors (such as the Fc-gamma receptor) linked to class I PI3K. Such an interaction leads to auto-phosphorylation of the receptor's phosphotyrosine site if the receptor contains a tyrosine kinase domain, as illustrated in Figure 3 (Wymann, Zvelebil, and Laffargue 2003). In other cases a receptor can bind and activate additional tyrosine kinases that phosphorylate the phosphotyrosine sites on adaptor proteins. Over 50 different receptors are known to activate the class I PI3K by either of these two mechanisms (Wymann, Zvelebil, and Laffargue 2003).

Figure 3. General mechanisms for PI3K activation through cell receptors. The class I PI3K enzymes are activated through two types of pathways. (1) Upon ligand binding, receptors that contain kinase domains dimerize and auto-phosphorylate each other at phosphotyrosine sites. Phosphorylated tyrosine binds to the SH2 domain on p85 and activates p110. (2) Receptors that lack kinase domains activate additional kinase proteins. Those kinases phosphorylate phosphotyrosine residues on adaptor proteins, which in turn activate PI3K. Activated PI3K phosphorylates PIP2 into PIP3, which induces the activation of downstream kinases and cellular responses. Blue circles represent proteins, and yellow circles are domains and sites. Black arrows indicate direct interactions including bindings and chemical reactions, while red arrows indicate the indirect relationships to cellular responses. Abbreviations: R = receptor, K = kinase domain, pY = phosphotyrosine. This figure is adapted from Figure 1 of Cantley's paper (Cantley 2002) on PI3K pathways.

Once PI3K-p110 is activated by either receptors or adaptor proteins, it phosphorylates the membrane-bound lipid phosphatidylinositol-4,5-bisphosphate (PIP2) into phosphatidylinositol-3,4,5-trisphosphate (PIP3). PIP3 binds to pleckstrin homology (PH) domains of proteins such as PDK1 and AKT1, and the interactions localize those proteins to the cell membrane (Vanhaesebroeck et al. 2001). To date, about 97 human proteins have been associated with the PH domain, according to the prediction from Pfam (Bateman et al. 2004), and these proteins can potentially interact with PIP3. Through PIP3, the class I PI3K can regulate a variety of cellular signalling events including cell survival, cell growth, replication, transcription, and translation (Cantley 2002; Wymann, Zvelebil, and Laffargue 2003).
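The activation logic described above (p110 is switched on by either of two inputs, after which PIP3 recruits PH-domain proteins) can be summarized as a small Boolean rule. This is a deliberately coarse sketch, not the thesis's semantic model: the function names, the all-or-nothing treatment of lipid conversion and the "not recruited" label are assumptions made only for illustration.

```python
# PH-domain proteins named in the text as binding PIP3.
PH_DOMAIN_PROTEINS = ("PDK1", "AKT1")


def p110_is_active(ras_gtp_bound, p85_sh2_bound_to_phosphotyrosine):
    """p110 is active if it binds Ras-GTP OR if p85 is docked on a
    phosphotyrosine site through its SH2 domain (either input suffices)."""
    return ras_gtp_bound or p85_sh2_bound_to_phosphotyrosine


def class_i_pi3k_signal(ras_gtp_bound, p85_sh2_bound_to_phosphotyrosine):
    """Propagate one all-or-nothing step: p110 -> PIP3 -> membrane recruitment.

    Hypothetical simplification: real lipid conversion is gradual, and
    "not recruited" is only a placeholder label for the inactive case.
    """
    active = p110_is_active(ras_gtp_bound, p85_sh2_bound_to_phosphotyrosine)
    lipid = "PIP3" if active else "PIP2"
    location = "cell membrane" if active else "not recruited"
    return {
        "p110_active": active,
        "membrane_lipid": lipid,
        "localization": {protein: location for protein in PH_DOMAIN_PROTEINS},
    }


print(class_i_pi3k_signal(ras_gtp_bound=False,
                          p85_sh2_bound_to_phosphotyrosine=True))
```

Writing the rule as an explicit OR makes clear that the two receptor mechanisms of Figure 3 converge on the same downstream state.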
These studies imply that the activation of the class I PI3K in macrophages by intracellular pathogens not only leads to the known phagocytosis response, but also causes multiple changes in the cell that are important to recognize.

Another pathogenic mechanism employed by MTB for macrophage manipulation through the PI3K pathways involves deactivation of the class III enzyme. The class III PI3K also consists of two subunits. These are a p150 subunit, which is a Ser/Thr protein kinase, and an active PIK3C3 (homolog of Vps34p in yeast) unit that phosphorylates the lipid phosphatidylinositol (PI) to phosphatidylinositol-3-phosphate (PI3P) (Vanhaesebroeck 1999). The p150 subunit serves as an anchor linking PIK3C3 to the phagosomal or lysosomal membrane (Murray et al. 2002; Stephens, Ellson, and Hawkins 2002). It has been suggested that MTB has developed a mechanism of competitive binding to the PIK3C3 subunit of class III PI3K by producing a pathogenic analogue of the PI3P lipid, ManLAM (mannose-capped lipoarabinomannan) (Fratti et al. 2001). By establishing such binding competition, MTB prevents further production of PI3P, which suppresses the recruitment of the superoxide-generating complex and EEA1 (early endosome antigen 1), both of which are essential for normal phagosome maturation. The exact mechanism of this pathogenic interference is not well studied, and its implications are not fully understood. Nonetheless, it is clear that the PI3K enzymes have significant implications in bacterial invasions.

We anticipate that the detailed reconstruction of PI3K pathways by semantic networks will allow us to answer the following questions. Firstly, which macrophage pathways are affected by MTB, and how? Secondly, what molecules are involved in those pathways? Thirdly, what cellular responses can be induced by MTB? These questions are addressed by two bioinformatics objectives.
The first objective is to develop a biological "SN-language", or a semantic model, for representing and modeling cell signaling pathways. The second is to apply the resulting model to reconstruct macrophage pathways from the literature and to predict MTB interference.

CHAPTER 2 METHODS

2.1 Theory of semantic networks

Semantic networks were first introduced and formalized by R.L. Griffith in 1982 as a general method to represent complex information by nodes and edges in graphic form. Nodes (semantic agents) represent abstract concepts, and the identity and behaviors of each agent are defined by its edges (relationships) with other agents in a semantic network (Griffith 1982; Visual Knowledge 2004). An example of a semantic network is illustrated in Figure 4; it contains five agents and eight relationships. Semantic agents are connected by reciprocal relationships, and agents that share common properties are classified into the same category. For instance, the agent [Protein A] has a relationship {instance of} with the agent [Protein] (a prototype), which has the opposite relationship {prototype of} with [Protein A]. Similarly, [Protein B] has a relationship {instance of} with [Protein]. Hence, [Protein B] is in the same category as [Protein A] because they share the same prototype. In addition, the composition relationships allow agents to be related to their components. For example, [Protein A] and [Protein B] have relationships {composed of} with [Domain A] and [Domain B] respectively. The opposite relationship of {composed of} is {component of}.

Figure 4. An example of a semantic network. Characteristics and behaviors of a semantic agent are defined by its relationships with other agents.
Semantic agents are represented as nodes, and relationships are depicted as edges. This semantic network conveys the information that Protein A and Protein B are instances of a Protein (a prototype), and that they are composed of Domain A and Domain B respectively.

When a semantic network is implemented within a computing environment, it can efficiently model complex systems and solve multi-component problems (Griffith 1982). The underlying design principle of a semantic network is important as it affects the network's capability to represent the complexity of the system. For instance, it has been suggested that while some simple concepts or ideas can be sufficiently represented by a single agent, a more complicated concept should be modeled by a set of interconnected agents (Griffith 1982). Representing concepts by over-complicated agents or relationships can make the semantic network too descriptive, and hence impair its ability to integrate and consolidate similar concepts and to identify emerging properties. Therefore it is beneficial to represent a complicated concept (such as a protein or a chemical reaction) with a set of simple, reusable and well-classified agents interconnected by fundamental relationships such as the prototype and composition relationships described above.

Since the introduction of semantic networks in the 1980s, this methodology of knowledge representation has influenced artificial intelligence, relational database technology and object-oriented programming (Griffith 1982; Visual Knowledge 2004). Recently semantic networks have attracted significant attention from the biological community as a powerful tool for organizing and integrating large amounts of biological information (McCray and Nelson 1995). For instance, the semantic network in the Unified Medical Language System (UMLS) was designed to retrieve and integrate biomedical information from various resources (Lindberg, Humphreys, and McCray 1993).
The UMLS semantic network has also been applied and expanded to include information and knowledge from other domains such as genomics (Yu et al. 1999). The BioMOBY project has applied the "semantic web" concept to the integration of, and communication between, bioinformatics tools and databases that use different data types (Wilkinson and Links 2002). Other studies have suggested a semantic approach in which proteins are viewed as "adaptive and logical agents", whose properties and behaviors are affected by other agents in their spatial organization, including intracellular compartments and protein complexes (Fisher, Paton, and Matsuno 1999; Fisher, Malcolm, and Paton 2000). Defining the semantics among agents could characterize both local and global behaviors of a system, and it is therefore potentially useful to apply such an approach to the study of cell signalling in biological systems (Fisher, Malcolm, and Paton 2000).

Recently, an SN-based application development environment known as Visual Knowledge (VK) has been developed by a Vancouver-based software development company, Visual Knowledge, Inc. The VK environment has been shown to be capable of different formalizations and implementations of semantic networks, and it allows information from various domains to be properly integrated (Visual Knowledge 2004).

2.2 Implementation of semantic networks by Visual Knowledge and BioCAD software

Visual Knowledge is an application development environment that implements the theory of semantic networks and other contemporary computational methods including set theory, frame systems, object-oriented modeling theory and systems based on networks of active software agents (Visual Knowledge 2004). VK is distinguished from other, passive knowledge representation technologies by its dynamics, scalability, and capability for active representation and integration of knowledge from different domains.
To model real-world systems, VK has implemented several fundamental classes of agents in semantic networks, some of which are presented in Figure 5. VK allows creation of prototypes within each basic class and enables further classification of agents based on their common properties. Therefore any form of "semantic model" can be developed by meaningful connections of agents through specific relationships. The models then act as the medium that translates and integrates information from different domains into semantic networks. For instance, a semantic agent of the class "physical thing" models a physical object that has a shape and occupies space (Visual Knowledge 2004). An agent of the class "event" represents a phenomenon or a change that occurs on a physical object over a period of time. To enable application development, such as a website, VK contains application-specific agents such as triggers, operations, and reports. A trigger is an agent that spawns other agents and changes their relationships. An operation agent searches and collects other agents with certain properties, and a report agent displays the results in an application.

Figure 5. The basic classes of semantic agents in VK. Semantic agents in the Visual Knowledge environment are classified according to their common properties and functions. Each class contains its own computer code and a unique set of relationships that defines the intrinsic behaviors of all its instances. Each agent is reusable and contains instructions to act automatically when it is connected to the proper agents. The corresponding graphical user interface in VK allows users to conveniently implement SN models and applications without writing any computer code, simply by dragging and dropping SN agents in and out of their relationships.
Previously, Visual Knowledge has been successfully used to model and manipulate various complex systems including corporate enterprise environments, flight scheduling, hardware maintenance simulators, and integrated currency exchange boards (Visual Knowledge 2004). It has been anticipated that the Visual Knowledge platform can address current limitations in the modeling of cell signaling pathways. A specialized, biology-oriented VK application package called BioCAD has been developed and delegated to the Vancouver-based bioinformatics company Upstream Biosciences, Inc. (BioCAD 2004). BioCAD software provides standard bioinformatics tools for managing large-scale biological data, and for visualizing and editing biological pathways. BioCAD currently contains millions of biological concepts that were extracted from publicly available bioinformatics resources. For instance, BioCAD contains 40,512 proteins from Homo sapiens as well as other organisms including Saccharomyces cerevisiae, Drosophila, Mus musculus and Rattus. Within the BioCAD environment each prototypical protein is connected to various annotations derived from RefSeq (Pruitt, Tatusova, and Maglott 2005), GenBank (Benson et al. 2005), and Gene Ontology (Harris et al. 2004). Table 1 shows some of the resources that have been incorporated into the BioCAD environment and their database version numbers. Information on protein domains and sites has also been imported and integrated into BioCAD. For example, the database currently contains 7,316 domains from Pfam (Bateman et al. 2004), 1,331 sites from Prosite (Sigrist 2002), and domains and sites from eight other domain databases. Each protein has been connected to its proper protein domains according to the annotation from InterPro (Mulder 2005). To facilitate prediction of protein-protein interactions, 30,037 domain-domain interactions have been incorporated from sources such as InterDom (Ng et al. 2003).

Table 1. Resources incorporated in the BioCAD environment.

Database | Database version | Date of import
RefSeq | 3 | February, 2004
Unigene | May, 2004 (release date) | May, 2004
Gene Ontology | April, 2003 (release date) | April, 2003
InterPro | 7.2 | May, 2004
Pfam | 12.0 | May, 2004
PROSITE | 18.10 | May, 2004
PRINTS | 37.0 | May, 2004
ProDom | 2002.1 | May, 2004
Smart | 4.0 | May, 2004
TIGRFAMs | 3.0 | May, 2004
PIR SuperFamily | 2.41 | May, 2004
SUPERFAMILY | 1.63 | May, 2004
InterDom | 1.2 | August, 2004

The existing biological concepts and resources in BioCAD provide an excellent environment for studying macrophage pathways in humans. A locally installed client program allows additional semantic agents to be easily created, stored on, and queried from a remote central server located at Upstream Biosciences, Inc. Using the BioCAD environment, we have developed and implemented an SN-based language, or semantic model, that is capable of representing complex molecular mechanisms such as the PI3K-controlled regulation of cell signalling in human macrophages. To facilitate the reconstruction and analysis of the macrophage pathways and MTB interference, we have built a web-based application called Semantic Network Environment for Cell-modeling (SNEC). SNEC utilizes the developed semantic model and the existing biological entities in BioCAD's database for collaborative pathway reconstruction. The unique features of SNEC are discussed in Appendix B.

CHAPTER 3 RESULTS

3.1 A semantic model development for cell signalling pathways

One of the basic concepts of the SN methodology is 'a model', which may refer to a set of rules in two independent contexts. A "semantic model" designates specific rules for translating biological concepts into semantic agents and relationships. A "pathway model" encompasses rules specifying what, how, when and where molecules can interact.
The basic SN methodology for constructing a pathway model is represented in Figure 6, which illustrates how information from experimental observations is translated and integrated into pathway models. The semantic models communicate with the Visual Knowledge environment, storing information in the form of semantic agents and relationships. Such organization of biological information by the SN environment allows effective querying, analysis and inference of the pathway models, which can significantly facilitate testing of biological hypotheses and guide experimental efforts.

"A version of this chapter has been published. Hsing, M., J. L. Bellenson, C. Shankey, and A. Cherkasov. 2004. Modeling of cell signaling pathways in macrophages by semantic networks. BMC Bioinformatics 5 (1):156."

Figure 6. Information flow between experimental sources and biological databases implemented through semantic models. Biological data and information are generated from experimental observations and integrated into pathway models. Semantic models translate the pathway information into semantic agents and relationships in the Visual Knowledge environment, which stores the information in a database.

3.1.1 Biological structures as semantic agents

All biological structures can be considered as physical objects. Within a semantic network, specifically, they can be represented as semantic agents of the "physical thing" class. Six prototypes are introduced to address distinctive subgroups (Table 2).

Table 2. Classification of biological structures in six prototypes in the semantic model.
Semantic Agent - Physical thing | Biological Example
Cell | Human macrophage, Mycobacterium tuberculosis
Intracellular Compartment | Plasma membrane, cytosol, phagosome, nucleus
Macromolecule | Protein, nucleic acid, polysaccharide, fat/lipid
Domain and Site | Catalytic domain, SH2 domain, PH domain, binding site, phosphorylation site, promoter, gene regulatory site
Small Molecule and Molecular Fragment | Amino acid, nucleotide, sugar, fatty acid
Atom | Hydrogen, carbon, oxygen, nitrogen, phosphorus, sulphur

Six major prototypes classify biological structures that are relevant in the study of cell signaling pathways in macrophages. The second column lists biological examples of each.

From the highest to the lowest level, they are positioned as follows: [Cell], [Intracellular Compartment], [Macromolecule], [Domain/Site], [Small Molecule/Molecular Fragment], and [Atom]. The [Macromolecule] prototype is further classified into four sub-prototypes: [Protein], [Nucleic acid], [Polysaccharide], and [Fat and Lipid]. The [Domain/Site] objects represent domains, which are common structural folds in macromolecules, and sites, which are short-sequence motifs or post-translational modification sites. The [Small Molecule/Molecular Fragment] prototype has been further divided into four subgroups: [Amino acid], [Nucleotide], [Sugar], and [Fatty acid]. The final prototype is [Atom], which models individual chemical elements. Because cell signaling pathways are composed of molecules and their interactions, atoms are not considered further in the modeling.

Composition relationships relate each biological structure to its components. Figure 7 illustrates the semantic representation of a macrophage cell. The semantic agents are represented as individual icons (Appendix A contains the definitions of the icons), and the semantic relationships are depicted as solid arrows.
Although all agents are related by pairs of reciprocal relationships in SN, we depict only one direction for simplicity. A solid arrow represents the {composed of} relationship. A dotted arrow indicates that there are additional agents and relationships between the icons. For instance, [Cytosol] and [PDK1] are linked through a [localization event] agent.

Figure 7. Spatial organization of intracellular structures in the semantic model. Biological structures are modeled by semantic agents, which are related to their components by the composition relationships. A human macrophage has been modeled as a semantic agent of the [Cell] prototype, and it is composed of various [Intracellular Compartment] agents, including plasma membrane, cytosol, nucleus and others. Each compartment, such as the cytosol, is linked to [Macromolecule] and [Small Molecule/Molecular Fragment] agents including proteins, ATP and GTP. A macromolecule such as a protein is further composed of [Domain/Site] agents.

3.1.2 Localizations and translocations as semantic agents

Six interaction types have been incorporated into the semantic model, each represented by a semantic agent of the [Event] class (Table 3).

Table 3. Classification of biological events in six prototypes in the semantic model.

Semantic Agent - Event | Biological Examples
Localization | A protein is located in the cytosol.
Translocation | A protein moves from cytosol to plasma membrane.
Non-covalent Interaction | A ligand binds to a receptor.
Covalent Interaction | An enzyme catalyzes a chemical reaction where substrates are converted to products.
Allosteric Regulation | A ligand binding on site A of a protein causes a conformational change on site B of the protein.
Cellular Response | A qualitative cellular behavior such as cell survival, cell death, phagosome formation, and an increase of intracellular glucose level.

Six major event prototypes represent interactions among biological structures. The first column contains the six prototypes, and the second column contains biological examples of the corresponding prototypes.

Here it is important to distinguish between two kinds of agents in semantic networks: the prototypical agent and the instance agent. A prototypical protein such as PI3K-p110 can have many instances that inherit the same properties from the prototype. Similarly, a prototypical event can have event instances, each of which represents a particular occurrence of an event on molecular instances. The prototypical agents represent the pathway information and define how their instances should behave in a simulation program.

To capture the complete information associated with an interaction, a corresponding event considers biological structures at different organizational levels. For instance, a [localization event], which represents subcellular localization of a molecule, connects not only a [molecule] agent, but also an [intracellular compartment] agent. A prototypical molecule agent can associate with multiple cellular compartments through different localization events. An example of localization events is illustrated in Figure 8, in which PI3K-p110 can be located in the cytosol, plasma membrane or phagosome. Although a prototypical molecule has been linked to multiple locations, we have specified that a given instance can only be present at one place at any given time in a simulation.

Figure 8. Localization event of the semantic model. Localization events in the model define the possible locations of a molecule in a cell. In this example, a PI3K-p110 protein agent is connected to three different intracellular compartments through three localization events.

In addition to the localization event, a [translocation event] defines the movement of a molecule from one location to another. Figure 9 features an example where a PI3K-p110 molecule moves from the cytosol to the plasma membrane. The translocation event connects [PI3K-p110] to [Cytosol] with an {Origin} relationship, while [Plasma membrane] is linked by a {Destination} relationship. Taken together, the [localization event] defines all possible locations of a molecule in a cell, while the [translocation event] allows an instance of the molecule to change its location during a simulation.

Figure 9. Translocation event of the semantic model. A translocation event agent connects to a molecule, an original location and a destination. The translocation event allows proteins such as PI3K-p110 to move between locations in a simulation.

3.1.3 Non-covalent interactions as semantic agents

Physical interactions occurring between two molecules via non-covalent forces (electrostatic interactions, van der Waals interactions, hydrogen bonding) are represented by [Non-covalent interaction] event agents in the semantic model. A non-covalent event is composed of two sub-events, [binding event A] and [binding event B], and each binding event considers only one molecule. Figure 10 illustrates a non-covalent event that models binding between PI3K-p110 and Ras.
Figure 10. Non-covalent interaction event of the semantic model. A non-covalent interaction event models the binding of two molecules. This example features the interaction between Ras and PI3K proteins. The event links the binding domains and their corresponding states.

Each sub-event has been linked not only to a molecule and its binding domain but also to two types of states. We have assigned two distinct states to indicate the condition and consequence of each interaction event. Table 4 shows the two types of states: "states required for interactions" and "states caused by interactions". Each type further contains two subgroups to differentiate between non-covalent and covalent interactions.

Table 4. Two types of states in the semantic model.

State type | Non-covalent interaction | Covalent interaction
1. States required for interactions (conformational states) | [Functional for non-covalent], [Non-functional for non-covalent] | [Functional for covalent], [Non-functional for covalent]
2. States caused by interactions (binding states or phosphorylation states) | [Bound], [Not-bound] | [Phosphorylated], [Not-phosphorylated]

Two major types of states are associated with interactions, and they represent conformational and functional changes during physical interactions. The first type is conformational states that are required for either a non-covalent or a covalent interaction. Each sub-type also contains a positive (Functional) and a negative (Non-functional) state. The second type is states that are changed as the result or consequence of either a non-covalent or a covalent interaction. Each sub-type contains two opposite states: positive (e.g. Bound) and negative (e.g. Not-bound). Each of the eight possible states is represented by an individual semantic agent.
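The eight states of Table 4 can be enumerated in a few lines of code. The following is a minimal sketch in Python, not the Visual Knowledge or BioCAD implementation; the class names `StateType`, `Interaction` and `State` and the `opposite` helper are hypothetical names introduced here for illustration only.

```python
from enum import Enum


class StateType(Enum):
    REQUIRED = "state required for interaction"  # conformational states
    CAUSED = "state caused by interaction"       # binding or phosphorylation states


class Interaction(Enum):
    NON_COVALENT = "non-covalent"
    COVALENT = "covalent"


class State(Enum):
    # Each value is (state type, interaction kind, is this the positive state?).
    FUNCTIONAL_NON_COVALENT = (StateType.REQUIRED, Interaction.NON_COVALENT, True)
    NON_FUNCTIONAL_NON_COVALENT = (StateType.REQUIRED, Interaction.NON_COVALENT, False)
    FUNCTIONAL_COVALENT = (StateType.REQUIRED, Interaction.COVALENT, True)
    NON_FUNCTIONAL_COVALENT = (StateType.REQUIRED, Interaction.COVALENT, False)
    BOUND = (StateType.CAUSED, Interaction.NON_COVALENT, True)
    NOT_BOUND = (StateType.CAUSED, Interaction.NON_COVALENT, False)
    PHOSPHORYLATED = (StateType.CAUSED, Interaction.COVALENT, True)
    NOT_PHOSPHORYLATED = (StateType.CAUSED, Interaction.COVALENT, False)

    def __init__(self, state_type, interaction, positive):
        self.state_type = state_type
        self.interaction = interaction
        self.positive = positive

    def opposite(self):
        """Return the state in the same cell of Table 4 with the other sign."""
        for other in State:
            same_cell = (other.state_type, other.interaction) == (
                self.state_type,
                self.interaction,
            )
            if same_cell and other.positive != self.positive:
                return other
        raise LookupError(f"no opposite state for {self.name}")


print(len(State))                   # number of distinct states
print(State.BOUND.opposite().name)  # the negative counterpart of [Bound]
```

Keeping the state type and the interaction kind as attributes of each state makes the two-by-two layout of Table 4 explicit, so a positive state can be mapped to its negative counterpart without a separate lookup table.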
The first type of states, "states required for interactions", has been introduced to indicate the conformation that is required for an interaction to occur. The [Functional for non-covalent interaction] state or the [Functional for covalent interaction] state denotes that a domain or site is in a correct conformation that enables the occurrence of a non-covalent interaction or covalent interaction respectively. On the other hand, the [Non-functional for non-covalent interaction] or the [Non-functional for covalent interaction] state implies that a domain or site is in a conformation that prevents its interactions with other cell components. The second state type, "states caused by interactions", designates binding states or phosphorylation states. For example, a domain with the [Bound] state is currently bound to a domain on another molecule, while the [Not-bound] state indicates that the domain is not engaged in a non-covalent interaction. The [Phosphorylated] state indicates that a modification residue has been phosphorylated, while the [Not-phosphorylated] state means the residue is not phosphorylated.

In the example of Ras and PI3K-p110 binding, the [Binding event A] is connected to a Ras protein and its corresponding PI3K-p110 binding domain as well as two different states. The [Functional for non-covalent interaction] state indicates that the [PI3K-p110 binding domain] on Ras needs to acquire this conformational state before the interaction can occur. On the other hand, the [Bound] state indicates that this binding domain will acquire the [Bound] state as the result of the interaction. Thus, the construction of the non-covalent interaction event allows us to infer the existence of molecular complexes from the connections of the corresponding binding molecules. No additional agents are required to represent protein complexes.
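The structure of a non-covalent interaction event, and the way complexes can be inferred from it, can be sketched in code. The Python below is an illustrative sketch under stated assumptions rather than the BioCAD implementation: the names `BindingEvent`, `NonCovalentInteraction`, `fire` and `infer_complexes` are hypothetical, domain states are held in plain dictionaries, and the "Ras binding domain" label for PI3K-p110 is taken from the RBD abbreviation in Figure 12.

```python
from dataclasses import dataclass

FUNCTIONAL = "Functional for non-covalent interaction"
BOUND = "Bound"


@dataclass(frozen=True)
class BindingEvent:
    molecule: str                      # prototypical molecule, e.g. "Ras"
    domain: str                        # binding domain on that molecule
    required_state: str = FUNCTIONAL   # conformational state needed beforehand
    resulting_state: str = BOUND       # binding state acquired afterwards


@dataclass(frozen=True)
class NonCovalentInteraction:
    event_a: BindingEvent
    event_b: BindingEvent

    def events(self):
        return (self.event_a, self.event_b)


def fire(interaction, conformations, bindings):
    """Mark both domains as bound if each holds its required conformation."""
    ready = all(
        conformations.get((e.molecule, e.domain)) == e.required_state
        for e in interaction.events()
    )
    if ready:
        for e in interaction.events():
            bindings[(e.molecule, e.domain)] = e.resulting_state
    return ready


def infer_complexes(interactions, bindings):
    """Group molecules joined by interactions whose two domains are both bound."""
    complexes = []
    for interaction in interactions:
        if all(bindings.get((e.molecule, e.domain)) == BOUND
               for e in interaction.events()):
            members = {e.molecule for e in interaction.events()}
            # Merge with any complex that already contains one of the molecules.
            for existing in [c for c in complexes if c & members]:
                members |= existing
                complexes.remove(existing)
            complexes.append(members)
    return complexes


ras_p110 = NonCovalentInteraction(
    BindingEvent("Ras", "PI3K-p110 binding domain"),
    BindingEvent("PI3K-p110", "Ras binding domain"),
)
conformations = {
    ("Ras", "PI3K-p110 binding domain"): FUNCTIONAL,
    ("PI3K-p110", "Ras binding domain"): FUNCTIONAL,
}
bindings = {}
fired = fire(ras_p110, conformations, bindings)
complexes = infer_complexes([ras_p110], bindings)
print(fired, complexes)
```

In this sketch the complex is never stored as an object of its own; it is recovered from the [Bound] states of the binding domains, which mirrors the statement that no additional agents are required to represent protein complexes.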
An "allosteric regulation event" describes the situation in which an interaction at one site of a protein causes conformational changes on another part of the molecule that have functional implications. The SN representation of allosteric regulation is illustrated in Figure 10 through the example of the non-covalent interaction between Ras and PI3K-p110, which leads to a change in the state of the PI3-kinase domain. The use of allosteric regulation events in SN modeling allows us to represent cause-effect relationships between protein domains. The model for allosteric regulation events is presented in detail in Section 3.1.5.

3.1.4 Covalent interactions as semantic agents

In contrast to non-covalent interactions, covalent interaction events model chemical reactions (often catalyzed by enzymes) that transform substrates into products by breaking or creating covalent chemical bonds. A covalent interaction event is represented within the SN environment by three types of sub-events: an [enzyme event] that models the involvement of an enzyme and its active site; a [substrate event] that represents ligands and their modification sites; and a [product event] that links the corresponding products. As an example, Figure 11 illustrates a chemical reaction catalyzed by the PDK1 kinase, which phosphorylates the protein AKT1. The enzyme event is linked to a prototypical PDK1 enzyme and its kinase domain as well as the [Functional for covalent interaction] state. Such connections imply that the kinase domain has to be "functional" for this reaction to occur. It is possible that an [allosteric event] regulates the states of the kinase domain through another ligand binding or a chemical modification on the enzyme. The first substrate event is linked to a prototypical AKT1 protein and its phosphorylation site on threonine 308 (T308). The connection with the [Not-phosphorylated] state indicates that the site is not phosphorylated prior to the covalent interaction event.
The first product event connects the same prototypical AKT1, the same prototypical [T308] site, and the [Phosphorylated] state. Although the same AKT1 prototype participates in both the substrate and product events, a new instance of AKT1 and a new instance of the site T308 are created in the simulation. Similarly, a new instance of the small molecule ADP is created to represent the second product of the reaction.

Figure 11. Covalent interaction event of the semantic model. This covalent interaction represents the phosphorylation of AKT1 by the enzyme PDK1. The event is connected to the substrates (AKT1 and ATP), products (AKT1 and ADP) and the phosphorylation site on threonine 308 (T308).

3.1.5 Allosteric regulations as semantic agents

A protein can adopt multiple conformations in response to non-covalent ligand binding or chemical modifications of particular residues. Thus the term "allosteric regulation" refers to the phenomenon of interactions at one site of a molecule causing conformational changes at another (Alberts et al. 2002). We expanded this definition to include conformational changes caused by either a ligand binding or a chemical modification on any other subunit in the same protein complex. One example of such complicated allosteric regulation is the conformational change that occurs at the kinase domain on PI3K-p110 when its PI3K-p85 subunit binds to the cell receptors (Vanhaesebroeck and Waterfield 1999). The conformational states introduced in Section 3.1.3 are affected directly by the binding states and/or the phosphorylation states through allosteric regulation events. Figure 12 features three different allosteric regulations on the Ras protein (Macaluso et al.
2002): the binding of GDP inhibits GTP binding on Ras; the binding of SOS causes Ras to switch to a GTP-binding conformation; and the binding of GTP causes a conformational change in the PI3K-p110 binding domain of Ras. This change enables the Ras molecule to bind to the PI3K-p110 protein and activates the PI3-kinase domain on PI3K-p110.

Figure 12. A visualization of allosteric regulations and interactions between Ras and PI3K-p110. This figure illustrates the allosteric regulation of Ras that occurs during its interaction with PI3K-p110. The blue circles designate molecules and the yellow ones are domains or sites. The double-headed black arrows denote non-covalent interactions, while the single-headed black arrows are covalent interactions. The red arrows with either a "plus" sign or a "minus" sign represent the allosteric regulations. This figure was created to facilitate the visualization of the underlying biological information, and does not reflect all the connections among the semantic agents and relationships as implemented in the model. (Abbreviation: RBD = Ras Binding Domain).

The Ras-PI3K example demonstrates that a protein can act as a logical and adaptive device that is affected by the "upstream" interactions (i.e. the inputs) and affects the "downstream" interactions (i.e. the outputs). In the developed semantic model, we captured such protein logic through the creation of "allosteric regulation events". An allosteric regulation event is composed of [condition] events that contain specific information about the input signals, and [response] events that consider the outputs. Figure 13 illustrates an example of such an allosteric regulation model, which represents the cause-effect relationship from the [GEF] domain to the [GDP] and [GTP] binding domains on Ras.
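The condition and response structure of an allosteric regulation event can be sketched as a small rule object. The Python below is a hedged illustration of the Ras example of Figure 13, not the code used in BioCAD: the class names `Condition`, `Response` and `AllostericRegulation` are hypothetical, states are stored in plain dictionaries, and the starting conformations of the GDP and GTP binding domains are assumed for the sake of the example.

```python
from dataclasses import dataclass

BOUND = "Bound"
FUNCTIONAL = "Functional for non-covalent interaction"
NON_FUNCTIONAL = "Non-functional for non-covalent interaction"


@dataclass(frozen=True)
class Condition:
    molecule: str
    domain: str
    state: str      # binding or phosphorylation state that must hold


@dataclass(frozen=True)
class Response:
    molecule: str
    domain: str
    new_state: str  # conformational state set once all conditions are met


@dataclass(frozen=True)
class AllostericRegulation:
    conditions: tuple
    responses: tuple

    def apply(self, caused_states, conformations):
        """Update conformations in place when every condition is satisfied."""
        met = all(
            caused_states.get((c.molecule, c.domain)) == c.state
            for c in self.conditions
        )
        if met:
            for r in self.responses:
                conformations[(r.molecule, r.domain)] = r.new_state
        return met


# One condition (GEF domain on Ras is bound) and two responses.
gef_regulation = AllostericRegulation(
    conditions=(Condition("Ras", "GEF domain", BOUND),),
    responses=(
        Response("Ras", "GDP binding domain", NON_FUNCTIONAL),
        Response("Ras", "GTP binding domain", FUNCTIONAL),
    ),
)

caused_states = {("Ras", "GEF domain"): BOUND}
# Assumed starting point: Ras is in its GDP-binding conformation.
conformations = {
    ("Ras", "GDP binding domain"): FUNCTIONAL,
    ("Ras", "GTP binding domain"): NON_FUNCTIONAL,
}
applied = gef_regulation.apply(caused_states, conformations)
print(applied, conformations)
```

Because conditions read binding or phosphorylation states and responses write conformational states, a rule of this shape links the outcome of one interaction to the precondition of another, which is what allows a pathway to be followed from one event to the next.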
A condition event has been connected to the GEF domain on Ras as well as the [Bound] state, which implies that the condition is met only when the GEF domain on Ras has been bound. The allosteric regulation contains two response events. If the condition is satisfied, the first response causes the GDP binding domain on Ras to become [Non-functional for non-covalent interaction], which inhibits GDP binding. The second response switches the GTP binding domain to [Functional for non-covalent interaction], which promotes GTP binding. As stated earlier, the binding and phosphorylation states are caused by upstream interactions (non-covalent and covalent respectively), and the conformational states are required for the downstream processes. Therefore, by linking the binding/phosphorylation states to conformational states through allosteric regulation events, upstream interactions are connected to downstream events. Such connections among interactions have enabled us to traverse cell-signaling pathways in both upstream and downstream directions.

Figure 13. Allosteric regulation event of the semantic model. An allosteric regulation event agent is composed of condition and response events. Each condition event considers a domain and its conditional state (binding or phosphorylation state). After the conditions are met, one or more response events change the conformational states on other domains.

3.1.6 Cellular responses as semantic agents

The final type of interaction we have implemented within the BioCAD environment is the [cellular response]. A [cellular response] is represented as an event agent that corresponds to qualitative cellular behaviors such as cell survival, cell growth and phagosome formation.
Activation or deactivation of certain molecules promotes the occurrence of these cellular response events. Figure 14 illustrates how a [protein synthesis] response can be induced if a condition is satisfied. The condition specifies that a p70 S6-kinase (RPS6KB1) is phosphorylated at the threonine 389 site (T389). It is possible to include additional conditions in the model, such that the occurrence of a cellular response depends on the states of multiple molecules.

Figure 14. Cellular response of the semantic model. A cellular response (e.g. protein synthesis) contains one or more condition events. A condition event connects to a molecule, the domain involved and a conditional state (binding state or phosphorylation state).

In semantic networks, the behavior of any semantic agent can be clearly defined by its relationships or connections to other agents. Therefore, the construction of the six types of events (localization, translocation, non-covalent interaction, covalent interaction, allosteric regulation, and cellular response) enables modeling of the behaviors of molecules in pathways. We have utilized the semantic model to reconstruct MTB interference in macrophage pathways and the cause-effect relationships between the intracellular events.

3.2 Reconstruction of macrophage pathways by semantic modeling

3.2.1 Data sources and pathway reconstruction

The majority of the currently available public resources, such as pathway and interaction databases, contain un-integrated data on cell-signaling pathways that lack information on allosteric regulation in participating proteins. Therefore, we have focused our efforts on extracting pathway information from primary research and review articles and from the STKE PI3K pathway map (Table 5).
The corresponding data have been collected from the literature manually and incorporated into the macrophage model through the use of SNEC (Semantic Network Environment for Cell-modeling).

Table 5. Data sources used in macrophage pathway reconstruction.
Source type | References
Primary research articles | Arbibe et al. 2002; Datta et al. 1997; Fratti et al. 2001; Gu et al. 2003; Hmama et al. 1999; Kane et al. 2002; Kumagai and Dunphy 1991; Lanzetti et al. 2004; Lin et al. 1999; Murray et al. 2002; Muta and Takeshige 2001; Shapiro and Harper 1999; Tall et al. 2001; Vieira et al. 2003
Review articles | Cantley 2002; Downward 2004; Hayden and Ghosh 2004; Macaluso et al. 2002; Pavletich 1999; Stenmark and Aasland 1999; Stephens, Ellson, and Hawkins 2002; Vanhaesebroeck and Waterfield 1999; Vanhaesebroeck et al. 2001; Velasco-Velazquez et al. 2003; Wurmser, Gary, and Emr 1999; Wymann, Zvelebil, and Laffargue 2003
Pathway database | STKE (Gough 2002), Connections Map - PI3K pathway (last updated July 2003)

It should be noted that a pathway diagram presented in the literature or in a public database represents, in principle, what may happen if every depicted molecule is expressed in the correct location, at the correct time and with the correct conformation in a cell. Hence, the aggregation of multiple pathway diagrams describes some, if not all, of the molecular events that can potentially occur under a given condition. To use such information in semantic modeling, we decomposed pathway diagrams involving PI3K enzyme families into discrete pieces of information. We then used the defined sets of biological structures and events to integrate the information into a unified macrophage pathway model. We reconstructed the pathway model by incorporating the essential components regulating phagocytosis and phagosome maturation processes in human macrophages.
In addition, the model encompasses many PI3K-related interactions that have implications in other cellular processes, including cell survival, cell growth and cell division. Table 6 summarizes the overall SN reconstruction of the macrophage model, which involves 59 prototypical proteins localized in different intracellular compartments. Appendix C1 contains a complete protein list. The 59 proteins contain 201 domains and sites. Among them are annotated Pfam domains that include SH2, SH3, PH and PX, as well as phosphorylation sites (phosphoserine, phosphothreonine, phosphotyrosine). Lipids such as PIP3 (main product of PI3K-p110), PI3P (product of Vps34p) and ManLAM (a phosphatidylinositol analog produced by MTB), and the small molecules GDP and GTP, also play important roles.

Biological structure | Sum
Cell | 1
Intracellular compartment | 4
Protein | 59
Domain and site | 201
Lipid | 5
Polysaccharide | 1
Small molecule | 2

Biological event | Sum
Localization | 107
Non-covalent interaction | 46
Covalent interaction | 17
Allosteric regulation | 27
Cellular response | 8

Table 6. Biological structure and event prototypes modeled in the macrophage pathways. This table shows the number of prototypical structures and events modeled in the macrophage analysis.

In addition to the biological entities, various PI3K-associated signalling events have been extracted from the literature. Currently the macrophage model reflects 107 localization events, 46 non-covalent interaction events, 17 covalent interaction events, 27 allosteric regulation events and 8 cellular responses. Each event is supported by at least one literature reference (detailed in Appendices C2-C5).

3.2.2 SN modeling of known MTB interference mechanisms

The integration of information on macrophage signalling involving PI3K provides a detailed picture of cellular processes.
Such integration also allows us to visualize the interaction networks and to reconstruct scenarios of pathogenic MTB interference with the normal macrophage pathways. The semantic network methodology (as implemented in SNEC) allowed us to traverse among the various prototypical events, that is, to perform a "pathway walk", starting from the MTB surface molecules, propagating through the activated macrophage cell receptors to intermediate intracellular proteins such as PI3Ks, and leading to terminal cellular responses. Three interaction maps have been generated to represent the pathogenesis pathways in the model (Figures 15, 16 and 17). The SN macrophage model predicts eight cellular pathways that can be affected by MTB. Table 7 shows that, among the eight cellular responses, four have been previously reported in experimental studies. The interactions leading to those four previously identified responses are described in detail in the following sections.

Figure 15. PI3K interaction map part one. The interaction map was generated manually by traversing among the different molecules in the macrophage pathway model through SNEC. The starting points of this graph are the three MTB surface molecules IGHG3, C3 and LPS. The icons represent semantic agents and are defined in Appendix A. All the arrows represent "derived relationships". The black double-headed arrows connect binding molecules to their non-covalent interaction (double-headed indicates the dual directionality of the interaction). The black single-headed arrows represent the connections among enzymes, substrates, and products. The blue arrows connect allosteric regulations to the molecular interactions. The map shows the connections from the three starting molecules on MTB to the production of PIP3 in macrophages.

Figure 16. PI3K interaction map part two.
The map shows the connections from PIP3 to various cellular responses. The red arrow with a "check" sign represents a "promoting" relationship (also a derived relationship). The red arrow with a "cross" sign represents an "inhibitory" relationship.

Figure 17. PI3K interaction map part three. The map shows another parallel pathway from PIP3, and the connections to two additional cellular responses.

Cellular responses promoted by MTB | Supporting reference
Actin polymerization and rearrangement | Schlesinger et al. 1990
Membrane delivery to plasma membrane | Schlesinger et al. 1990

Cellular responses inhibited by MTB | Supporting reference
Recruitment of oxidase complex to phagosome | Moura, Modolell, and Mariano 1997
Phagosome-lysosome fusion | Xu et al. 1994

Table 7. Known macrophage responses affected by MTB interference. The table shows the cellular responses that can be promoted or inhibited by MTB infection in macrophages, as predicted by the pathway model. The second column shows literature that supports the prediction. The supporting references have been excluded from model reconstruction.

3.2.2.1 MTB promotes actin polymerization and rearrangement in macrophage

It is known that phagocytosis of bacteria involves two major macrophage responses: [actin polymerization and rearrangement] and [membrane delivery to plasma membrane] (Tjelle, Lovdal, and Berg 2000). The pathway model has reconstructed a series of events that link interactions of MTB surface molecules to activation of the [actin polymerization and rearrangement] pathway. The pathway model (Figure 15) shows how each of the three MTB surface molecules (IGHG3, C3, LPS) can activate a different set of macrophage cell receptors (FCGR1A, [ITGAM + ITGB2] and [CD14 + TLR2], respectively).
Subsequently, those receptors and the corresponding adaptor protein (GAB2) can bind to PIK3R1 (PI3K-p85) through their phosphotyrosine residues to activate the PIK3CA enzyme (PI3K-p110).

On these interaction maps, different types of semantic agents are depicted as individual icons connected by arrows of "derived relationships". The "derived relationships" have been inferred from the underlying semantic relationships that connect the agents as described in the semantic model (Section 3.1). For simplification, we have condensed several semantic agents and relationships into a single arrow. For instance, the double-headed arrows in Figure 15 that connect IGHG3 and FCGR1A to a non-covalent interaction incorporate their individual binding event agents as well as their domain/site agents.

The semantic model incorporates PIK3CA activation leading to the conversion of PIP2 into PIP3 (Figure 15). A single-headed arrow was used to represent the directionality of a covalent interaction. For instance, the enzyme PIK3CA is linked by an arrow that goes "into" a covalent interaction icon, while the substrate PIP2 is connected by an arrow that comes out of the covalent interaction. The substrate is associated with its product in the reaction by another single-headed arrow from PIP2 to PIP3.

The pathway model incorporates allosteric regulation events such as the one involving FCGR1A. The binding of IGHG3 can change FCGR1A's conformation, enabling its binding with LYN. To visualize such information, the regulation event is represented by a "positive" allosteric regulation icon, which connects to the two corresponding non-covalent interactions by two single-headed blue arrows. The constructed SN model illustrated by Figure 15 incorporates cause-effect relationships between the interactions of MTB surface molecules and the production of PIP3, which is an essential molecule interacting with downstream proteins.
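The "pathway walk" over derived relationships can be illustrated with a small graph traversal. The following is only a minimal sketch, not the SNEC implementation: the dictionary `DERIVED`, the functions `pathway_walk` and `upstream`, and the reduction of each derived relationship to a single directed edge are assumptions made for illustration. The edges themselves are restricted to the IGHG3 branch as named in this chapter (IGHG3, FCGR1A, LYN, GAB2, PIK3R1, PIK3CA, PIP3).

```python
from collections import deque

# Hypothetical, condensed edge list for the IGHG3 branch of Figure 15.
# Each edge stands for one "derived relationship" named in the text.
DERIVED = {
    "IGHG3": ["FCGR1A"],   # non-covalent binding
    "FCGR1A": ["LYN"],     # binding enabled by allosteric regulation
    "LYN": ["GAB2"],       # phosphorylation of GAB2 by LYN
    "GAB2": ["PIK3R1"],    # binding through phosphotyrosine residues
    "PIK3R1": ["PIK3CA"],  # activation of the catalytic subunit
    "PIK3CA": ["PIP3"],    # conversion of PIP2 into PIP3
}


def pathway_walk(graph, start, goal):
    """Breadth-first walk; returns one shortest path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def upstream(graph):
    """Reverse every edge so the same walk can run in the 'up' direction."""
    reverse = {}
    for src, targets in graph.items():
        for dst in targets:
            reverse.setdefault(dst, []).append(src)
    return reverse


if __name__ == "__main__":
    print(" -> ".join(pathway_walk(DERIVED, "IGHG3", "PIP3")))
    print(" -> ".join(pathway_walk(upstream(DERIVED), "PIP3", "IGHG3")))
```

Reversing the edges reuses the same routine for walking from a product back to the bacterial surface molecule, which mirrors the traversal "in both up and down directions" described for the semantic model.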
PSCD3 (a guanine-nucleotide exchange protein for ARF6) is one of many proteins that bind to PIP3 (shown in Figure 16). The subsequent binding of PSCD3 to ARF6 can change ARF6 into a conformation that is favourable for GTP binding (Cantley 2002). We have created a "positive" allosteric regulation event that links the upstream non-covalent interaction (PSCD3 <--> ARF6) to the downstream one (ARF6 <--> GTP). It has been reported that the binding of GTP to ARF6 can induce actin polymerization and rearrangement (Cantley 2002). Therefore, we integrated such information by connecting the response to the previous non-covalent event with a "promoting relationship", depicted as a red arrow with a check sign in Figure 16. The resulting pathway model captures the details of how MTB binding to cell receptors can promote the [actin polymerization and rearrangement] response in macrophages.

3.2.2.2 MTB promotes membrane delivery to plasma membrane in macrophage

The ARF6-GTP complex activates the [membrane delivery to plasma membrane] process (Stephens, Ellson, and Hawkins 2002). We have incorporated the [membrane delivery to plasma membrane] response into the previously described PI3K-ARF6 pathway (Figure 16).

With these cause-effect scenarios incorporated, the SN model predicts that MTB promotes the [actin polymerization and rearrangement] and [membrane delivery to plasma membrane] responses and, therefore, induces phagocytosis in macrophages. The prediction is supported by the study of Schlesinger et al. (1990), which showed that Mycobacterium tuberculosis triggers the phagocytosis process by binding to CR3 (ITGAM and ITGB2 receptors) on macrophages.

3.2.2.3 MTB inhibits phagosome-lysosome fusion in macrophage

The interaction map in Figure 17 shows another example of MTB interference with the PI3K pathways. The pathways start from the PIP3 molecule, and continue with the activation of RAB5A by GTP binding and with its interaction with PIK3R4.
Subsequently, PIK3R4 can bind to PIK3C3, the class III PI3K enzyme, enabling phosphorylation of PI into PI3P. Previous studies have shown that PI3P interacts with EEA1 (early endosome antigen 1), which is an essential anchoring protein that induces fusion between intracellular compartments including phagosomes and lysosomes (Stenmark and Aasland 1999; Wurmser, Gary, and Emr 1999). Therefore, the pathway model has been reconstructed in such a way that the binding of EEA1 to PI3P can induce the [phagosome-lysosome fusion] response.

As discussed earlier, ManLAM (a phosphatidylinositol analog produced by MTB) can bind to the active site of PIK3C3 and inhibit the catalytic activity of the enzyme (Fratti et al. 2001). Figure 17 shows that the interference by MTB has been incorporated into the model by a non-covalent interaction that links ManLAM to the PIK3C3 enzyme. A "negative" allosteric regulation event connects to the downstream covalent interaction, signifying that the production of PI3P is inhibited by ManLAM binding. Because of the inhibitory effect of the allosteric regulation, the event has been visualized with an icon that has a "cross" sign. The model suggests that MTB can stop the [phagosome-lysosome fusion] response that is normally induced by the PIK3C3 enzyme. This prediction is supported by the study of Xu et al. (1994), which documented that MTB restricted the fusion capability of intracellular compartments in macrophages.

3.2.2.4 MTB inhibits recruitment of oxidase complex to phagosome in macrophage

The study by Stephens, Ellson, and Hawkins (2002) has indicated that PI3P can interact not only with EEA1, but also with NCF4 (p40-phox), which plays an important role in the formation of the oxidase complex on phagosomes.
We have incorporated a non-covalent interaction between PI3P and NCF4 into the SN model (Figure 17), showing that the interaction can activate the [recruitment of oxidase complex to phagosome] response in macrophages. However, the competitive binding of ManLAM to the PIK3C3 enzyme can reduce PI3P production and inhibit the response. It is known that the [recruitment of oxidase complex to phagosome] is required for the production of toxic oxygen-derived compounds in the organelle, and that this event is accompanied by increased consumption of oxygen in macrophage cells (Moura, Modolell, and Mariano 1997). The same study indicated that a cell wall lipid from an MTB-related species, Mycobacterium leprae, down-regulated oxygen consumption in macrophages, supporting the model prediction for the [recruitment of oxidase complex to phagosome] response.

The four cases of MTB interference discussed above demonstrate that the SN-based pathway model can successfully reconstruct the molecular events that lead to known macrophage responses. The model shows that MTB can promote the [actin polymerization and rearrangement] and [membrane delivery to plasma membrane] responses, but inhibits [phagosome-lysosome fusion] and [recruitment of oxidase complex to phagosome] in macrophages.

3.3 Cause-effect SN simulation of macrophage pathways during infection

The above macrophage pathway model represents the qualitative scenarios of MTB interference in the host pathways. We have implemented an SN-simulation program to further investigate the dynamic behavior of molecules in the pathways. We allow each individual molecular "instance" to interact with other molecules, change its conformation, and move between different locations in a macrophage cell.
In the corresponding simulation, every molecule is represented by an individual agent, while every instance of a molecular interaction is represented as an individual event agent. An event agent is connected to all the participating entities to record "what", "how" and "when" an interaction occurred in a simulation. The simulator provides a traceable "trajectory" of all the events that happen to every molecule. Such an event history allows a detailed analysis of simulated molecular interactions.

The interactions among IGHG3, FCGR1A, Lyn and Gab2 have been studied in detail by Gu et al. (2003), providing a simple and well-characterized test for the simulation (Figure 18). The Fc-gamma receptor (FCGR1A) has a binding domain for immunoglobulin gamma 3 (IGHG3). When FCGR1A binds to IGHG3, the Lyn binding domain is activated, which enables its binding to Lyn kinase. The subsequent binding between FCGR1A and Lyn activates the kinase active site, [ProteinkinaseTyr], on Lyn, and allows the enzyme to phosphorylate Gab2 on a phosphotyrosine site (pYxxM).

Figure 18. Interactions between Fc-gamma receptor and Lyn kinase. This figure illustrates the possible interactions and allosteric regulations among the four molecules in the simulation (IGHG3, FCGR1A, Lyn, Gab2). The figure uses the same visualization schema as in Figure 12 for Ras and PI3K interactions.

To simulate the above interactions, we created an instance of the macrophage cell, composed of four compartments: extracellular space, plasma membrane, cytosol, and nucleus. Different instances of molecules were produced and localized within distinct cellular compartments at the start (time = 0). Two instances of IgG molecules (IGHG3) were present in the extracellular space, and two instances of Fcγ receptors were located in the plasma membrane (Figure 19). The cytosol contained two copies each of Lyn and Gab2.
No events had occurred at the beginning of the simulation.

Figure 19. An SN-based simulator, before a simulation run (time = 0). An instance of a macrophage cell (human) is composed of four compartments: extracellular space, plasma membrane, cytosol and nucleus. There are three operations for each compartment, activated by three action buttons: "Non-covalent int.", "Covalent int." and "Translocation". Operations are performed according to the simulation steps described in Figure 21. The combo box located at the top is used to increment the time. Before the simulation run, no event had occurred, as shown in the reports at the bottom of the screen.

Figure 20. An SN-based simulator, at the end of a simulation run (time = 6).
The figure shows the simulation outcomes at time 6. The molecules have changed their original locations, and many events have accumulated. There were 11 translocation events, 4 allosteric regulation events, 4 non-covalent interaction events, and 1 covalent interaction event.

The SN-simulation was executed by running a simulation cycle for each unit of time. Figure 21 illustrates that one simulation cycle consists of three operation steps.

Figure 21. The sequence of simulation steps. [Flowchart: start (time = 0); at most one non-covalent interaction in each of plasma membrane, cytosol, nucleus; at most one covalent interaction in each of plasma membrane, cytosol, nucleus; at most one translocation from each of extracellular space, plasma membrane, cytosol, nucleus; increment time.] This figure illustrates that each simulation cycle involves three operation steps: "non-covalent interaction", "covalent interaction" and "translocation". Each operation is executed for each individual location in the specified order. After the completion of the final step, the time (or step) is incremented by one, and the same cycle repeats.

In the first step, an operation searches for one pair of molecules that has the potential to interact non-covalently in one location (by checking the non-covalent interactions previously specified in the pathway model). If there are multiple pairs of interacting molecules, the operation chooses the first molecule randomly and selects its partner from all the interacting molecules that are present in the same location. After a pair has been determined, an event agent links both of the molecule instances. We record the time when the interaction occurs by linking the event agent with a "time stamp" agent. Table 8 shows that four non-covalent interactions occurred in the simulation, at times 2, 3, 4 and 5 respectively.
After the "non-covalent interaction" operation has been executed in the plasma membrane, cytosol, and nucleus, the simulation program searches for a "covalent interaction". The operation randomly picks an enzyme whose substrates are present in the same location. After an enzyme has been determined, a substrate is randomly chosen (if there is more than one substrate) and the corresponding product is created. The covalent interaction operation is repeated in the plasma membrane, cytosol, and nucleus. Table 9 shows that one covalent interaction between Lyn and Gab2 occurred at the plasma membrane at time 6 in the simulation.

The occurrence of both non-covalent and covalent interactions requires not only the presence of the molecules, but also the correct conformational states of those molecules, as specified by the pathway model. Therefore, a non-covalent interaction requires both molecules to have "functional" binding domains, while a covalent interaction requires an enzyme with a "functional" active site. The occurrence of either interaction can trigger allosteric regulations and change the conformational states of the participating molecules. Such state changes allow molecules to adopt new functions and participate in different interactions. For instance, the non-covalent interaction between IGHG3 and FCGR1A that occurred at time 2 caused an allosteric regulation, which switched the conformational state of the Lyn binding site on FCGR1A from the "non-functional" to the "functional" state. After the Lyn molecule was translocated from the cytosol to the plasma membrane at time 3, the activated Lyn binding site on FCGR1A enabled the receptor to bind to Lyn at time 4. Table 10 shows that there were a total of 4 allosteric regulation events in the simulation.

As the final step in the simulation cycle, molecules change their locations through translocation events. At most one translocation can occur from each location, moving one molecule out of it.
The operation randomly picks a molecule that has the ability to move by checking the localization events on its prototypical molecule. Once a molecule has been determined, a destination is randomly chosen if there is more than one location to which the molecule can move. For example, Table 11 shows that a Gab2 molecule was translocated from the cytosol to the plasma membrane at time 1. However, the FCGR1A molecule never moved during the simulation because it can only be localized at the plasma membrane, as restricted by its localization event. The operation for translocations is executed for each location in the order of extracellular space, plasma membrane, cytosol, and nucleus. After the translocation events, the time (representing the step) is incremented by one unit, and the simulation cycle is repeated.

At the end of the simulation (time = 6), most molecules have changed their locations and many events have accumulated (4 non-covalent interactions, 1 covalent interaction, 4 allosteric regulations, and 11 translocations, as shown in Figure 20). Those events demonstrate how the initial translocation of IGHG3 molecules from the extracellular space to the plasma membrane "induced" the subsequent series of events that eventually led to the phosphorylation of the Gab2 molecule to Gab2-p.

It is possible to simulate different biological scenarios by modifying the initial populations and distribution of molecules in each location. At the current stage, the simulator enables us to "play" different macrophage pathways and observe the actions of the molecules. It captures the stochastic behavior of interactions through the use of random operations. We anticipate that future improvement of the SN-simulator will enhance our ability to predict and validate MTB interference in macrophages.
Time | Molecule A | Domain A | Molecule B | Domain B | Location
Time 2 | FCGR1A | IG | IGHG3 | Fc-gamma receptor binding domain | Plasma Membrane
Time 3 | FCGR1A | IG | IGHG3 | Fc-gamma receptor binding domain | Plasma Membrane
Time 4 | FCGR1A | Lyn binding | LYN | FcgR binding | Plasma Membrane
Time 5 | LYN | FcgR binding | FCGR1A | Lyn binding | Plasma Membrane

Table 8. Non-covalent interaction events in the simulation. This table shows the non-covalent interaction events that occurred in the simulation, sorted by the time when the event occurred. Each event has been linked to other relevant agents including molecules, domains and states, as well as the location where the event happened. Column 1: time when the interaction occurred in the simulation. Column 2: name of the binding molecule A. Column 3: domain of molecule A involved in the interaction. Column 4: name of the binding molecule B. Column 5: domain of molecule B involved in the interaction. Column 6: location where the interaction occurred.

Time | Enzyme | Enzyme's domain | Substrate | Site | State | Product | Site | State | Location
Time 6 | LYN | PROTEINKINASETYR | GAB2 | pYxxM | Not-phosphorylated | GAB2-p | pYxxM | Phosphorylated | Plasma Membrane

Table 9. Covalent interaction events in the simulation. The table shows the covalent interaction events that occurred in the simulation. Column 1: time when the interaction occurred. Column 2: name of the enzyme. Column 3: the active site or catalytic domain of the enzyme involved. Column 4: name of the substrate. Column 5: modification site of the substrate. Column 6: phosphorylation state before the covalent interaction event. Column 7: name of the product. Column 8: modification site of the product. Column 9: phosphorylation state after the covalent interaction. Column 10: location where the interaction occurred.
Time | Molecule affected | Domain involved as the condition | State satisfied | Domain affected as the response | State changed to | Location
Time 2 | FCGR1A | IG | Bound | Lyn binding | Func. for non-cov. | Plasma Membrane
Time 3 | FCGR1A | IG | Bound | Lyn binding | Func. for non-cov. | Plasma Membrane
Time 4 | LYN | FcgR binding | Bound | PROTEINKINASETYR | Func. for cov. | Plasma Membrane
Time 5 | LYN | FcgR binding | Bound | PROTEINKINASETYR | Func. for cov. | Plasma Membrane

Table 10. Allosteric regulation events in the simulation. The table shows the allosteric regulation events that occurred in the simulation. Column 1: time when the event occurred. Column 2: molecule affected by the allosteric regulation responses. Column 3: domain involved in the allosteric condition. Column 4: binding state that was satisfied for the condition. Column 5: domain affected by the allosteric response. Column 6: conformational state changed by the response. Column 7: location where the allosteric regulation occurred.

Time | Molecule moved | From | To
Time 1 | IGHG3 | Extracellular space | Plasma Membrane
Time 1 | GAB2 | Cytosol | Plasma Membrane
Time 2 | IGHG3 | Extracellular space | Plasma Membrane
Time 2 | GAB2 | Plasma Membrane | Cytosol
Time 2 | GAB2 | Cytosol | Plasma Membrane
Time 3 | GAB2 | Plasma Membrane | Cytosol
Time 3 | LYN | Cytosol | Plasma Membrane
Time 4 | LYN | Cytosol | Plasma Membrane
Time 5 | GAB2 | Cytosol | Plasma Membrane
Time 6 | GAB2-p | Plasma Membrane | Cytosol
Time 6 | GAB2 | Cytosol | Plasma Membrane

Table 11. Translocation events in the simulation. The table shows the translocation events that occurred in the simulation. Column 1: time when the translocation occurred. Column 2: molecule that was translocated. Column 3: the original location of the molecule. Column 4: the destination of the molecule.
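The simulation cycle of Section 3.3 (non-covalent interaction, covalent interaction, translocation, then a time increment) can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not the SNEC simulator: all names (`Molecule`, `simulate`, `ALLOWED`, `BINDINGS`, `ALLOSTERIC`) and the domain labels are hypothetical, a random eligible pair is drawn in one step rather than choosing the first molecule and then its partner, bound molecules are assumed not to translocate, and events are stored as plain tuples rather than event agents.

```python
import random

INNER = ["plasma membrane", "cytosol", "nucleus"]
ALL_LOCATIONS = ["extracellular space"] + INNER

# Hypothetical encoding of the Figure 18 prototypes.
ALLOWED = {"IGHG3": ["extracellular space", "plasma membrane"],
           "FCGR1A": ["plasma membrane"],
           "LYN": ["cytosol", "plasma membrane"],
           "GAB2": ["cytosol", "plasma membrane"],
           "GAB2-p": ["cytosol", "plasma membrane"]}
# (molecule A, domain A, molecule B, domain B)
BINDINGS = [("FCGR1A", "IG", "IGHG3", "FcR binding"),
            ("FCGR1A", "Lyn binding", "LYN", "FcgR binding")]
# condition (molecule, bound domain) -> domain switched to functional
ALLOSTERIC = {("FCGR1A", "IG"): "Lyn binding",
              ("LYN", "FcgR binding"): "kinase"}
FUNCTIONAL_AT_START = {"IGHG3": {"FcR binding"}, "FCGR1A": {"IG"},
                       "LYN": {"FcgR binding"}, "GAB2": set()}


class Molecule:
    def __init__(self, name, location):
        self.name, self.location = name, location
        self.functional = set(FUNCTIONAL_AT_START[name])
        self.bound = set()

    def free(self, domain):
        return domain in self.functional and domain not in self.bound

    def bind(self, domain):
        """Mark the domain bound and apply its allosteric response, if any."""
        self.bound.add(domain)
        response = ALLOSTERIC.get((self.name, domain))
        if response:
            self.functional.add(response)


def noncovalent(mols, loc, t, events, rng):
    here = [m for m in mols if m.location == loc]
    pairs = [(a, da, b, db) for na, da, nb, db in BINDINGS
             for a in here if a.name == na and a.free(da)
             for b in here if b.name == nb and b.free(db)]
    if pairs:
        a, da, b, db = rng.choice(pairs)
        a.bind(da)
        b.bind(db)
        events.append((t, "non-covalent", loc, a.name, b.name))


def covalent(mols, loc, t, events, rng):
    here = [m for m in mols if m.location == loc]
    enzymes = [m for m in here if m.name == "LYN" and "kinase" in m.functional]
    substrates = [m for m in here if m.name == "GAB2"]
    if enzymes and substrates:
        rng.choice(substrates).name = "GAB2-p"   # product replaces substrate
        events.append((t, "covalent", loc, "LYN", "GAB2"))


def translocate(mols, loc, t, events, rng):
    movable = [m for m in mols if m.location == loc and not m.bound
               and len(ALLOWED[m.name]) > 1]
    if movable:
        m = rng.choice(movable)
        dest = rng.choice([d for d in ALLOWED[m.name] if d != loc])
        events.append((t, "translocation", loc, m.name, dest))
        m.location = dest


def simulate(steps, seed=0):
    rng = random.Random(seed)
    start = [("IGHG3", "extracellular space"), ("FCGR1A", "plasma membrane"),
             ("LYN", "cytosol"), ("GAB2", "cytosol")]
    mols = [Molecule(n, loc) for n, loc in start for _ in range(2)]
    events = []
    for t in range(1, steps + 1):            # one cycle per unit of time
        for loc in INNER:
            noncovalent(mols, loc, t, events, rng)
        for loc in INNER:
            covalent(mols, loc, t, events, rng)
        for loc in ALL_LOCATIONS:
            translocate(mols, loc, t, events, rng)
    return mols, events


if __name__ == "__main__":
    for event in simulate(6, seed=1)[1]:
        print(event)
```

Because every step draws from a seeded random generator, a run with a given seed is repeatable, while different seeds give different event histories; the counts in Tables 8 to 11 come from one particular run of the original simulator and are not expected to be reproduced by this sketch.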
CHAPTER 4 DISCUSSION

4.1 Use of SN modeling for predicting unknown macrophage responses to infection

The semantic modeling studies not only allowed us a detailed cause-effect reconstruction of several known processes by which Mycobacterium tuberculosis interferes with human macrophages, but also predicted several cellular responses that have not been previously recognized (Table 12). Namely, as discussed in Section 3.2.2, interactions between MTB surface molecules and macrophage receptors can activate the class I PI3K enzyme and induce the production of PIP3. Studies have shown that PIP3 regulates many other cellular processes in addition to phagocytosis and phagosome maturation (Cantley 2002; Wymann, Zvelebil, and Laffargue 2003). We have incorporated additional PIP3-related interactions into the model, which identified four other macrophage responses that can be affected by MTB. The details of the SN reconstructions of MTB interference scenarios are discussed in the following sections.

Cellular responses promoted by MTB
Cell survival
Cell cycle entry - S phase
Protein synthesis
Intracellular glucose uptake

Table 12. Unknown macrophage responses affected by MTB interference. The table shows the cellular responses that can be promoted by MTB infection in macrophages, as predicted by the pathway model.

4.1.1.1 MTB increases intracellular glucose uptake in macrophage

The developed SN-based pathway model illustrated by Figure 16 includes a non-covalent interaction between PIP3 and SLC2A4 (glucose transporter type 4). This interaction recruits SLC2A4 to the plasma membrane and allows the protein to transport extracellular glucose into the cytosol (Wymann, Zvelebil, and Laffargue 2003). The model encompasses such information by implementing a cellular response event, [intracellular glucose uptake], whose activation depends on the above non-covalent interaction.
With these events incorporated into the semantic network, we are able to reconstruct the cause-effect relationships from PI3K activation by the MTB factors, through the production of PIP3, to the increase of [intracellular glucose uptake] in macrophages (Figures 15 and 16).

4.1.1.2 MTB increases the rate of protein synthesis in macrophage

An essential PIP3-interacting protein is PDPK1, a kinase that phosphorylates many substrates including RPS6KB1 (ribosomal protein S6 kinase) (Cantley 2002; Wymann, Zvelebil, and Laffargue 2003). The model incorporates a non-covalent interaction that links PIP3 to PDPK1 (Figure 16), and also includes a downstream covalent interaction, the phosphorylation of RPS6KB1 by PDPK1. RPS6KB1 kinase becomes active when phosphorylated by PDPK1, and the activated RPS6KB1 phosphorylates the S6 ribosomal proteins and increases the rate of protein synthesis (Cantley 2002). Such information has been integrated into the macrophage model by a [protein synthesis] response, which links to the previous covalent interaction event with a "promoting relationship". The SN environment allows us to analyze the outcomes of MTB interference on the PI3K pathways, and predicts that the rate of protein synthesis in macrophages is increased in response to RPS6KB1 activation.

4.1.1.3 MTB promotes cell division in macrophage

Previous experimental studies demonstrated that the PDPK1 protein can phosphorylate AKT1, a kinase that regulates cell division (Cantley 2002). After activation, AKT1 in turn phosphorylates downstream proteins such as GSK3B (Wymann, Zvelebil, and Laffargue 2003). GSK3B can phosphorylate CCND1 (Cyclin D1), but only when GSK3B is un-phosphorylated (Cantley 2002). After CCND1 is phosphorylated, it is targeted to the proteasome for degradation and cell cycle entry is inhibited. Therefore, the phosphorylation of GSK3B by AKT1 deactivates the catalytic activity of GSK3B, stops the degradation of CCND1 and promotes cell cycle entry.
The above information demonstrates an example of double-inhibition, which has been modeled by two covalent interactions in the SN. The first is the phosphorylation of GSK3B by AKT1 (designated as "AKT1-p" in Figure 16). The second is the phosphorylation of CCND1 by GSK3B. To indicate that the first interaction inhibits the second one, we created a "negative" allosteric regulation event that links the two chemical reactions. The second covalent interaction, in turn, inhibits the [cell cycle entry - S phase] response. The "inhibitory" relationship is represented by a red arrow with a cross sign in Figure 16. Once this complex series of events is represented by semantic agents and relationships in the SN, we observe that MTB can induce the phosphorylation of GSK3B by AKT1 and promote the cell cycle entry response in macrophages.

4.1.1.4 MTB promotes survival of macrophages

The pathway model accounts for a double-inhibition involving BAD and BCL2 (Figure 16). BCL2 promotes cell survival, but this action is prevented when BCL2 is bound by BAD (Cantley 2002; Wymann, Zvelebil, and Laffargue 2003). The BAD-BCL2 binding is inhibited by the phosphorylation of BAD by AKT1 (Datta et al. 1997). The model has incorporated the double-inhibition relationships through a "negative" allosteric regulation event, which links the upstream covalent interaction (phosphorylation of BAD by AKT1-p) to the downstream non-covalent interaction between BAD and BCL2 (Figure 16). This non-covalent event has been connected to the [cell survival] response by an "inhibitory" relationship in the SN. The model thus reconstructed another MTB interference scenario, accounting for the activation of AKT1 by PDPK1, the subsequent phosphorylation of BAD by AKT1, and the occurrence of the [cell survival] response in macrophages (Figure 16).
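The two double-inhibition scenarios above can be illustrated with a small signed graph. The following is a minimal sketch, not the SN implementation: the edge list is a hand-written, simplified reading of the events named in Sections 4.1.1.3 and 4.1.1.4, and the names EDGES and net_effects are hypothetical. It assumes an acyclic graph in which each node is reached with a single sign, so the first sign found for a node is kept.

```python
from collections import deque

# Hypothetical, simplified edge list: (source event, target event, sign).
# +1 means "promotes", -1 means "inhibits" (negative allosteric regulation
# or an inhibitory relationship to a cellular response).
EDGES = [
    ("PIP3-PDPK1 binding", "AKT1 phosphorylation by PDPK1", +1),
    ("AKT1 phosphorylation by PDPK1", "GSK3B phosphorylation by AKT1", +1),
    ("GSK3B phosphorylation by AKT1", "CCND1 phosphorylation by GSK3B", -1),
    ("CCND1 phosphorylation by GSK3B", "[cell cycle entry - S phase]", -1),
    ("AKT1 phosphorylation by PDPK1", "BAD phosphorylation by AKT1", +1),
    ("BAD phosphorylation by AKT1", "BAD-BCL2 binding", -1),
    ("BAD-BCL2 binding", "[cell survival]", -1),
]


def net_effects(edges, source):
    """Return the net sign (+1 or -1) of `source` on every reachable node.

    The sign of a path is the product of its edge signs, so two
    consecutive inhibitions yield a net promotion.
    """
    outgoing = {}
    for src, dst, sign in edges:
        outgoing.setdefault(src, []).append((dst, sign))

    effects = {source: +1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for dst, sign in outgoing.get(node, []):
            if dst not in effects:  # keep the first sign found (acyclic assumption)
                effects[dst] = effects[node] * sign
                queue.append(dst)
    return effects


effects = net_effects(EDGES, "PIP3-PDPK1 binding")
```

Under these assumptions, the intermediate events (CCND1 phosphorylation, BAD-BCL2 binding) carry a negative net sign while both cellular responses carry a positive one, which mirrors the reasoning in the text.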
The semantic networks enable us to integrate individual molecular interactions and to reconstruct MTB interference with the macrophage PI3K pathways, leading to the activation of the [intracellular glucose uptake], [protein synthesis], [cell cycle entry - S phase] and [cell survival] responses, none of which are described in the current literature on MTB infection in macrophages. Because a successful parasite ensures the growth and survival of its host to sustain its own nutrient supply, these responses in macrophages should be beneficial for the survival of Mycobacterium tuberculosis. Activating cell division signals in macrophages may also help the bacteria migrate and spread to the progeny of the host cell. These four cellular responses will be validated by biological experiments in MTB-infected macrophages.

The pathway model also predicts several connections between proteins. For example, Gu et al. (2003) observed the phosphorylation of the GAB2 protein by the LYN kinase, and the subsequent downstream activation of the class I PI3K by phosphorylated GAB2. In that study, the kinase activity of LYN was activated by the Fcγ-receptor (FCGR1A) through non-covalent binding. An independent observation by Velasco-Velazquez (2003) suggested that LYN can be activated by the CR3 receptor beta subunit, CD18 (ITGB2). Figure 15 illustrates that the model has integrated these two pieces of information on LYN regulation. As a result, the model suggests the possibility that both the Fcγ-receptor and the CR3 receptor can activate LYN, which phosphorylates GAB2 and activates class I PI3K.

CD14 and the class I PI3K (PIK3CA) are predicted to interact via TLR2. Hmama et al. (1999) suggested a model in which PI3K is activated indirectly by the CD14 receptor. However, the components that link the two proteins were not known. A more recent study by Muta and Takeshige (2001) observed a direct interaction between CD14 and TLR2. Arbibe et al.
(2000) demonstrated that a phosphotyrosine site on TLR2 can bind to the p85 subunit of PI3K, activating PI3K catalytic activity. By combining these two pieces of evidence, the pathway model reconstructs the scenario in which the CD14 receptor binds to TLR2, which in turn activates PI3K (Figure 15). The two examples demonstrate that the pathway model can not only integrate and interpret current biological observations but also formalize new hypotheses.

Several assumptions were made when the model connected the individual protein-protein interactions into pathways. For instance, the model assumed the co-expression of the molecules and their corresponding activation states in the macrophage during MTB infection. Those assumptions will be validated by simulations and by experiments such as gene arrays and protein expression and phosphorylation profiles. The pathway model can guide those experiments, and the experimental results will assist in model validation.

4.2 Advantages of using semantic networks for pathway modeling

The results presented above illustrate that SN-based reconstruction of molecular pathways can predict previously unrecognized scenarios, linking molecular events to cellular responses. The developed semantic model for cell signalling addresses several limitations of the conventional diagram-based pathway representation.

4.2.1 Specify the spatial organization of molecules

The semantic model specifies the hierarchical relationships among the different biological structures: from cells to compartments, from compartments to molecules, and from molecules to domains/sites. The hierarchy between intracellular compartments and molecules allowed us to define the spatial organization of molecules in a cell through localization events and translocation events. Therefore, the model represents "where" interactions occur.
4.2.2 Model proteins as logical, integrating and adaptive devices

The organization of domains and sites and the use of allosteric regulation events have enabled us to model the cause-effect relationships between structures and functions in macromolecules. Within the SN, proteins have been implemented as integrating and logical devices, and their conformational states can be switched by the occurrence of non-covalent and/or covalent interaction events. Therefore, the model allowed us to represent the conditions and consequences of upstream and downstream interactions in pathways, and the information on "what", "how" and "when" interactions occur has been specified.

4.2.3 Reduce the need for labels and descriptions

In the developed semantic model, conventional descriptions of proteins such as "enzyme", "activator" or "inhibitor" are represented by events. For example, in the model a protein has an "enzyme" role when 1) the protein participates in a "covalent interaction event", 2) the presence of a "functional" catalytic domain on the protein is required for the occurrence of the event, and 3) the protein itself is not modified after the event. Similarly, a protein A "activates" a protein B when a non-covalent interaction event involving protein A can turn on the "functional" state of a domain/site on protein B. Thus, the "role" or "function" of a protein is effectively represented by the events in which it participates. Therefore, the model reduces the need for descriptive labels that are often ambiguous in conventional pathway representation.

4.2.4 Provide a direct communication from models to simulations

In the developed model, the possible behaviors of molecules are defined by the various interaction events in which they are involved. Adding or modifying interactions in the pathway model changes the behaviors of molecular instances in the simulation.
The semantic network environment has established a connection from the model to the simulation program, where the actions of individual molecules can be observed and tested under different scenarios.

The semantic networks have several other technical advantages over static pathway representation. Those advantages include faster querying capabilities, convenient data addition and more effective integration of information. Within the SN framework, a given concept (such as [protein]) is represented by one prototypical agent. Any additional information about that concept can be represented by other agents and relationships connected to the same prototype. This construct differs from a relational database, where information is stored in tables. To link tables to one another, unique identifiers or primary keys are required. The duplication of primary keys across tables creates data redundancy, introducing issues of data maintenance and consistency. Semantic networks minimize such issues by reusing existing agents; therefore, the databases are smaller and more functionally organized than their relational counterparts.

The connectivity of semantic agents in the SN also allows rapid execution of even very complicated queries, as they are not compromised by the large number of tables often present in relational databases. Furthermore, within the SN environment a query is composed of operations that are semantic agents themselves and can be manipulated in the same way as other agents. Representing queries as semantic agents allows the information to be stored and integrated, and even reused in a different context. In contrast, a query such as an SQL query is written in the form of a static computer script, and the information it carries on data interpretation is difficult to reuse in a relational database.

Semantic databases further distinguish themselves by their capability to incorporate new data types through flexible creation of prototypes.
Adding new prototypes does not affect the existing data models. Therefore, a single semantic database can support multiple semantic models that represent information in different knowledge domains.

4.3 Future directions

The developed semantic model of the human macrophage will be expanded to include more metabolic pathways by incorporating other types of metabolic events, including methylation, acetylation and glycosylation, in addition to the current phosphorylation and dephosphorylation processes.

To model gene regulation, the non-covalent interaction events will be expanded to cover the binding between individual transcription factors and their corresponding gene regulatory sites. The covalent interaction event can represent transcription processes that lead to the production of mRNAs, as well as translation processes that produce proteins. The use of non-covalent interactions will also help us to represent large transcription complexes that may involve more than one hundred proteins, assembled in various orders.

The gene regulation logic will be modeled by leveraging the current allosteric regulation events. We anticipate representing a gene locus by a macromolecular agent that is composed of transcription factor binding sites, promoter sites, exons and introns. An event analogous to allosteric regulation will link specific transcription factor binding events to the activation or deactivation of promoter sites, which will then direct the production of specific transcripts.

During the pathway reconstruction, we extracted data mainly from the literature. Because the underlying semantic model is compatible with most publicly available pathway and interaction databases, the number of biological entities and interactions in the pathways can be increased through automatic data integration from those sources.
However, information on allosteric regulation is still presented primarily in the literature, and therefore extraction of such information will rely on both text-mining techniques and expert curation.

It is also possible to predict protein interactions in the macrophage pathways based on domain-domain interactions, which have been considered in the model. It has been shown that putative protein-protein interactions can be inferred from domain-domain or domain-motif rules (Ng, Zhang, and Tan 2003; Obenauer and Yaffe 2004). Such predictions can complement experimentally determined but limited protein interaction data. Databases including InterDom (Ng et al. 2003), iPfam (2004) and Scansite (Obenauer, Cantley, and Yaffe 2003) have compiled lists of experimentally or computationally derived domain-domain and domain-motif interactions. Those resources will be utilized to predict protein interactions not only within a single organism but also between organisms such as MTB and macrophages, based on proteins with common interacting domains.

The available gene and protein expression data on macrophages will also be incorporated into the model to determine "active" or "shortest" paths in the interaction network. A previous study has shown that active sub-networks can be identified by overlaying gene expression profiles on protein-protein interaction data (Ideker et al. 2002). Within a cell, there are many paths that can activate a downstream protein. The "length" of each path varies, as each consists of a different number of nodes. We therefore anticipate that the shortest path (the one with the fewest nodes that require activation) is more likely to be "utilized" than a longer route when the cell receives a stimulus. The integration of gene/protein expression data with the pathway model will allow us to estimate the protein activation states for "pathway filtering".
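As a rough illustration of the "pathway filtering" idea, the following sketch searches for the path with the fewest nodes between a receptor and a downstream molecule while skipping nodes that are not marked as expressed. The graph is a hand-written subset of the interactions discussed in Section 4.1; the names GRAPH, shortest_active_path and the expressed set are assumptions made for illustration and are not part of the SN implementation.

```python
from collections import deque

# Hypothetical directed interaction graph (subset of the events in Section 4.1).
GRAPH = {
    "FCGR1A": ["LYN"],
    "ITGB2": ["LYN"],
    "LYN": ["GAB2"],
    "GAB2": ["PIK3CA"],
    "CD14": ["TLR2"],
    "TLR2": ["PIK3CA"],
    "PIK3CA": ["PIP3"],
}


def shortest_active_path(graph, start, goal, expressed):
    """Breadth-first search restricted to nodes in `expressed`.

    Returns the path with the fewest nodes as a list, or None if no
    path exists through expressed nodes only.
    """
    if start not in expressed or goal not in expressed:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt in expressed and nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


all_nodes = set(GRAPH) | {"PIP3"}
via_cd14 = shortest_active_path(GRAPH, "CD14", "PIP3", all_nodes)
via_fcgr = shortest_active_path(GRAPH, "FCGR1A", "PIP3", all_nodes)
# Removing GAB2 from the expressed set blocks the FCGR1A route entirely.
blocked = shortest_active_path(GRAPH, "FCGR1A", "PIP3", all_nodes - {"GAB2"})
```

In this toy graph the CD14 route reaches PIP3 through fewer nodes than the FCGR1A route; whether path length alone predicts pathway usage is the hypothesis stated in the text, not something the sketch demonstrates.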
4.3.1 A collaborative pathway modeling environment

The developed SNEC website will be made available to the research community once the underlying biological model is improved and the necessary computer resources are allocated. SNEC provides a collaborative environment where researchers can share and exchange their knowledge of intracellular pathways in different organisms. The current web implementation already allows different users to customize biological entities and interactions, and user-specific changes are represented by creating semantic agents in the database.

It is essential to reuse and exchange pathway information between researchers. The developed macrophage model represents one particular interpretation of the pathways in the cell. Users will be able to reuse the interactions as parts of their own pathway models, or they can copy and modify each component to reflect their own interpretation. Each interaction can then be analyzed for its "trust" or "support" by other researchers. The analysis will help us to determine the canonical parts of the pathways as well as the parts that are still under debate. The research community can then focus on incomplete pathways, forming new hypotheses and designing new experiments.

The exchange and translation of complicated pathway information rely on a good visualization method. The PI3K-interaction maps in Figures 15-17 attempt to visualize different types of interactions in pathways. Cook, Farley, and Tapscott (2001) and Kitano (2002) discussed several advanced visualization techniques, some of which have been adopted in Figure 12 and Figure 18. Those visualization techniques will be utilized to implement automatic graphing tools for both 2-D maps of the pathways and 3-D animations of events in simulation.
We anticipate that the pathway modeling environment will provide not only advanced visualization tools but also a simulation program for pathway testing. The current SN-simulator will be improved in four aspects.

Firstly, various quantitative factors will be considered in the simulation. For instance, the binding affinity associated with non-covalent events will affect the probability and the duration of the binding of molecules. Reaction kinetics, associated with covalent events, will determine the rate of production. Both interaction rates will determine the time each interaction takes and how many events can accumulate during each time unit.

Secondly, the population and distribution of molecules in each intracellular compartment will be supported by experimental data. For instance, gene expression data from microarrays provide the relative abundance of transcripts, and protein expression data provide the relative abundance of proteins. Computer algorithms such as PSORT (Nakai and Horton 1999) can also assist in predicting the localization of proteins.

Thirdly, the representation of molecular proximity will be enhanced. Currently, proximity is represented by creating intracellular compartments, which can be further divided into smaller sub-locations. Increasing the number of locations and reducing the size of each location will improve the approximation of the localization of molecules. Proximity can also be estimated through molecular complexes. We anticipate that a molecule will have a higher probability of interacting with other subunits in the same complex than with molecules outside the complex.

Finally, the cellular response events will be implemented in the SN simulator in such a way that the events are triggered by the accumulation of different molecular interactions. Such a construct allows us to represent the transition from quantitative to qualitative behaviors in a cell.
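The idea of a cellular response triggered by accumulated interaction events can be sketched as a simple stochastic loop. This is only an illustration under stated assumptions, not the SN-simulator: the per-step event probability p_event stands in for a quantity that would be derived from binding affinity or reaction kinetics, and the threshold, the function name simulate_response and the seeding scheme are hypothetical.

```python
import random


def simulate_response(p_event, threshold, max_steps, seed=0):
    """Return the time step at which a cellular response is triggered.

    At every time step one molecular interaction event occurs with
    probability `p_event` (an assumed stand-in for binding affinity or
    reaction rate). The response fires once `threshold` events have
    accumulated. Returns None if the threshold is not reached within
    `max_steps` steps.
    """
    rng = random.Random(seed)  # local generator, reproducible runs
    accumulated = 0
    for step in range(1, max_steps + 1):
        if rng.random() < p_event:
            accumulated += 1
        if accumulated >= threshold:
            return step
    return None


# With certain events the response fires exactly at the threshold step;
# with impossible events it never fires.
always = simulate_response(p_event=1.0, threshold=3, max_steps=10)
never = simulate_response(p_event=0.0, threshold=3, max_steps=10)
```

Intermediate probabilities give a trigger time that varies with the seed, which is the quantitative-to-qualitative transition the text describes: many individual events, one discrete response.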
Cellular responses will serve as the final simulation outcomes for measuring the effects of extracellular stimuli, including drugs.

4.3.2 A potential tool for in silico drug discovery

The drug discovery process involves many stages of development, including drug target identification, drug target validation, drug design, drug identification, and drug testing. A good drug target validation process should identify the functional roles of a target in a cell or an organism and establish its cause-effect relationships to a disease (Smith 2003). Currently, such information is obtained experimentally in vitro and in vivo through gene knockouts, antisense technology, RNA interference (RNAi), and antibodies that target and inhibit the protein. However, experimental methods consume both time and resources. Just as the in silico drug design approach has assisted conventional drug screening and reduced its cost, we anticipate that in silico drug target validation can support and hasten the drug discovery process.

Semantic networks provide a suitable environment for drug target validation because they can determine the function of proteins in the context of interaction networks. Protein-protein interaction networks have a scale-free property: a few essential proteins have many connections, while most other proteins have only a few links (Barabasi and Oltvai 2004). Thus, targeting a highly connected protein should be disruptive to the pathogen, and such targets can be identified from the protein-interaction networks already established in the SN. In addition, semantic networks can identify splice variants and proteins that are able to compensate for the disruption of the pathogenic target. In the SN, splice variants are characterized by their connections to common gene agents. Proteins with similar functions can be identified from their common domains and functional sites.
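The hub-based target selection described above can be sketched as a degree ranking over an interaction list. The protein names P1 to P5 and the function rank_by_degree are placeholders invented for this example; they do not correspond to any pathogen proteins in the model, and degree is only one of several possible connectivity measures.

```python
from collections import Counter


def rank_by_degree(interactions):
    """Rank proteins by the number of interactions they take part in.

    `interactions` is an iterable of (protein_a, protein_b) pairs from an
    undirected interaction network. Returns (protein, degree) pairs,
    highest degree first.
    """
    degree = Counter()
    for a, b in interactions:
        degree[a] += 1
        degree[b] += 1
    return degree.most_common()


# Placeholder network: P1 is the only highly connected node.
toy_interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]
ranking = rank_by_degree(toy_interactions)
top_protein, top_degree = ranking[0]
```

A candidate list produced this way would still need the compensation check the text mentions, for example by looking for other proteins that share the hub's domains.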
If the protein is too robust to attack, another strategy is to target pathogenic proteins that directly interact with host proteins. Combining the protein interaction networks of the pathogen and the host will provide a list of such cross-interacting proteins.

In addition to drug target validation, the SN-simulator can be used for in silico drug testing. Most drugs interact with the active sites or binding domains of host proteins. Thus, cross-reactions can occur on proteins that share the same drug-binding domains. Such drug-specific interactions can be incorporated into the pathway model and the effects of drugs can be simulated. The simulation will provide a list of the molecular events and cellular responses that occur as a result of such drug interference.

CHAPTER 5 CONCLUSION

The application of semantic networks has enabled us to develop a new biological language (semantic model) for reconstructing Mycobacterium tuberculosis interference strategies in macrophage pathways. The semantic model has several advantages in characterizing complex interactions between macromolecules in cell signalling pathways, and it addresses the limitations of the conventional diagram-based pathway representation. These advantages include specifying the spatial organization of molecules; modeling proteins as logical, integrating and adaptive devices; reducing the need for labels and descriptions; and providing a direct communication from models to simulations.

The unique features of the semantic model allowed us to reconstruct the cause-effect relationships of MTB interference in human macrophage pathways. The current knowledge on PI3K interactions and their corresponding cellular responses has been extracted from the literature and integrated into the macrophage pathway model.
The SN representation enabled traversal of the pathways, starting from the MTB surface molecules and proceeding to activated receptors, downstream proteins and the resulting cellular responses in macrophages. The pathway model has predicted that MTB factors can promote the [actin polymerization and rearrangement], [membrane delivery to plasma membrane], [cell survival], [cell cycle entry - S phase], [protein synthesis] and [intracellular glucose uptake] responses in macrophages. On the other hand, [recruitment of oxidase complex to phagosome] and [phagosome-lysosome fusion] are inhibited by MTB. Some of the predicted responses are supported by previous studies, while the others have not yet been recognized in the current literature on MTB infection in macrophages. New experiments will be designed on the basis of the pathway models to validate the responses further.

The web-based application, Semantic Network Environment for Cell-modeling (SNEC), has implemented the semantic model for effective pathway building and customization. SNEC facilitates collaborative pathway studies in macrophages as well as other cellular systems. To explore the dynamics of semantic networks, we have developed an SN-based cell simulator, which captures the stochastic behaviors in a cell and produces a detailed history of events that can be traced and analyzed afterward. In the future, we anticipate enhancing the pathway model and simulation, utilizing the SN environment for in silico drug discovery, and assisting the development of new therapeutic strategies against MTB infections in macrophages.

REFERENCES

Alberts, B., A. Johnson, J. Lewis, M. Raff, K. Roberts, and P. Walter. 2002. Molecular Biology of the Cell. 4th ed. New York: Garland Science.
Arbibe, L., J. P. Mira, N. Teusch, L. Kline, M. Guha, N. Mackman, P. J. Godowski, R. J. Ulevitch, and U. G. Knaus. 2000. Toll-like receptor 2-mediated NF-kappa B activation requires a Rac1-dependent pathway. Nat Immunol 1 (6):533-40.
Bader, G.
D., D. Betel, and C. W. Hogue. 2003. BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res 31 (1):248-50.
Bader, G. D., and C. W. Hogue. 2000. BIND - a data specification for storing and describing biomolecular interactions, molecular complexes and pathways. Bioinformatics 16 (5):465-77.
Barabasi, A. L., and Z. N. Oltvai. 2004. Network biology: understanding the cell's functional organization. Nat Rev Genet 5 (2):101-13.
Bateman, A., L. Coin, R. Durbin, R. D. Finn, V. Hollich, S. Griffiths-Jones, A. Khanna, M. Marshall, S. Moxon, E. L. Sonnhammer, D. J. Studholme, C. Yeats, and S. R. Eddy. 2004. The Pfam protein families database. Nucleic Acids Res 32 (Database issue):D138-41.
Benson, D. A., I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and D. L. Wheeler. 2005. GenBank. Nucleic Acids Res 33 (Database issue):D34-8.
Bhat, T. N., P. Bourne, Z. Feng, G. Gilliland, S. Jain, V. Ravichandran, B. Schneider, K. Schneider, N. Thanki, H. Weissig, J. Westbrook, and H. M. Berman. 2001. The PDB data uniformity project. Nucleic Acids Res 29 (1):214-8.
BioCAD. 2004 [cited. Available from http://www.biocad.com.
BioCarta. 2004 [cited. Available from http://www.biocarta.com/.
BioPAX. 2005 [cited. Available from http://www.biopax.org/.
Breitkreutz, B. J., C. Stark, and M. Tyers. 2003. The GRID: the General Repository for Interaction Datasets. Genome Biol 4 (3):R23.
Breitkreutz, B. J., C. Stark, and M. Tyers. 2003. Osprey: a network visualization system. Genome Biol 4 (3):R22.
Cantley, L. C. 2002. The phosphoinositide 3-kinase pathway. Science 296 (5573):1655-7.
Cook, D. L., J. F. Farley, and S. J. Tapscott. 2001. A basis for a visual language for describing, archiving and analyzing functional models of complex biological systems. Genome Biol 2 (4):RESEARCH0012.
Datta, S. R., H. Dudek, X. Tao, S. Masters, H. Fu, Y. Gotoh, and M. E. Greenberg. 1997. Akt phosphorylation of BAD couples survival signals to the cell-intrinsic death machinery. Cell 91 (2):231-41.
Demir, E., O. Babur, U. Dogrusoz, A. Gursoy, G. Nisanci, R. Cetin-Atalay, and M. Ozturk. 2002. PATIKA: an integrated visual environment for collaborative construction and analysis of cellular pathways. Bioinformatics 18 (7):996-1003.
Downward, J. 2004. PI 3-kinase, Akt and cell survival. Semin Cell Dev Biol 15 (2):177-82.
Ernst, J. D. 1998. Macrophage receptors for Mycobacterium tuberculosis. Infect Immun 66 (4):1277-81.
Finlay, B. B., and S. Falkow. 1997. Common themes in microbial pathogenicity revisited. Microbiol Mol Biol Rev 61 (2):136-69.
Fisher, M. J., G. Malcolm, and R. C. Paton. 2000. Spatio-logical processes in intracellular signalling. Biosystems 55 (1-3):83-92.
Fisher, M. J., R. C. Paton, and K. Matsuno. 1999. Intracellular signalling proteins as 'smart' agents in parallel distributed processes. Biosystems 50 (3):159-71.
Fratti, R. A., J. M. Backer, J. Gruenberg, S. Corvera, and V. Deretic. 2001. Role of phosphatidylinositol 3-kinase and Rab5 effectors in phagosomal biogenesis and mycobacterial phagosome maturation arrest. J Cell Biol 154 (3):631-44.
Garvey, T. D., P. Lincoln, C. J. Pedersen, D. Martin, and M. Johnson. 2003. BioSPICE: access to the most current computational tools for biologists. Omics 7 (4):411-20.
Gough, N. R. 2002. Science's signal transduction knowledge environment: the connections maps database. Ann N Y Acad Sci 971:585-7.
Griffith, R. L. 1982. Three Principles of Representation for Semantic Networks. ACM Trans. Database Syst. 7 (3):417-442.
Gu, H., R. J. Botelho, M. Yu, S. Grinstein, and B. G. Neel. 2003. Critical role for scaffolding adapter Gab2 in Fc gamma R-mediated phagocytosis. J Cell Biol 161 (6):1151-61.
Harris, M. A., J. Clark, A. Ireland, J. Lomax, M. Ashburner, R. Foulger, K. Eilbeck, S. Lewis, B. Marshall, C. Mungall, J. Richter, G. M. Rubin, J. A. Blake, C. Bult, M. Dolan, H. Drabkin, J. T. Eppig, D. P. Hill, L. Ni, M. Ringwald, R. Balakrishnan, J. M. Cherry, K. R. Christie, M. C. Costanzo, S. S.
Dwight, S. Engel, D. G. Fisk, J. E. Hirschman, E. L. Hong, R. S. Nash, A. Sethuraman, C. L. Theesfeld, D. Botstein, K. Dolinski, B. Feierbach, T. Berardini, S. Mundodi, S. Y. Rhee, R. Apweiler, D. Barrell, E. Camon, E. Dimmer, V. Lee, R. Chisholm, P. Gaudet, W. Kibbe, R. Kishore, E. M. Schwarz, P. Sternberg, M. Gwinn, L. Hannick, J. Wortman, M. Berriman, V. Wood, N. de la Cruz, P. Tonellato, P. Jaiswal, T. Seigfried, and R. White. 2004. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res 32 (Database issue):D258-61.
Hayden, M. S., and S. Ghosh. 2004. Signaling to NF-kappaB. Genes Dev 18 (18):2195-224.
Health, & Development Initiative. 2004 [cited. Available from http://www.healthinitiative.org/.
Hermjakob, H., L. Montecchi-Palazzi, G. Bader, J. Wojcik, L. Salwinski, A. Ceol, S. Moore, S. Orchard, U. Sarkans, C. von Mering, B. Roechert, S. Poux, E. Jung, H. Mersch, P. Kersey, M. Lappe, Y. Li, R. Zeng, D. Rana, M. Nikolski, H. Husi, C. Brun, K. Shanker, S. G. Grant, C. Sander, P. Bork, W. Zhu, A. Pandey, A. Brazma, B. Jacq, M. Vidal, D. Sherman, P. Legrain, G. Cesareni, I. Xenarios, D. Eisenberg, B. Steipe, C. Hogue, and R. Apweiler. 2004. The HUPO PSI's molecular interaction format - a community standard for the representation of protein interaction data. Nat Biotechnol 22 (2):177-83.
Hermjakob, H., L. Montecchi-Palazzi, C. Lewington, S. Mudali, S. Kerrien, S. Orchard, M. Vingron, B. Roechert, P. Roepstorff, A. Valencia, H. Margalit, J. Armstrong, A. Bairoch, G. Cesareni, D. Sherman, and R. Apweiler. 2004. IntAct: an open source molecular interaction database. Nucleic Acids Res 32 (Database issue):D452-5.
Hmama, Z., K. L. Knutson, P. Herrera-Velit, D. Nandan, and N. E. Reiner. 1999. Monocyte adherence induced by lipopolysaccharide involves CD14, LFA-1, and cytohesin-1. Regulation by Rho and phosphatidylinositol 3-kinase. J Biol Chem 274 (2):1050-7.
Hsing, M., J. L. Bellenson, C. Shankey, and A. Cherkasov. 2004.
Modeling of cell signaling pathways in macrophages by semantic networks. BMC Bioinformatics 5 (1):156.
Hucka, M., A. Finney, H. M. Sauro, H. Bolouri, J. C. Doyle, H. Kitano, A. P. Arkin, B. J. Bornstein, D. Bray, A. Cornish-Bowden, A. A. Cuellar, S. Dronov, E. D. Gilles, M. Ginkel, V. Gor, I. I. Goryanin, W. J. Hedley, T. C. Hodgman, J. H. Hofmeyr, P. J. Hunter, N. S. Juty, J. L. Kasberger, A. Kremling, U. Kummer, N. Le Novere, L. M. Loew, D. Lucio, P. Mendes, E. Minch, E. D. Mjolsness, Y. Nakayama, M. R. Nelson, P. F. Nielsen, T. Sakurada, J. C. Schaff, B. E. Shapiro, T. S. Shimizu, H. D. Spence, J. Stelling, K. Takahashi, M. Tomita, J. Wagner, and J. Wang. 2003. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 19 (4):524-31.
Ideker, T., and D. Lauffenburger. 2003. Building with a scaffold: emerging strategies for high- to low-level cellular modeling. Trends Biotechnol 21 (6):255-62.
Ideker, T., O. Ozier, B. Schwikowski, and A. F. Siegel. 2002. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics 18 Suppl 1:S233-40.
iPfam. 2004 [cited. Available from http://www.sanger.ac.uk/Software/Pfam/iPfam/.
Kane, L. P., M. N. Mollenauer, Z. Xu, C. W. Turck, and A. Weiss. 2002. Akt-dependent phosphorylation specifically regulates Cot induction of NF-kappa B-dependent transcription. Mol Cell Biol 22 (16):5962-74.
Kanehisa, M., and S. Goto. 2000. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28 (1):27-30.
Karp, P. D., S. Paley, and P. Romero. 2002. The Pathway Tools software. Bioinformatics 18 Suppl 1:S225-32.
Kitano, H. 2002. The standard graphical notation for biochemical networks. ICSB-2002 workshop on SBML/SBW (Stockholm).
Krieger, C. J., P. Zhang, L. A. Mueller, A. Wang, S. Paley, M. Arnaud, J. Pick, S. Y. Rhee, and P. D. Karp. 2004. MetaCyc: a multiorganism database of metabolic pathways and enzymes.
Nucleic Acids Res 32 (Database issue):D438-42.
Krull, M., N. Voss, C. Choi, S. Pistor, A. Potapov, and E. Wingender. 2003. TRANSPATH: an integrated database on signal transduction and a tool for array analysis. Nucleic Acids Res 31 (1):97-100.
Kumagai, A., and W. G. Dunphy. 1991. The cdc25 protein controls tyrosine dephosphorylation of the cdc2 protein in a cell-free system. Cell 64 (5):903-14.
Lanzetti, L., A. Palamidessi, L. Areces, G. Scita, and P. P. Di Fiore. 2004. Rab5 is a signalling GTPase involved in actin remodelling by receptor tyrosine kinases. Nature 429 (6989):309-14.
Le Novere, N., and T. S. Shimizu. 2001. STOCHSIM: modelling of stochastic biomolecular processes. Bioinformatics 17 (6):575-6.
Lemer, C., E. Antezana, F. Couche, F. Fays, X. Santolaria, R. Janky, Y. Deville, J. Richelle, and S. J. Wodak. 2004. The aMAZE LightBench: a web interface to a relational database of cellular processes. Nucleic Acids Res 32 (Database issue):D443-8.
Lin, X., E. T. Cunningham, Jr., Y. Mu, R. Geleziunas, and W. C. Greene. 1999. The proto-oncogene Cot kinase participates in CD3/CD28 induction of NF-kappaB acting through the NF-kappaB-inducing kinase and IkappaB kinases. Immunity 10 (2):271-80.
Lindberg, D. A., B. L. Humphreys, and A. T. McCray. 1993. The Unified Medical Language System. Methods Inf Med 32 (4):281-91.
Loew, L. M., and J. C. Schaff. 2001. The Virtual Cell: a software environment for computational cell biology. Trends Biotechnol 19 (10):401-6.
Macaluso, M., G. Russo, C. Cinti, V. Bazan, N. Gebbia, and A. Russo. 2002. Ras family genes: an interesting link between cell cycle and cancer. J Cell Physiol 192 (2):125-30.
McCray, A. T., and S. J. Nelson. 1995. The representation of meaning in the UMLS. Methods Inf Med 34 (1-2):193-201.
Mendes, P. 1997. Biochemistry by numbers: simulation of biochemical pathways with Gepasi 3. Trends Biochem Sci 22 (9):361-3.
Meresse, S., O. Steele-Mortimer, E. Moreno, M. Desjardins, B. Finlay, and J. P. Gorvel. 1999.
Controlling the maturation of pathogen-containing vacuoles: a matter of life and death. Nat Cell Biol 1 (7):E183-8. Moura, A. C , M . Modolell, and M . Mariano. 1997. Down-regulatory effect of Mycobacterium leprae cell wall lipids on phagocytosis, oxidative respiratory burst and tumour cell killing by mouse bone marrow derived macrophages. Scand J Immunol 46 (5):500-5. Mulder, N . J., R. Apweiler, T. K. Attwood, A. Bairoch, A. Bateman, D. Binns, P. Bradley, P. Bork, P. Bucher, L. Cerutti, R. Copley, E. Courcelle, U. Das, R. Durbin, W. Fleischmann, J. Gough, D. Haft, N . Harte, N . Hulo, D. Kahn, A. Kanapin, M. Krestyaninova, D. Lonsdale, R. Lopez, I. Letunic, M . Madera, J. Maslen, J. McDowall, A. Mitchell, A. N . Nikolskaya, S. Orchard, M. Pagni, C. P. Ponting, E. Quevillon, J. Selengut, C. J. Sigrist, V. Silventoinen, D. J. Studholme, R. Vaughan, and C. H. Wu. 2005. InterPro, progress and status in 2005. Nucleic Acids Res 33 Database Issue:D201-5. Murray, J. T., C. Panaretou, H. Stenmark, M . Miaczynska, and J. M . Backer. 2002. Role of Rab5 in the recruitment of hVps34/pl50 to the early endosome. Traffic 3 (6):416-27. Muta, T., and K. Takeshige. 2001. Essential roles of CD 14 and lipopolysaccharide-binding protein for activation of toll-like receptor (TLR)2 as well as TLR4 Reconstitution of TLR2- and TLR4-activation by distinguishable ligands in LPS preparations. Eur J Biochem 268 (16):4580-9. Nakai, K., and P. Horton. 1999. PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem Sci 24 (l):34-6. Neves, S. R., and R. Iyengar. 2002. Modeling of signaling networks. Bioessays 24 (12): 1110-7. Ng, S. K., Z. Zhang, and S. H. Tan. 2003. Integrative approach for computationally inferring protein domain interactions. Bioinformatics 19 (8):923-9. Ng, S. K., Z. Zhang, S. H. Tan, and K. Lin. 2003. 
InterDom: a database of putative interacting protein domains for validating predicted protein interactions and complexes. Nucleic Acids Res3\ (l):251-4. Obenauer, J. C , L. C. Cantley, and M . B. Yaffe. 2003. Scansite 2.0: Proteome-wide prediction of cell signaling interactions using short sequence motifs. Nucleic Acids Res 31 (13):3635-41. Obenauer, J. C , and M . B. Yaffe. 2004. Computational prediction of protein-protein interactions. Methods Mol Biol 261:445-68. Pavletich, N . P. 1999. Mechanisms of cyclin-dependent kinase regulation: structures of Cdks, their cyclin activators, and Cip and 1NK4 inhibitors. J Mol Biol 287 (5):821-8. Peri, S., J. D. Navarro, R. Amanchy, T. Z. Kristiansen, C. K. Jonnalagadda, V. Surendranath, V. Niranjan, B. Muthusamy, T. K. Gandhi, M . Gronborg, N . Ibarrola, N . Deshpande, K. Shanker, H. N . Shivashankar, B. P. Rashmi, M . A. Ramya, Z. Zhao, K. N . Chandrika, N . Padma, H. C. Harsha, A. J. Yatish, M . P. Kavitha, M . Menezes, D. R. Choudhury, S. Suresh, N . Ghosh, R. Saravana, S. Chandran, S. Krishna, M . Joy, S. K. Anand, V. Madavan, A. Joseph, G. W. Wong, W. P. Schiemann, S. N . Constantinescu, L. Huang, R. Khosravi-Far, H. Steen, M . Tewari, S. Ghaffari, G. C. Blobe, C. V. Dang, J. G. Garcia, J. Pevsner, O. N . Jensen, P. Roepstorff, K. S. Deshpande, A. M . Chinnaiyan, A. Hamosh, A. Chakravarti, and A. Pandey. 2003. Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome Res 13 (10):2363-71. Pruitt, K. D., T. Tatusova, and D. R. Maglott. 2005. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res 33 Database Issue:D501-4. Russell, D. G. 2001. Mycobacterium tuberculosis: here today, and here tomorrow. Nat Rev Mol Cell Biol 2 (8):569-77. Schlesinger, L. S., C. G. Bellinger-Kawahara, N . R. Payne, and M . A. Horwitz. 1990. 
Phagocytosis of Mycobacterium tuberculosis is mediated by human monocyte complement receptors and complement component C3. J Immunol 144 (7):2771-80. Selkov, E., Jr., Y. Grechkin, N . Mikhailova, and E. Selkov. 1998. MPW: the Metabolic Pathways Database. Nucleic Acids Res 26 (l):43-5. Shannon, P., A. Markiel, O. Ozier, N . S. Baliga, J. T. Wang, D. Ramage, N . Amin, B. . 79 Schwikowski, and T. Ideker. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13 (11):2498-504. Shapiro, G. I., and J. W. Harper. 1999. Anticancer drug targets: cell cycle and checkpoint control. JClin Invest 104 (12): 1645-53. Sigrist, C. J., L. Cerutti, N . Hulo, A. Gattiker, L. Falquet, M . Pagni, A. Bairoch, and P. Bucher. 2002. PROSITE: a documented database using patterns and profiles as motif descriptors. Brief Bioinform 3 (3):265-74. Smith, C. 2003. Drug target validation: Hitting the target. Nature 422 (6929):341, 343, 345 passim. Stenmark, H., and R. Aasland. 1999. FYVE-finger proteins—effectors of an inositol lipid. J Cell 5c/ 112 (Pt23):4175-83. Stephens, L., C. Ellson, and P. Hawkins. 2002. Roles of PI3Ks in leukocyte chemotaxis and phagocytosis. Curr Opin Cell Biol 14 (2):203-13. Tall, G. G., M . A. Barbieri, P. D. Stahl, and B. F. Horazdovsky. 2001. Ras-activated endocytosis is mediated by the Rab5 guanine nucleotide exchange activity of RIN1. Dev Cell 1 (l):73-82. Tjelle, T. E., T. Lovdal, and T. Berg. 2000. Phagosome dynamics and function. Bioessays 22 (3):255-63. Tomita, M . 2001. Whole-cell simulation: a grand challenge of the 21st century. Trends Biotechnol 19 (6):205-10. Tomita, M. , K. Hashimoto, K. Takahashi, T. S. Shimizu, Y. Matsuzaki, F. Miyoshi, K. Saito, S. Tanida, K. Yugi, J. C. Venter, and C. A. Hutchison, 3rd. 1999. E-CELL: software environment for whole-cell simulation. Bioinformatics 15 (l):72-84. Vanhaesebroeck, B., S. J. Leevers, K. Ahmadi, J. Timms, R. Katso, P. C. Driscoll, R. Woscholski, P. J. 
Parker, and M . D. Waterfield. 2001. Synthesis and function of 3-phosphorylated inositol lipids. Annu Rev Biochem 70:535-602. Vanhaesebroeck, B., and M . D. Waterfield. 1999. Signaling by distinct classes of phosphoinositide 3-kinases. Exp Cell Res 253 (l):239-54. Vasudeva, K., and U. S. Bhalla. 2004. Adaptive stochastic-deterministic chemical kinetic simulations. Bioinformatics 20 (l):78-84. Velasco-Velazquez, M . A., D. Barrera, A. Gonzalez-Arenas, C. Rosales, and J. Agramonte-Hevia. 2003. Macrophage—Mycobacterium tuberculosis interactions: role of complement receptor 3. Microb Pathog 35 (3): 125-31. 80 Vieira, O. V., R. J. Botelho, L. Rameh, S. M . Brachmann, T. Matsuo, H. W. Davidson, A. Schreiber, J. M . Backer, L. C. Cantley, and S. Grinstein. 2001. Distinct roles of class I and class III phosphatidylinositol 3-kinases in phagosome formation and maturation. J Cell flio/155(l):19-25. Vieira, O. V., C. Bucci, R. E. Harrison, W. S. Trimble, L. Lanzetti, J. Gruenberg, A. D. Schreiber, P. D. Stahl, and S. Grinstein. 2003. Modulation of Rab5 and Rab7 recruitment to phagosomes by phosphatidylinositol 3-kinase. Mol Cell Biol 23 (7):2501-14. Visual, Knowledge. 2004 [cited. Available from http://www.visualknowledge.com. von Mering, C , L. J. Jensen, B. Snel, S. D. Hooper, M . Krupp, M . Foglierini, N . Jouffre, M . A. Huynen, and P. Bork. 2005. STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res 33 Database Issue:D433-7. Wilkinson, M . D., and M . Links. 2002. BioMOBY: an open source biological web services proposal. Brief Bioinform 3 (4):331-41. Wurmser, A. E., J. D. Gary, and S. D. Emr. 1999. Phosphoinositide 3-kinases and their FYVE domain-containing effectors as regulators of vacuolar/lysosomal membrane trafficking pathways. J Biol Chem 274 (14):9129-32. Wymann, M . P., M . Zvelebil, and M . Laffargue. 2003. Phosphoinositide 3-kinase signalling— which way to target? 
Trends Pharmacol Sci 24 (7):366-76.
Xenarios, I., L. Salwinski, X. J. Duan, P. Higney, S. M. Kim, and D. Eisenberg. 2002. DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res 30 (1):303-5.
Xu, S., A. Cooper, S. Sturgill-Koszycki, T. van Heyningen, D. Chatterjee, I. Orme, P. Allen, and D. G. Russell. 1994. Intracellular trafficking in Mycobacterium tuberculosis and Mycobacterium avium-infected macrophages. J Immunol 153 (6):2568-78.
Yu, H., C. Friedman, A. Rzhetsky, and P. Kra. 1999. Representing genomic knowledge in the UMLS semantic network. Proc AMIA Symp:181-5.
Zanzoni, A., L. Montecchi-Palazzi, M. Quondam, G. Ausiello, M. Helmer-Citterich, and G. Cesareni. 2002. MINT: a Molecular INTeraction database. FEBS Lett 513 (1):135-40.

APPENDICES

Appendix A - Definitions of the icons

Icons used in the visualization of the semantic model (Figures 7-11, 13 and 14). [Icon images not reproduced; labels only.]

Biological structures: cell; macromolecule; domain or site; intracellular compartment; small molecule.

Events: localization; translocation; allosteric regulation; condition (allosteric regulation); response (allosteric regulation); non-covalent interaction; binding event A; binding event B; covalent interaction; enzyme event; substrate event; product event; cellular response; condition (cellular response).

States: bound (binding state); not-bound (binding state); phosphorylated (phosphorylation state); not-phosphorylated (phosphorylation state); functional for non-covalent int. (conformational state); non-functional for non-covalent int. (conformational state); functional for covalent int. (conformational state); non-functional for covalent int. (conformational state).

Additional icons used in the visualization of the interaction maps (Figures 15-17): positive allosteric regulation; promotion (derived relationship); negative allosteric regulation; inhibition (derived relationship).

Appendix B - Semantic Network Environment for Cell-modeling (SNEC)

We have implemented a website, Semantic Network Environment for Cell-modeling (SNEC), using the VK application development environment. Through this interactive website, users can utilize the developed semantic model and the resources in the BioCAD database to build intracellular pathways collaboratively. SNEC allows data from external sources such as the literature to be incorporated into the pathway models, including data on the localization of proteins, the organization of domains and sites in macromolecules, allosteric regulations, the conditions and effects of interactions, and the promotion and inhibition of cellular responses. The unique features of SNEC are described in the following sections.

Utilize current information to customize biological structures

SNEC allows users to create different biological structures, including cells, intracellular compartments, proteins, domains and small molecules, based on the annotated biological concepts in the database. For example, as a first step in creating a user-specific pathway model, a user searches for an existing cell prototype such as the human macrophage in the BioCAD database, and a user-specific instance of the prototypical cell is created. By default, SNEC also creates three intracellular compartments for the user-cell (plasma membrane, cytosol and nucleus); additional compartments can be added by the user later. The user can search the protein database in BioCAD by protein synonym or accession number; an example search for PIK3CA is shown in Appendix B1. When the user finds and adds a protein, a user-specific instance of the prototypical protein is created in the database.
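
The instantiation step described above can be pictured with a small in-memory sketch. This is not SNEC's actual implementation (which was built in the VK environment); the class names, the `CATALOG` dictionary and the `search_proteins` and `add_protein` helpers are hypothetical, and the extra "phagosome" compartment is only an illustration. The PIK3CA and HRAS identifiers and the three default compartments are the ones given in this appendix.

```python
from dataclasses import dataclass, field

# Default compartments created for every user-specific cell (per the text above).
DEFAULT_COMPARTMENTS = ("plasma membrane", "cytosol", "nucleus")

# Hypothetical stand-in for the BioCAD protein table: name -> (accession, synonyms).
CATALOG = {
    "PIK3CA": ("NM_006218", ["PI3-KINASE P110 SUBUNIT ALPHA", "PTDINS-3-KINASE P110"]),
    "HRAS": ("NM_176795", []),
}


@dataclass
class UserCell:
    prototype: str  # name of the prototypical cell, e.g. "Macrophage_human"
    owner: str
    compartments: list = field(default_factory=lambda: list(DEFAULT_COMPARTMENTS))

    def add_compartment(self, name: str) -> None:
        """Add an extra compartment later, ignoring duplicates."""
        if name not in self.compartments:
            self.compartments.append(name)


@dataclass
class UserProtein:
    prototype: str
    accession: str
    owner: str


def search_proteins(query: str) -> list:
    """Return prototype names whose name, accession or a synonym contains the query."""
    q = query.lower()
    hits = []
    for name, (accession, synonyms) in CATALOG.items():
        searchable = [name, accession, *synonyms]
        if any(q in text.lower() for text in searchable):
            hits.append(name)
    return hits


def add_protein(name: str, owner: str) -> UserProtein:
    """Create a user-specific instance of a prototypical protein."""
    accession, _synonyms = CATALOG[name]
    return UserProtein(prototype=name, accession=accession, owner=owner)


cell = UserCell(prototype="Macrophage_human", owner="Mike")
cell.add_compartment("phagosome")  # illustrative extra compartment
protein = add_protein(search_proteins("p110")[0], owner="Mike")
```

Keeping the prototype name on each instance mirrors the prototype/instance distinction in the text: several users can hold their own copies of the same prototypical cell or protein without altering the shared annotations.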
The hyperlink on the user-protein directs the user to a protein detail page (Appendix B2), which contains annotations that have previously been incorporated into BioCAD. These annotations include the gene locus name, protein synonyms, accession numbers (RefSeq ID, GenBank ID), descriptions of protein function and Gene Ontology classifications.

When the user-protein is created, instances of any Pfam domains that are components of the prototypical protein are also created and connected to the protein instance. Appendix B4 shows that the user can add further domains and sites from sources such as Prosite, Prefile, Prints, Prodom, Profile, Smart, SSF and TigrFams. The user also has the option to create a user-defined domain or a phosphorylation site.

Define the behavior of molecules by creating different types of events

After the user has added cells, proteins or other biological components, different types of events can be created. The navigation buttons on the left panel of the website allow users to move to detailed web pages for events such as localizations. Appendix B3 shows that localization events are defined by selecting an intracellular compartment in a user-specific cell. The user can also create allosteric regulations on proteins by adding an allosteric regulation event and its corresponding condition and response events (Appendix B5, B6 and B7). Non-covalent and covalent interactions are defined by specifying the participating molecules, the domains and sites involved, and their states (Appendix B8, B9 and B10). In addition, the user can create or re-use a cellular response event in the database and define the occurrence of the cellular response by adding a condition, which specifies a particular molecule and its required states (Appendix B11).
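
One way to picture the event types just described is as plain records that refer to a protein's domains by name. The sketch below is an assumption-laden illustration, not the SNEC data model: the record classes and the `check_domains` helper are hypothetical. The domain names and residue ranges are those reported for PIK3CA in the SNEC domain/site screenshot (B4).

```python
from dataclasses import dataclass

# Pfam domains of PIK3CA with residue ranges, as reported in screenshot B4.
PIK3CA_DOMAINS = {
    "PI3K_P85B": (31, 108),
    "PI3K_RBD": (173, 292),
    "PI3K_C2": (350, 485),
    "PI3KA": (519, 704),
    "PI3_PI4_kinase": (796, 1015),
}


@dataclass(frozen=True)
class Localization:
    protein: str
    cell: str
    compartment: str


@dataclass(frozen=True)
class Condition:
    protein: str
    domain: str
    binding_state: str  # "bound" or "not-bound"


@dataclass(frozen=True)
class Response:
    protein: str
    domain: str
    conformational_state: str  # e.g. "functional for covalent interaction"


@dataclass(frozen=True)
class AllostericRegulation:
    conditions: tuple
    responses: tuple


def check_domains(regulation, domains_by_protein):
    """Return the (protein, domain) pairs used by the regulation but not defined."""
    missing = []
    for event in (*regulation.conditions, *regulation.responses):
        known = domains_by_protein.get(event.protein, {})
        if event.domain not in known:
            missing.append((event.protein, event.domain))
    return missing


# The Ras-dependent activation of the kinase domain described in this appendix.
ras_activation = AllostericRegulation(
    conditions=(Condition("PIK3CA", "PI3K_RBD", "bound"),),
    responses=(Response("PIK3CA", "PI3_PI4_kinase", "functional for covalent interaction"),),
)
where = Localization("PIK3CA", "Macrophage_human", "cytosol")
problems = check_domains(ras_activation, {"PIK3CA": PIK3CA_DOMAINS})
```

An empty `problems` list means every condition and response refers to a domain that exists on the protein instance, which is the kind of consistency the combo boxes on the SNEC pages enforce by construction.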
Analyze and traverse interactions upstream and downstream

One of the most distinctive features of SNEC is the connection between molecular interactions (non-covalent and covalent interactions) and the conformational and functional changes (allosteric regulations) that those interactions cause in the participating molecules. For every allosteric regulation, SNEC identifies the "upstream" interactions that can satisfy the allosteric conditions and the "downstream" interactions that are affected by the allosteric responses.

For instance, Appendix B5 shows that one of the allosteric regulations on PIK3CA (PI3K-p110) is the activation of the "PI3_PI4_kinase" domain when the "PI3K_RBD" domain is bound. The hyperlink on the condition event directs the user to a detail page (Appendix B6), which identifies upstream interactions that can satisfy this condition. In this example, the condition is satisfied by the non-covalent interaction between HRAS and PIK3CA.

It should be noted that molecular interactions and allosteric regulations are connected indirectly, through their common domains and states. This indirect connection allows the information on an interaction and the information on an allosteric regulation to be specified independently, so that complex knowledge can be broken down and stored as several individual but interconnected pieces of information. Dynamic operations were implemented to search for and report any molecular interaction that can satisfy, or be affected by, an allosteric regulation, which allows the information to be integrated.

The allosteric response detail page (Appendix B7) shows the molecular interactions that are affected (enabled or disabled) by the response. In this example, switching the conformational state of the "PI3_PI4_kinase" domain to "functional for covalent interaction" enables the covalent interaction between PIK3CA and the PIP2 molecule.
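
The indirect link through shared domains and states can be sketched as a lookup: an interaction and a regulation are related only because they name the same (protein, domain) pair. The code below is a minimal illustration under stated assumptions, not the dynamic operations implemented in SNEC. The class and function names are hypothetical, and only "bound" conditions and "functional for covalent interaction" responses are handled. The HRAS, PIK3CA, PIP2 and PIP3 example is the one used in this appendix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NonCovalentInteraction:
    molecule_a: str
    domain_a: str
    molecule_b: str
    domain_b: str

    def bound_sites(self):
        """(molecule, domain) pairs that are in the 'bound' state once binding occurs."""
        return {(self.molecule_a, self.domain_a), (self.molecule_b, self.domain_b)}


@dataclass(frozen=True)
class CovalentInteraction:
    enzyme: str
    enzyme_domain: str
    substrate: str
    product: str


@dataclass(frozen=True)
class AllostericRegulation:
    condition: tuple  # (protein, domain, required binding state)
    response: tuple   # (protein, domain, resulting conformational state)


def upstream(regulation, non_covalent):
    """Non-covalent interactions that can satisfy the regulation's condition."""
    protein, domain, state = regulation.condition
    if state != "bound":
        return []
    return [i for i in non_covalent if (protein, domain) in i.bound_sites()]


def downstream(regulation, covalent):
    """Covalent interactions enabled by the regulation's response."""
    protein, domain, state = regulation.response
    if state != "functional for covalent interaction":
        return []
    return [i for i in covalent if (i.enzyme, i.enzyme_domain) == (protein, domain)]


ras_binding = NonCovalentInteraction("HRAS", "PI3K-p110 binding domain", "PIK3CA", "PI3K_RBD")
pip2_to_pip3 = CovalentInteraction("PIK3CA", "PI3_PI4_kinase", "PIP2", "PIP3")
regulation = AllostericRegulation(
    condition=("PIK3CA", "PI3K_RBD", "bound"),
    response=("PIK3CA", "PI3_PI4_kinase", "functional for covalent interaction"),
)

satisfied_by = upstream(regulation, [ras_binding])
enables = downstream(regulation, [pip2_to_pip3])
```

Because neither interaction record mentions the regulation, each piece can be entered independently and the link is recomputed on demand, which is the design property the paragraph above emphasizes.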
The hyperlink on the covalent interaction brings up a detail page for that interaction (Appendix B10). On both the non-covalent and the covalent interaction pages (Appendix B9 and B10), the dynamic reports show the allosteric conditions that either activate or deactivate the molecular interaction, and the allosteric responses that are caused by the interaction. For example, the non-covalent interaction between HRAS and PIK3CA (Appendix B9) requires the "SMALLGTP" binding site on HRAS to be bound. After this interaction has occurred, the allosteric response indicates that the "PI3_PI4_kinase" domain on PIK3CA becomes "functional".

The context of a molecular interaction page (Appendix B9 or B10) is an interaction event, and the page contains hyperlinks to allosteric regulation pages. The context of an allosteric regulation page is either the allosteric condition event (Appendix B6) or the response event (Appendix B7), and the page contains hyperlinks to the molecular interactions. This design allows users to traverse between upstream and downstream interactions in either direction, with allosteric regulations serving as the intermediate, integrating points. On the molecular interaction page, participating molecules (binding molecules, enzymes, substrates and products) also have hyperlinks to their own interaction summary pages, such as the one shown in Appendix B8. This allows users to switch the current context to another molecule and explore parallel interactions or cellular responses that involve that molecule.

SNEC enabled us to build and traverse the macrophage pathways and to identify cellular responses that can be affected by MTB. The pathways and their participating components are described in Section 3.2 of the main text.
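
The two-way navigation described in this section amounts to walking a small directed graph in which allosteric regulations sit between interactions. The following sketch assumes the hyperlinks can be flattened into "leads to" pairs; the event labels, the `LINKS` list and the `traverse` function are hypothetical names used only for illustration, and the three events are the HRAS, PIK3CA and PIP2 example from this appendix.

```python
from collections import deque

BINDING = "non-covalent: HRAS binds PIK3CA"
REGULATION = "allosteric: PI3K_RBD bound -> PI3_PI4_kinase functional"
CATALYSIS = "covalent: PIK3CA converts PIP2 to PIP3"

# Directed links: interaction -> allosteric regulation -> interaction.
LINKS = [
    (BINDING, REGULATION),
    (REGULATION, CATALYSIS),
]


def build_index(links, reverse=False):
    """Map each event to its neighbours, following links forwards or backwards."""
    index = {}
    for source, target in links:
        if reverse:
            source, target = target, source
        index.setdefault(source, []).append(target)
    return index


def traverse(start, links, direction="downstream"):
    """Breadth-first walk from start; returns reachable events in visiting order."""
    index = build_index(links, reverse=(direction == "upstream"))
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in index.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


effects_of_binding = traverse(BINDING, LINKS, direction="downstream")
causes_of_catalysis = traverse(CATALYSIS, LINKS, direction="upstream")
```

The same link list serves both directions, so adding one regulation makes it reachable from both its upstream and its downstream interaction pages.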
SNEC - screenshots

B1 - Protein search page

A protein can be searched for by its synonym or accession number, and the search can be filtered by organism. The top report shows proteins in the BioCAD database, and the second report shows proteins in the user's pathway models. The navigation buttons on the left side are used to navigate between different biological concepts such as cells and interactions. The hyperlink on a protein links to the protein's detail page.

B2 - Protein's detail page

The context of this page is a user-created protein. The reports show annotations derived from various sources, including synonyms, database accession numbers, general descriptions of the protein and GO classifications. The navigation tree on the left has been expanded to show the additional pages for the protein. Each navigation button directs the user to a specific web page for "Localization", "Domain & Sites", "Allosteric regulation" or "Interactions".

B3 - Protein's localization page

The context of this page is a user-created protein. The search at the top of the page allows the user to search for cells that he or she has created. After a cell has been selected, the user adds a compartment as a possible location for the protein. The bottom report shows the localizations that have previously been specified for the protein.

B4 - Protein's domain/site page

The context of this page is a user-created protein. The first report shows all annotated domains and sites on this protein (e.g. Pfam domains). The second report shows any user-defined domains or sites. The three action buttons create a domain, a site or a PTM, respectively. A user can also add further domains and sites as annotated by other sources.

B5 - Allosteric regulation summary page

The context of this page is a user-created protein. The first report shows all the allosteric regulations that involve this protein. A new allosteric regulation can be added with an action button.
The hyperlinks in this report bring up the details of an allosteric regulation and display the condition and response reports located at the bottom of the page. A new condition or response can be added via the corresponding action buttons. The reference button brings up a reference page where a research article can be added to support the allosteric regulation. The hyperlink in the condition report links to a condition detail page, and the hyperlink in the response report links to a response detail page.

B6 - Allosteric regulation's detail - <condition> page

The context of this page is a condition event of an allosteric regulation. The first report shows the information that is currently associated with the condition event. The information can be modified by selecting any protein that can interact with the original protein, which allows an allosteric regulation to be defined across proteins in the same complex. The combo boxes are used to specify the domain, binding state and phosphorylation state that are required for the condition. After the information has been specified, the two dynamic reports at the bottom are updated to show any non-covalent or covalent interaction that can "satisfy" the condition. In this example, a non-covalent interaction between HRAS and PIK3CA can satisfy the condition. The hyperlinks in the reports link to interaction detail pages.

B7 - Allosteric regulation's detail - <response> page

The context of this page is a response event of an allosteric regulation. The first report shows the information that is currently associated with this response event. The information can be modified by selecting any protein that can interact with the original protein, which allows an allosteric regulation to be defined across proteins in the same complex. The combo boxes are used to specify the domain and the conformational states that it changes to as a result of the response.
After the information has been specified, the dynamic reports are updated to show any non-covalent or covalent interaction that is "enabled or disabled" by the response. In this example, a covalent interaction catalyzed by the enzyme PIK3CA is enabled by the response (because of the kinase domain and the functional state associated with the response). The hyperlinks in the reports link to interaction detail pages.

B8 - Interaction summary page

The context of this page is a user-created protein. This page shows the non-covalent interactions, covalent interactions and cellular responses that involve this protein. Each report shows the details of the interactions, and new interactions can be added with the different action buttons. The hyperlinks in the reports link to the detail pages for interactions or cellular responses.

B9 - Non-covalent interaction's detail page

The context of this page is a non-covalent interaction event. To specify the interaction, a user first searches for a molecule in his or her models. The molecule is then added as one of the binding molecules (A or B). The two reports in the middle show the binding molecules currently associated with this event. The binding domains and any additional phosphorylation states (optional) are specified with combo boxes. The dynamic reports at the bottom show any activating condition that is required for this interaction to occur, as well as any deactivating condition that prevents it. For example, this non-covalent interaction requires the "SMALLGTP" binding domain on HRAS to be in the "bound" state. The bottom report also shows the effects or consequences of the interaction if it occurs. For instance, this interaction can cause the kinase domain on PIK3CA to become "functional for covalent interaction". The hyperlinks in the condition report link to condition events of allosteric regulations, and the hyperlinks in the effect report link to response events of allosteric regulations.
B10 - Covalent interaction's detail page

The context of this page is a covalent interaction event. To specify the interaction, a user first searches for a molecule in his or her models. The molecule is then added as an enzyme, a substrate or a product. The three reports in the middle show the molecules currently associated with this event. An active site or a catalytic domain can be specified for the enzyme. In addition, a modification site and its corresponding phosphorylation state can be specified if the substrate is a protein. The dynamic reports at the bottom show any activating condition that is required for this interaction to occur, as well as any deactivating condition that prevents it. For example, this covalent interaction requires the "PI3K_RBD" (the Ras-binding domain) on PIK3CA to be in the "bound" state. On the other hand, the interaction is inhibited if the "PI3K_P85B" (the p85-binding domain) on PIK3CA is bound. The bottom report also shows the effects or consequences of the interaction. The hyperlinks in the condition report link to condition events of allosteric regulations, and the hyperlinks in the effect report link to response events of allosteric regulations.

B11 - Cellular response detail page

The context of this page is a cellular response event. The report shows all conditions that are required for this cellular response to occur. For example, the response "actin polymerization and rearrangement" occurs when the "GTP binding domain" on ARF6 is bound.
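
The activating and deactivating conditions reported on the interaction and cellular response pages can be read as a simple predicate over the current binding states. The sketch below rests on two assumptions that the text does not state: that all activating conditions must hold while no deactivating condition holds, and that a site with no recorded state counts as "not-bound". The function names are hypothetical; the conditions are the PIK3CA and ARF6 examples given above.

```python
def holds(condition, states):
    """True if the (protein, domain) site currently has the required binding state."""
    protein, domain, required = condition
    # Assumption: a site with no recorded state is treated as "not-bound".
    return states.get((protein, domain), "not-bound") == required


def can_occur(activating, deactivating, states):
    """Assumed rule: every activating condition holds and no deactivating one does."""
    all_activated = all(holds(c, states) for c in activating)
    any_blocked = any(holds(c, states) for c in deactivating)
    return all_activated and not any_blocked


# Covalent interaction catalyzed by PIK3CA (B10 description above).
activating = [("PIK3CA", "PI3K_RBD", "bound")]
deactivating = [("PIK3CA", "PI3K_P85B", "bound")]

ras_bound = {("PIK3CA", "PI3K_RBD"): "bound"}
ras_and_p85_bound = {
    ("PIK3CA", "PI3K_RBD"): "bound",
    ("PIK3CA", "PI3K_P85B"): "bound",
}

enabled = can_occur(activating, deactivating, ras_bound)
inhibited = not can_occur(activating, deactivating, ras_and_p85_bound)

# Cellular response "actin polymerization and rearrangement" (B11 description above).
actin_conditions = [("ARF6", "GTP binding domain", "bound")]
actin_response = can_occur(actin_conditions, [], {("ARF6", "GTP binding domain"): "bound"})
```

A different combination rule (for example, any one activating condition being sufficient) would only change the `all` in `can_occur` to `any`; the source does not say which rule SNEC applies.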
[Screenshot B1 - Protein search page]
Shown: a search for "PIK3CA" in Homo sapiens. The BioCAD report lists PIK3CA (synonyms PTDINS-3-KINASE P110 and PI3-KINASE P110 SUBUNIT ALPHA; accession NM_006218; SwissProt ID P42336; SwissProt name P11A_HUMAN), and the user's models contain PIK3CA (NM_006218).

[Screenshot B2 - Protein's detail page]
Shown: annotations for PIK3CA, including its synonyms (PTDINS-3-KINASE P110; PI3-KINASE P110 SUBUNIT ALPHA; PHOSPHATIDYLINOSITOL-4,5-BISPHOSPHATE 3-KINASE CATALYTIC SUBUNIT, ALPHA ISOFORM; EC 2.7.1.153; PI3K), its accessions (GenBank 5453891 and 5453892; RefSeq NM_006218.1 and NP_006209.1; SWISS-PROT P42336) and its Gene Ontology terms (signal transduction, GO:0007165; 1-phosphatidylinositol 3-kinase complex, GO:0005942; phosphatidylinositol 3-kinase activity, GO:0016303; transferase activity, GO:0016740).

[Screenshot B3 - Protein's localization page]
Shown: PIK3CA localized to the cytosol and the plasma membrane of the user cell "Macrophage_human".

[Screenshot B4 - Protein's domain/site page]
Shown: the Pfam domains of PIK3CA: PI3K_P85B (PF02192, residues 31-108), PI3K_RBD (PF00794, 173-292), PI3K_C2 (PF00792, 350-485), PI3KA (PF00613, 519-704) and PI3_PI4_KINASE (PF00454, 796-1015), together with one user-defined domain ("adaptor binding domain").

[Screenshot B5 - Allosteric regulation summary page]
Shown: two allosteric regulations on PIK3CA. When PI3K_P85B is bound, PI3_PI4_KINASE becomes non-functional for covalent interaction (reference: Vanhaesebroeck, 1999); when PI3K_RBD is bound, PI3_PI4_KINASE becomes functional for covalent interaction (reference: Cantley, 2002).

[Screenshot B6 - Allosteric regulation's detail - <condition> page]
Shown: the condition that PI3K_RBD (residues 173-292) on PIK3CA is bound, which is satisfied by the non-covalent interaction between HRAS (PI3K-p110 binding domain) and PIK3CA (PI3K_RBD); no covalent interaction satisfies it.

[Screenshot B7 - Allosteric regulation's detail - <response> page]
Shown: the response that PI3_PI4_KINASE (residues 796-1015) on PIK3CA becomes functional for covalent interaction, which enables the covalent interaction with PIK3CA as enzyme, PIP2 as substrate and PIP3 as product.

[Screenshot B8 - Interaction summary page]
Shown: the interactions of PIK3CA: a non-covalent interaction with HRAS (NM_176795) through PI3K_RBD, a covalent interaction with PIP2 as substrate and PIP3 as product, and no cellular responses.

[Screenshot B9 - Non-covalent interaction's detail page]
Shown: the interaction between HRAS (PI3K-p110 binding domain) and PIK3CA (PI3K_RBD), with one activating condition on HRAS (binding site at residues 1-161 in the bound state), no deactivating conditions, and the effect that PI3_PI4_KINASE on PIK3CA becomes functional for covalent interaction.

[Screenshot B10 - Covalent interaction's detail page]
Shown: the interaction with PIK3CA (PI3_PI4_KINASE) as enzyme, PIP2 as substrate and PIP3 as product. Activating conditions: SH2 (residues 333-408) on PIK3R1 bound; PI3K_RBD on PIK3CA bound. Deactivating condition: PI3K_P85B on PIK3CA bound.

[Screenshot B11 - Cellular response detail page]
[Screenshot. Cellular response: Actin polymerization and rearrangement (1 reference). Condition: ARF6, GTP binding domain, binding state Bound.]

Appendix C - List of molecules and events in the macrophage pathway model

C1 - Molecules in the macrophage model

| Name | Synonym | Type | RefSeq |
|---|---|---|---|
| ACTN1 | ALPHA-ACTININ CYTOSKELETAL ISOFORM, F-ACTIN CROSS LINKING PROTEIN, ALPHA-ACTININ 1, NON-MUSCLE ALPHA-ACTININ 1 | Protein | NP_001093.1 |
| AKT1 | EC 2.7.1.-, PROTEIN KINASE B, PKB, RAC-ALPHA SERINE/THREONINE KINASE, C-AKT, RAC-PK-ALPHA | Protein | NP_005154.1 |
| AKT2 | RAC-PK-BETA, EC 2.7.1.-, PKB BETA, RAC-BETA SERINE/THREONINE PROTEIN KINASE, PROTEIN KINASE AKT-2, PROTEIN KINASE B, BETA | Protein | NP_001617.1 |
| AKT3 | RAC-PK-GAMMA, PKB GAMMA, RAC-GAMMA SERINE/THREONINE PROTEIN KINASE, EC 2.7.1.-, PROTEIN KINASE AKT-3, PROTEIN KINASE B, GAMMA, STK-2 | Protein | NP_859029.1 |
| AKT3 | RAC-PK-GAMMA, PKB GAMMA, RAC-GAMMA SERINE/THREONINE PROTEIN KINASE, EC 2.7.1.-, PROTEIN KINASE AKT-3, PROTEIN KINASE B, GAMMA, STK-2 | Protein | NP_005456.1 |
| ARF6 | ADP-ribosylation factor 6 | Protein | NP_001654.1 |
| BAD | BAD, BCL-2-LIKE 8 PROTEIN, BCL-XL/BCL-2 ASSOCIATED DEATH PROMOTER, BCL-2 BINDING COMPONENT 6, BCL2-ANTAGONIST OF CELL DEATH | Protein | NP_004313.1 |
| BCL2 | APOPTOSIS REGULATOR BCL-2 | Protein | NP_000624.1 |
| C3 | COMPLEMENT C3 PRECURSOR | Protein | NP_000055.1 |
| CCNB1 | G2/MITOTIC-SPECIFIC CYCLIN B1 | Protein | NP_114172.1 |
| CCND1 | BCL-1 ONCOGENE, PRAD1 ONCOGENE, G1/S-SPECIFIC CYCLIN D1 | Protein | NP_444284.1 |
| CD14 | MYELOID CELL-SPECIFIC LEUCINE-RICH GLYCOPROTEIN, MONOCYTE DIFFERENTIATION ANTIGEN CD14 PRECURSOR | Protein | NP_000582.1 |
| CDC2 | EC 2.7.1.-, CDK1, CYCLIN-DEPENDENT KINASE 1, CELL DIVISION CONTROL PROTEIN 2 HOMOLOG, P34 PROTEIN KINASE | Protein | NP_001777.1 |
| CDC25A | EC 3.1.3.48, M-PHASE INDUCER PHOSPHATASE 1, DUAL SPECIFICITY PHOSPHATASE CDC25A | Protein | NP_001780.1 |
| CDC27 | CELL DIVISION CYCLE PROTEIN 27 HOMOLOG, H-NUC, CDC27HS | Protein | NP_001247.2 |
| CDK10 | EC 2.7.1.-, CELL DIVISION PROTEIN KINASE 10, SERINE/THREONINE-PROTEIN KINASE PISSLRE | Protein | NP_003665.2 |
| CDK2 | EC 2.7.1.-, P33 PROTEIN KINASE, CELL DIVISION PROTEIN KINASE 2 | Protein | NP_001789.2 |
| CDK7 | CDK-ACTIVATING KINASE, EC 2.7.1.-, CAK, STK1, 39 KDA PROTEIN KINASE, TFIIH BASAL TRANSCRIPTION FACTOR COMPLEX KINASE SUBUNIT, P39 MO15, CAK1, CELL DIVISION PROTEIN KINASE 7 | Protein | NP_001790.1 |
| CHUK | EC 2.7.1.-, INHIBITOR OF NUCLEAR FACTOR KAPPA-B KINASE ALPHA SUBUNIT, NUCLEAR FACTOR NFKAPPAB INHIBITOR KINASE ALPHA, I-KAPPA-B KINASE 1, IKK1, IKK-ALPHA, CONSERVED HELIX-LOOP-HELIX UBIQUITOUS KINASE, IKK-A, IKAPPAB KINASE, IKAPPA-B KINASE ALPHA, NFKBIKA, IKBKA | Protein | NP_001269.2 |
| CKS2 | CKS-2, CYCLIN-DEPENDENT KINASES REGULATORY SUBUNIT 2 | Protein | NP_001818.1 |
| EEA1 | early endosome antigen 1 | Protein | NP_003557.1 |
| FCGR1A | CD64 ANTIGEN, FCRI, HIGH AFFINITY IMMUNOGLOBULIN GAMMA FC RECEPTOR I PRECURSOR, FC-GAMMA RI, IGG FC RECEPTOR I | Protein | NP_000557.1 |
| FGR | EC 2.7.1.112, P55-FGR, PROTO-ONCOGENE TYROSINE-PROTEIN KINASE FGR, C-FGR | Protein | NP_005239.1 |
| GAB2 | GRB2-associated binding protein 2 | Protein | NP_536739.1 |
| GDP | guanosine diphosphate | Nucleotide | |
| GRB2 | growth factor receptor-bound protein 2 | Protein | NP_002077.1 |
| GSK3B | EC 2.7.1.37, GSK-3 BETA, GLYCOGEN SYNTHASE KINASE-3 BETA | Protein | NP_002084.2 |
| GTP | guanosine triphosphate | Nucleotide | |
| HCK | HEMOPOIETIC CELL KINASE, EC 2.7.1.112, TYROSINE-PROTEIN KINASE HCK, P59-HCK/P60-HCK | Protein | NP_002101.2 |
| HRAS | TRANSFORMING PROTEIN P21/H-RAS-1, C-H-RAS | Protein | NP_789765.1 |
| IGHG3 | HDC, IG GAMMA-3 CHAIN C REGION, HEAVY CHAIN DISEASE PROTEIN | Protein | NG_001019 |
| ITGAM | NEUTROPHIL ADHERENCE RECEPTOR, CR-3 ALPHA CHAIN, CELL SURFACE GLYCOPROTEIN MAC-1 ALPHA SUBUNIT, INTEGRIN ALPHA-M PRECURSOR, CD11B, LEUKOCYTE ADHESION RECEPTOR MO1 | Protein | NP_000623.1 |
| ITGB2 | CD18, COMPLEMENT RECEPTOR C3 BETA-SUBUNIT, CELL SURFACE ADHESION GLYCOPROTEINS LFA-1/CR3/P150,95 BETA-SUBUNIT, INTEGRIN BETA-2 PRECURSOR | Protein | NP_000202.1 |
| LPS | lipopolysaccharide | Phospholipid | |
| LYN | EC 2.7.1.112, TYROSINE-PROTEIN KINASE LYN | Protein | NP_002341.1 |
| ManLAM | Mannose-capped lipoarabinomannan | Phospholipid | |
| MAP3K14 | EC 2.7.1.37, NF-KAPPA BETA-INDUCING KINASE, HSNIK, SERINE/THREONINE PROTEIN KINASE NIK, MITOGEN-ACTIVATED PROTEIN KINASE KINASE KINASE 14 | Protein | NP_003945.1 |
| MAP3K8 | EC 2.7.1.-, C-COT, MITOGEN-ACTIVATED PROTEIN KINASE KINASE KINASE 8, COT PROTO-ONCOGENE SERINE/THREONINE-PROTEIN KINASE, CANCER OSAKA THYROID ONCOGENE | Protein | NP_005195.2 |
| NCF4 | NCF-4, P40PHOX, P40-PHOX, NEUTROPHIL CYTOSOL FACTOR 4, NEUTROPHIL NADPH OXIDASE FACTOR 4 | Protein | NP_000622.1 |
| NFKB1 | EBP-1, DNA-BINDING FACTOR KBF1, NUCLEAR FACTOR NF-KAPPA-B P105 SUBUNIT | Protein | NP_003989.1 |
| NFKB2 | ONCOGENE LYT-10, LYT10, NUCLEAR FACTOR NF-KAPPA-B P100/P49 SUBUNITS, H2TF1 | Protein | NP_002493.2 |
| NFKBIA | MAJOR HISTOCOMPATIBILITY COMPLEX ENHANCER-BINDING PROTEIN MAD3, IKB-ALPHA, I-KAPPA-B-ALPHA, IKAPPABALPHA, NF-KAPPAB INHIBITOR ALPHA | Protein | NP_065390.1 |
| PDK1 | EC 2.7.1.99, PYRUVATE DEHYDROGENASE KINASE ISOFORM 1, [PYRUVATE DEHYDROGENASE [LIPOAMIDE]] KINASE ISOZYME 1, MITOCHONDRIAL PRECURSOR | Protein | NP_002601.1 |
| PDK2 | EC 2.7.1.99, PYRUVATE DEHYDROGENASE KINASE ISOFORM 2, [PYRUVATE DEHYDROGENASE [LIPOAMIDE]] KINASE ISOZYME 2, MITOCHONDRIAL PRECURSOR | Protein | NP_002602.2 |
| PDPK1 | EC 2.7.1.37, PROTEIN PRO0461, 3-PHOSPHOINOSITIDE DEPENDENT PROTEIN KINASE-1, HPDK1 | Protein | NP_002604.1 |
| PI | phosphorylates phosphatidylinositol | Phospholipid | |
| PI3P | phosphatidylinositol-3-phosphate | Phospholipid | |
| PIK3C3 | phosphoinositide-3-kinase class 3, Vps34 | Protein | NP_002638.1 |
| PIK3CA | PTDINS-3-KINASE P110, PI3-KINASE P110 SUBUNIT ALPHA, PHOSPHATIDYLINOSITOL-4,5-BISPHOSPHATE 3-KINASE CATALYTIC SUBUNIT, ALPHA ISOFORM, EC 2.7.1.153, PI3K | Protein | NP_006209.1 |
| PIK3R1 | PTDINS-3-KINASE P85-ALPHA, PI3K, PHOSPHATIDYLINOSITOL 3-KINASE REGULATORY ALPHA SUBUNIT, PI3-KINASE P85-ALPHA SUBUNIT | Protein | NP_852664.1 |
| PIK3R1 | PTDINS-3-KINASE P85-ALPHA, PI3K, PHOSPHATIDYLINOSITOL 3-KINASE REGULATORY ALPHA SUBUNIT, PI3-KINASE P85-ALPHA SUBUNIT | Protein | NP_852665.1 |
| PIK3R4 | phosphoinositide-3-kinase, regulatory subunit 4, p150 | Protein | NP_055417.1 |
| PIP2 | phosphatidylinositol-4,5-bisphosphate | Phospholipid | |
| PIP3 | phosphatidylinositol-3,4,5-trisphosphate | Phospholipid | |
| PSCD3 | GENERAL RECEPTOR OF PHOSPHOINOSITIDES 1, CYTOHESIN 3, GRP1, ARNO3 PROTEIN, ARF NUCLEOTIDE-BINDING SITE OPENER 3 | Protein | NP_004218.1 |
| PXN | PAXILLIN | Protein | NP_002850.1 |
| RAB5A | RAS-RELATED PROTEIN RAB-5A | Protein | NP_004153.2 |
| RB1 | PP110, RETINOBLASTOMA-ASSOCIATED PROTEIN, RB, P105-RB | Protein | NP_000312.1 |
| RELA | NUCLEAR FACTOR NF-KAPPA-B P65 SUBUNIT, TRANSCRIPTION FACTOR P65 | Protein | NP_068810.1 |
| RELB | TRANSCRIPTION FACTOR RELB, I-REL | Protein | NP_006500.1 |
| RIN1 | RAS INHIBITOR JC99, RAS INTERACTION/INTERFERENCE PROTEIN 1, RAS AND RAB INTERACTOR 1 | Protein | NP_004283.1 |
| RPS6KB1 | EC 2.7.1.-, P70-S6K, RIBOSOMAL PROTEIN S6 KINASE, S6K | Protein | NP_003152.1 |
| SLC2A4 | SOLUTE CARRIER FAMILY 2, FACILITATED GLUCOSE TRANSPORTER, MEMBER 4, GLUCOSE TRANSPORTER TYPE 4, INSULIN-RESPONSIVE | Protein | NP_001033.1 |
| SOS1 | SOS-1, SON OF SEVENLESS PROTEIN HOMOLOG 1 | Protein | NP_005624.2 |
| TLN1 | TALIN 1 | Protein | NP_006280.2 |
| TLR2 | TOLL-LIKE RECEPTOR 2 PRECURSOR, TOLL/INTERLEUKIN 1 RECEPTOR-LIKE PROTEIN 4 | Protein | NP_003255.2 |
| YWHAB | PROTEIN KINASE C INHIBITOR PROTEIN-1, KCIP-1, 14-3-3 PROTEIN BETA/ALPHA, PROTEIN 1054 | Protein | NP_647539.1 |

Appendix C1. Molecules in the macrophage model. The table lists the prototypical molecules that have been included in the pathway model. The proteins include cell receptors such as FCGR1A (Fcγ), ITGAM (CD11b), ITGB2 (CD18), CD14 and TLR2 that are relevant to the internalization of bacteria by macrophages. Two distinct classes of PI3Ks have been modeled: the class I PI3K, composed of the p85 regulatory (PIK3R1) and p110 catalytic (PIK3CA) subunits, and the class III PI3K, composed of the p150 (PIK3R4) and PIK3C3 subunits. The pathway model contains various kinases such as Lyn, PDK1 (PDPK1) and AKT1, and small GTP-binding proteins including Ras (HRAS), ARF6 and Rab5a. The adaptor proteins Gab2 and Grb2 and the transcription factor NF-kB have also been incorporated into the model. Column 1: the molecule name (gene locus names for proteins). Column 2: the synonyms, or the EC number if the molecule is an enzyme. Column 3: the type of the molecule. Column 4: the accession number from RefSeq.

C2 - Non-covalent interactions in the macrophage model

| Molecule A | Domain A | Domain A accession number | Molecule B | Domain B | Domain B accession number | Reference |
|---|---|---|---|---|---|---|
| ACTN1 | integrin-binding domain | | ITGB2 | Cytoskeleton protein binding domain | | Velasco-Velazquez et al. 2003 |
| AKT1 | PH | PF00169 | PIP3 | binding site for PH domain | | Cantley 2002; Wymann, Zvelebil, and Laffargue 2003 |
| AKT2 | PH | PF00169 | PIP3 | | | Downward 2004 |
| AKT3 | PH | | PIP3 | | | Downward 2004 |
| ARF6 | GTP binding domain | | GTP | | | Cantley 2002; Stephens, Ellson, and Hawkins 2002 |
| ARF6 | Grp1 binding domain | | PSCD3 | SEC7 | PF01369 | Cantley 2002 |
| ARF6 | GDP binding domain | | GDP | | | Cantley 2002; Stephens, Ellson, and Hawkins 2002 |
| BCL2 | BH3 | PS01259 | BAD | BH3 | | Cantley 2002 |
| CCNB1 | Cdc2 binding | | CDC2 | Cyclin binding | | Pavletich 1999 |
| CD14 | LPS binding domain | | LPS | | | Hmama et al. 1999 |
| EEA1 | Rab5 binding | | RAB5A | EEA1 binding | | Vieira et al. 2003 |
| EEA1 | FYVE | | PI3P | | | Stenmark and Aasland 1999; Wurmser, Gary, and Emr 1999 |
| FGR | CD18 binding domain | | ITGB2 | Src-family tyrosine kinase binding domain | | Velasco-Velazquez et al. 2003 |
| GAB2 | PH | | PIP3 | binding site for PH domain | | Gu et al. 2003 |
| GAB2 | pYxxM | | PIK3R1 | SH2 | PF00017 | Cantley 2002 |
| HCK | CD18 binding domain | | ITGB2 | Src-family tyrosine kinase binding domain | | Velasco-Velazquez et al. 2003 |
| HRAS | GDP binding domain | | GDP | | | Macaluso et al. 2002 |
| HRAS | SMALL_GTP | TIGR00231 | GTP | | | Macaluso et al. 2002 |
| HRAS | PI3K-p110 binding domain | | PIK3CA | PI3K_RBD | PF00794 | Vanhaesebroeck and Waterfield 1999; Cantley 2002 |
| IGHG3 | Fc-gamma receptor binding domain | | FCGR1A | IG | PF00047 | Gu et al. 2003 |
| ITGAM | VWA | PF00092 | C3 | | | Velasco-Velazquez et al. 2003 |
| ITGB2 | CD11b binding domain | | ITGAM | CD18 binding domain | | Velasco-Velazquez et al. 2003 |
| ITGB2 | Src-family tyrosine kinase binding domain | | LYN | CD18 binding domain | | Velasco-Velazquez et al. 2003 |
| LYN | FcgR binding | | FCGR1A | Lyn binding | | Gu et al. 2003 |
| NCF4 | PX | PF00787 | PI3P | | | Stephens, Ellson, and Hawkins 2002 |
| NFKB1 | RHD | | RELA | RHD | PF00554 | Hayden and Ghosh 2004 |
| NFKB2 | RHD | PF00554 | RELB | RHD | PF00554 | Hayden and Ghosh 2004 |
| NFKBIA | NF-kB binding | | RELA | RHD for IkB binding | | Hayden and Ghosh 2004 |
| PDPK1 | PH_RELATED | SSF50729 | PIP3 | binding site for PH domain | | Cantley 2002 |
| PIK3C3 | PI kinase | | ManLAM | | | Fratti et al. 2001 |
| PIK3CA | PI3K_P85B | PF02192 | PIK3R1 | p110-binding domain | | Wymann, Zvelebil, and Laffargue 2003 |
| PIK3R4 | Vps34p-binding | | PIK3C3 | p150-binding | | Vanhaesebroeck and Waterfield 1999 |
| PIK3R4 | WD40 | | RAB5A | p150 binding | | Murray et al. 2002 |
| PSCD3 | PH | PF00169 | PIP3 | binding site for PH domain | | Cantley et al. 2002 |
| PXN | integrin-binding domain | | ITGB2 | Cytoskeleton protein binding domain | | Velasco-Velazquez et al. 2003 |
| RAB5A | PIP3 binding domain | | PIP3 | | | Stephens, Ellson, and Hawkins 2002 |
| RAB5A | GDP-binding | | GDP | | | Murray et al. 2002; Lanzetti et al. 2004; Tall et al. 2001 |
| RAB5A | SMALL_GTP | TIGR00231 | GTP | | | Murray et al. 2002; Lanzetti et al. 2004; Tall et al. 2001 |
| RIN1 | Rab5 GEF | | RAB5A | GEF binding | | Tall et al. 2001 |
| SLC2A4 | PIP3 binding domain | | PIP3 | | | Wymann, Zvelebil, and Laffargue 2003 |
| SOS1 | RASGEF | PF00617 | HRAS | GEF binding domain | | Wymann, Zvelebil, and Laffargue 2003 |
| SOS1 | Proline-rich | | GRB2 | SH3 | PF00018 | Wymann, Zvelebil, and Laffargue 2003 |
| TLN1 | integrin-binding domain | | ITGB2 | Cytoskeleton protein binding domain | | Velasco-Velazquez et al. 2003 |
| TLR2 | pYxxM | | PIK3R1 | SH2 | PF00017 | Arbibe et al. 2000 |
| TLR2 | CD14 binding domain | | CD14 | TLR2 binding domain | | Muta and Takeshige 2001 |
| YWHAB | phosphoserine binding domain | | BAD | S99 | | Cantley 2002 |

Appendix C2. Non-covalent interactions in the macrophage model. The table lists the non-covalent event prototypes that were incorporated in the pathway model. Column 1: name of the binding molecule A (gene locus names for proteins). Column 2: domain of molecule A that participates in the interaction. Column 3: accession number for annotated domains of molecule A (abbreviations: PF, Pfam; PS, PROSITE; TIGR, TIGRFAMs; SSF, SUPERFAMILY). Column 4: name of the binding molecule B (gene locus names for proteins). Column 5: domain of molecule B that participates in the interaction. Column 6: accession number for annotated domains of molecule B. Column 7: literature that supports the interaction.

C3 - Covalent interactions in the macrophage model

| Enzyme | Enzyme's domain | Enzyme's domain accession number | Substrate | Substrate's site | Substrate's site state (before) | Product | Product's site | Product's site state (after) | Reference |
|---|---|---|---|---|---|---|---|---|---|
| AKT1 | PROTEIN_KINASE_ST | PS00108 | MAP3K8 | S400 | Not-phosphorylated | MAP3K8 | S400 | Phosphorylated | Kane et al. 2002 |
| AKT1 | PROTEIN_KINASE_ST | PS00108 | BAD | S118 | Not-phosphorylated | BAD | S118 | Phosphorylated | Datta et al. 1997 |
| AKT1 | PROTEIN_KINASE_ST | PS00108 | GSK3B | S9 | Not-phosphorylated | GSK3B | S9 | Phosphorylated | Wymann, Zvelebil, and Laffargue 2003 |
| AKT1 | PROTEIN_KINASE_ST | PS00108 | BAD | S99 | Not-phosphorylated | BAD | S99 | Phosphorylated | Datta et al. 1997 |
| CDC2 | PROTEIN_KINASE_ST | PS00108 | RB1 | S807 | Not-phosphorylated | RB1 | S807 | Phosphorylated | Shapiro and Harper 1999 |
| CDC25A | RHODANESE | PF00581 | CDC2 | Y15 | Phosphorylated | CDC2 | Y15 | Not-phosphorylated | Kumagai and Dunphy 1991 |
| CDK7 | PROTEIN_KINASE_ST | PS00108 | CDC2 | T161 | Not-phosphorylated | CDC2 | T161 | Phosphorylated | Pavletich 1999 |
| CHUK | PROTEIN_KINASE_ST | PS00108 | NFKB2 | S | Not-phosphorylated | NFKB2 | S | Phosphorylated | Hayden and Ghosh 2004 |
| GSK3B | PROTEIN_KINASE_ST | PS00108 | CCND1 | S | Not-phosphorylated | CCND1 | S | Phosphorylated | Cantley 2002 |
| LYN | PROTEIN_KINASE_TYR | PS00109 | GAB2 | Y | Not-phosphorylated | GAB2 | Y | Phosphorylated | Gu et al. 2003 |
| MAP3K14 | PROTEIN_KINASE_ST | PS00108 | CHUK | S180 | Not-phosphorylated | CHUK | S180 | Phosphorylated | Hayden and Ghosh 2004 |
| MAP3K8 | PROTEIN_KINASE_ST | PS00108 | MAP3K14 | S | Not-phosphorylated | MAP3K14 | S | Phosphorylated | Lin et al. 1999 |
| PDPK1 | PKINASE | PF00069 | AKT2 | T | Not-phosphorylated | AKT2 | T | Phosphorylated | Downward 2004 |
| PDPK1 | PKINASE | PF00069 | RPS6KB1 | T389 | Not-phosphorylated | RPS6KB1 | T389 | Phosphorylated | Cantley 2002 |
| PDPK1 | PKINASE | PF00069 | AKT1 | T308 | Not-phosphorylated | AKT1 | T308 | Phosphorylated | Cantley 2002 |
| PIK3C3 | PI kinase | | PI | | | PI3P | | | Vanhaesebroeck et al. 2001 |
| PIK3CA | PI3_PI4_KINASE | PF00454 | PIP2 | | | PIP3 | | | Vanhaesebroeck and Waterfield 1999 |

Appendix C3. Covalent interactions in the macrophage model. The table lists the covalent event prototypes that were incorporated in the pathway model. Column 1: name of the enzyme. Column 2: active site or catalytic domain of the enzyme. Column 3: accession number for annotated domains of the enzyme (abbreviations: PF, Pfam; PS, PROSITE). Column 4: name of the substrate. Column 5: modification site of the substrate. Column 6: phosphorylation state before the covalent interaction. Column 7: name of the product. Column 8: modification site of the product. Column 9: phosphorylation state after the covalent interaction. Column 10: literature that supports the interaction.
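
As an illustration of how one row of Appendix C3 can be read, the following minimal sketch treats a covalent interaction as a state transition on a substrate site. It is not the thesis implementation: the class `CovalentInteraction`, the function `apply_interaction` and the dictionary layout are hypothetical names chosen for this example, and only the row values (PDPK1 acting on AKT1 T308, CDC25A acting on CDC2 Y15) are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CovalentInteraction:
    """One row of Appendix C3 (hypothetical representation)."""
    enzyme: str
    substrate: str
    site: str
    state_before: str
    state_after: str


def apply_interaction(site_states, interaction):
    """Return a new state map with the interaction applied.

    The map is returned unchanged when the substrate site is not in
    the 'before' state that the interaction requires.
    """
    key = (interaction.substrate, interaction.site)
    if site_states.get(key) != interaction.state_before:
        return site_states
    updated = dict(site_states)
    updated[key] = interaction.state_after
    return updated


# Row values copied from Appendix C3.
PDPK1_AKT1 = CovalentInteraction(
    "PDPK1", "AKT1", "T308", "Not-phosphorylated", "Phosphorylated")
CDC25A_CDC2 = CovalentInteraction(
    "CDC25A", "CDC2", "Y15", "Phosphorylated", "Not-phosphorylated")

states = {("AKT1", "T308"): "Not-phosphorylated",
          ("CDC2", "Y15"): "Phosphorylated"}
states = apply_interaction(states, PDPK1_AKT1)   # kinase adds the phosphate
states = apply_interaction(states, CDC25A_CDC2)  # phosphatase removes it
```

Keeping the "before" state as an explicit precondition means that a kinase and a phosphatase acting on the same site can both be stored in the same form, differing only in the direction of the transition.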
C4 - Allosteric regulations in the macrophage model

| Protein in condition | Domains required | States required | Protein in response | Domains affected | State changed to | Reference |
|---|---|---|---|---|---|---|
| AKT1 | T308 | Phosphorylated | AKT1 | PROTEIN_KINASE_ST | Func. for cov. | Downward 2004 |
| ARF6 | GDP binding domain | Bound | ARF6 | GTP binding domain | Non-func. for non-cov. | Cantley 2002; Stephens, Ellson, and Hawkins 2002 |
| ARF6 | Grp1 binding domain | Bound | ARF6 | GDP binding domain | Non-func. for non-cov. | Cantley 2002 |
| BAD | S118 | Phosphorylated | BAD | BH3 | Non-func. for non-cov. | Cantley 2002; Datta et al. 1997 |
| BAD | S99 | Bound, Phosphorylated | BAD | S118 | Func. for cov. | Datta et al. 1997 |
| CDC2 | Cyclin binding, Y15, T161 | Bound, Not-phosphorylated, Phosphorylated | CDC2 | PROTEIN_KINASE_ST | Func. for cov. | Pavletich 1999; Kumagai and Dunphy 1991 |
| CHUK | S180 | Phosphorylated | CHUK | PROTEIN_KINASE_ST | Func. for cov. | Hayden and Ghosh 2004 |
| FCGR1A | IG | Bound | FCGR1A | Lyn binding | Func. for non-cov. | Gu et al. 2003 |
| FGR | CD18 binding domain | Bound | FGR | PROTEIN_KINASE_TYR | Func. for cov. | Velasco-Velazquez et al. 2003 |
| GAB2 | Y | Phosphorylated | GAB2 | Y | Func. for non-cov. | Wymann, Zvelebil, and Laffargue 2003; Gu et al. 2003 |
| GSK3B | S9 | Phosphorylated | GSK3B | PROTEIN_KINASE_ST | Non-func. for cov. | Wymann, Zvelebil, and Laffargue 2003 |
| HRAS | GDP binding domain | Bound | HRAS | SMALL_GTP | Non-func. for non-cov. | Macaluso et al. 2002 |
| HRAS | SMALL_GTP | Bound | HRAS | PI3K-p110 binding domain | Func. for non-cov. | Wymann, Zvelebil, and Laffargue 2003 |
| HRAS | GEF binding domain | Bound | HRAS | GDP binding domain, SMALL_GTP | Non-func. for non-cov., Func. for non-cov. | Cantley 2002; Wymann, Zvelebil, and Laffargue 2003 |
| ITGAM | VWA | Bound | ITGB2 | Src-family tyrosine kinase binding domain | Func. for non-cov. | Velasco-Velazquez et al. 2003 |
| LYN | CD18 binding domain | Bound | LYN | PROTEIN_KINASE_TYR | Func. for cov. | Velasco-Velazquez et al. 2003 |
| LYN | FcgR binding | Bound | LYN | PROTEIN_KINASE_TYR | Func. for cov. | Gu et al. 2003 |
| MAP3K14 | S | Phosphorylated | MAP3K14 | PROTEIN_KINASE_ST | Func. for cov. | Lin et al. 1999 |
| MAP3K8 | S400 | Phosphorylated | MAP3K8 | PROTEIN_KINASE_ST | Func. for cov. | Kane et al. 2002 |
| PIK3C3 | PI kinase | Bound | PIK3C3 | PI kinase | Non-func. for cov. | Fratti et al. 2001 |
| PIK3CA | PI3K_P85B | Bound | PIK3CA | PI3_PI4_KINASE | Non-func. for cov. | Vanhaesebroeck and Waterfield 1999 |
| PIK3CA | PI3K_RBD | Bound | PIK3CA | PI3_PI4_KINASE | Func. for cov. | Vanhaesebroeck and Waterfield 1999; Cantley 2002 |
| PIK3R1 | SH2 | Bound | PIK3CA | PI3_PI4_KINASE | Func. for cov. | Vanhaesebroeck and Waterfield 1999 |
| RAB5A | SMALL_GTP | Bound | RAB5A | p150 binding | Func. for non-cov. | Murray et al. 2002 |
| RAB5A | GDP-binding | Bound | RAB5A | SMALL_GTP | Non-func. for non-cov. | Murray et al. 2002 |
| RAB5A | GEF binding | Bound | RAB5A | GDP-binding | Non-func. for non-cov. | Tall et al. 2001 |
| RELA | RHD for IkB binding | Bound | RELA | Nuclear localization sequence (NLS) | Non-func. for non-cov. | Hayden and Ghosh 2004 |

Appendix C4. Allosteric regulations in the macrophage model. The table lists the allosteric regulation prototypes that were incorporated in the pathway model. Column 1: name of the protein involved in the condition events. Column 2: domains and sites of the protein required for the conditions. Column 3: states (binding or phosphorylation states) required for the conditions. Column 4: name of the protein affected by the response events. Column 5: domains and sites of the protein affected by the response. Column 6: conformational states produced by the response ("Func." and "Non-func." indicate whether the domain is functional for covalent, "cov.", or non-covalent, "non-cov.", interactions). Column 7: literature that supports the allosteric regulation.
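
The condition/response structure of Appendix C4 can be sketched as a small rule check: a regulation fires only when every one of its required site states is present. This is a hedged illustration under assumed names (`AllostericRegulation`, `evaluate` and the state dictionaries are hypothetical and do not come from the thesis software); the two rules shown copy the CDC2 and GSK3B rows of the table.

```python
from dataclasses import dataclass

FUNC_COV = "Func. for cov."
NONFUNC_COV = "Non-func. for cov."


@dataclass(frozen=True)
class AllostericRegulation:
    """One row of Appendix C4 (hypothetical representation)."""
    conditions: tuple  # ((protein, domain_or_site, required_state), ...)
    response: tuple    # (protein, domain_affected, state_changed_to)


REGULATIONS = [
    # CDC2 kinase domain is functional only when all three conditions hold.
    AllostericRegulation(
        conditions=(("CDC2", "Cyclin binding", "Bound"),
                    ("CDC2", "Y15", "Not-phosphorylated"),
                    ("CDC2", "T161", "Phosphorylated")),
        response=("CDC2", "PROTEIN_KINASE_ST", FUNC_COV)),
    # Phosphorylation of S9 switches the GSK3B kinase domain off.
    AllostericRegulation(
        conditions=(("GSK3B", "S9", "Phosphorylated"),),
        response=("GSK3B", "PROTEIN_KINASE_ST", NONFUNC_COV)),
]


def evaluate(regulations, site_states):
    """Map (protein, domain) to the state set by every satisfied regulation."""
    domain_states = {}
    for reg in regulations:
        if all(site_states.get((p, s)) == st for p, s, st in reg.conditions):
            protein, domain, new_state = reg.response
            domain_states[(protein, domain)] = new_state
    return domain_states


observed = {("CDC2", "Cyclin binding"): "Bound",
            ("CDC2", "Y15"): "Not-phosphorylated",
            ("CDC2", "T161"): "Phosphorylated",
            ("GSK3B", "S9"): "Phosphorylated"}
result = evaluate(REGULATIONS, observed)
```

If CDC2 Y15 were still phosphorylated, the first rule would not fire and the CDC2 kinase domain would simply be absent from the result, which mirrors the role of CDC25A in Appendix C3.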
C5 - Cellular responses and their conditions in the macrophage model

| Cellular response | Molecule in condition | Domain/Site involved | Binding state | Phosphorylation state | Reference |
|---|---|---|---|---|---|
| Actin polymerization and rearrangement | ARF6 | GTP binding domain | Bound | | Cantley 2002 |
| Cell cycle entry - S phase | CCND1 | phospho-S or T | | Not-phosphorylated | Cantley 2002; Wymann, Zvelebil, and Laffargue 2003 |
| Cell survival | BCL2 | BH3 | Not Bound | | Cantley 2002; Wymann, Zvelebil, and Laffargue 2003 |
| Intracellular glucose uptake | SLC2A4 | PIP3 binding domain | Bound | | Wymann, Zvelebil, and Laffargue 2003 |
| Membrane delivery to plasma membrane | ARF6 | GTP binding domain | Bound | | Stephens, Ellson, and Hawkins 2002 |
| Phagosome and lysosome fusion | EEA1 | FYVE | Bound | | Fratti et al. 2001 |
| Protein synthesis | RPS6KB1 | Threonine phosphorylation site | | Phosphorylated | Cantley 2002; Wymann, Zvelebil, and Laffargue 2003 |
| Recruitment of oxidase complex to phagosome | NCF4 | PX | Bound | | Stephens, Ellson, and Hawkins 2002 |

Appendix C5. Cellular responses and their conditions in the macrophage model. The table lists the cellular response prototypes that were incorporated in the pathway model. Column 1: name of the cellular response. Column 2: molecule that is required as the condition to induce the cellular response. Column 3: domain or site of that molecule involved in the condition. Column 4: binding state required for the condition. Column 5: phosphorylation state required for the condition. Column 6: literature that supports the cellular response and its condition.
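
To show how the two state columns of Appendix C5 might be used together, the sketch below reports which cellular responses are induced by a given set of binding and phosphorylation states. It rests on assumptions that are not stated in the thesis: the names `ResponseCondition` and `induced_responses` are hypothetical, an empty table cell is read as "no requirement", and a site with no recorded state is assumed to be unbound and not phosphorylated.

```python
from typing import NamedTuple, Optional


class ResponseCondition(NamedTuple):
    """One row of Appendix C5 (hypothetical representation)."""
    response: str
    molecule: str
    site: str
    binding_state: Optional[str]   # None: the table cell is empty
    phospho_state: Optional[str]   # None: the table cell is empty


# Three rows copied from Appendix C5.
CONDITIONS = [
    ResponseCondition("Cell survival", "BCL2", "BH3", "Not Bound", None),
    ResponseCondition("Phagosome and lysosome fusion",
                      "EEA1", "FYVE", "Bound", None),
    ResponseCondition("Protein synthesis", "RPS6KB1",
                      "Threonine phosphorylation site", None, "Phosphorylated"),
]


def induced_responses(conditions, binding, phospho):
    """List the responses whose condition is met.

    Assumption: sites missing from the maps default to 'Not Bound'
    and 'Not-phosphorylated'.
    """
    induced = []
    for c in conditions:
        key = (c.molecule, c.site)
        binding_ok = (c.binding_state is None
                      or binding.get(key, "Not Bound") == c.binding_state)
        phospho_ok = (c.phospho_state is None
                      or phospho.get(key, "Not-phosphorylated") == c.phospho_state)
        if binding_ok and phospho_ok:
            induced.append(c.response)
    return induced


# EEA1 FYVE bound to PI3P; nothing else recorded.
resting = induced_responses(CONDITIONS, {("EEA1", "FYVE"): "Bound"}, {})
```

Under these assumptions, blocking the EEA1 FYVE interaction removes "Phagosome and lysosome fusion" from the output, the kind of readout the model is meant to give when a pathway component is disabled.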
