Deciphering multi-state mobility within single particle trajectories of proteins on the plasma membrane by Jennifer S. Morrison B.Sc., Simon Fraser University, 2008 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in The Faculty of Graduate Studies (Mathematics) THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver) August 2010 c Jennifer S. Morrison 2010 Abstract Single particle tracking is a powerful technique often used in the study of dynamic mechanisms on the cell surface such as binding, confinement and trafficking. Experimental trajectories can be used to detect changes in the lateral mobility of individual molecules over time and space. Therefore, a potential problem in the analysis of single particle trajectories is to account for transitions between modes of mobility. Here we present two coupled statistical methods which characterize particle mobility that is temporally and spatially heterogeneous. The first method detects periods of drift diffusion or reduced mobility within single trajectories due to transient associations with other biomolecules. The second locates spatial domains which have higher or lower concentrations of these associating molecules. The trajectory is modeled as the outcome of a two-state Hidden Markov model parameterized by the diffusion coefficients and drift velocities of each state and the rates of transitions between them (which may change in space). Transitions between states arise from association and disassociation with a binding partner, either membrane-associated or cytosolic. These associations lead to either reduced Brownian diffusion or drift diffusion. An adapted Markov chain Monte Carlo algorithm was used to optimize parameters and simultaneously select the most favorable model of lateral mobility (transient reduced mobility or transient drift diffusion) and to locate spatial domains. Analysis of simulated particle tracks with a wide range of parameters successfully distinguished between the two models, gave accurate estimates for parameters and accurately located spatial domains. ii Preface My supervisor, Daniel Coombs, and his postdoctoral researcher, Raibatak Das, have been involved in all areas of this thesis. They provided feedback, edited manuscripts, assisted with computational difficulties and gave advice in interpreting results. I performed all computations and simulations in Matlab or C, produced all plots in Matlab, and designed and illustrated all schematics. Chapter 2 – The single particle tracking section of chapter 2 is modified from a book chapter that I co-authored with Daniel Coombs and Raibatak Das. I performed the required research and wrote the first copy of this section and Daniel Coombs edited the manuscript. The content in the remainder of the chapter was written by me. I designed and created all figures in this chapter. Chapter 3 – Daniel Coombs and Raibatak Das initially developed a hidden Markov model to detect transient reduced mobility in single particle tracks. They also proposed that the model should be extended to include directed motion. I designed and parametrized this extended hidden Markov Model and statistical method for detecting transient directed motion. I also wrote all Matlab and C code and simulations used to produce all the figures. I also wrote the manuscript while Daniel Coombs and Raibatak Das edited it. Chapter 4 – I proposed to extend the hidden Markov model to detect spatial domains. I also designed and parametrized the layered hidden Markov Model and statistical analysis for identifying domains. Together with Daniel Coombs, I designed some simulations on which to test the method and came up with a way to measure the size of domains. I wrote all Matlab and C code and simulations used to produce all the figures. I wrote the manuscript and Daniel Coombs edited it. A version of Chapter 2 is being prepared for publication as part of a chapter in a text book entitled “Cellular Domains”. A manuscript containing versions of Chapters 3 and 4 is being prepared for publication in a peer-reviewed journal. iii Table of Contents Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii List of Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix List of Abbreviations and Symbols . . . . . . . . . . . . . . . . . . . . . x Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1 Brief chapter summary . . . . . . . . . . . . . . . . . . . . . . . 2 Biological Background . . . . . . . . . . . . . . . . . . . . . . . . 2.1 Review of plasma membrane organization and protein diffusion 2.1.1 Cytoskeleton picket fence model . . . . . . . . . . . . 2.1.2 Lipid rafts . . . . . . . . . . . . . . . . . . . . . . . . 2.2 Single particle tracking . . . . . . . . . . . . . . . . . . . . . . 2.2.1 Theoretical basics . . . . . . . . . . . . . . . . . . . . 2.2.2 Analysis of single particle tracks . . . . . . . . . . . . 2.2.3 Models of plasma membrane domains studied with SPT 2.2.4 Consequences for T cell signaling . . . . . . . . . . . . . . . . . . . . . 1 1 3 3 4 5 8 8 12 14 17 iv Table of Contents 3 Deciphering Two-State Mobility in Single Particle Trajectories . . 3.1 Theory - hidden Markov model of single particle tracks . . . . . 3.2 Materials and methods . . . . . . . . . . . . . . . . . . . . . . 3.2.1 Statistical analysis of particle trajectories . . . . . . . . 3.2.2 Generation of simulated single particle tracks . . . . . . 3.2.3 LFA-1 and CD45 labeling and single particle trajectories 3.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.1 Method verification with simulated data . . . . . . . . . 3.3.2 Analysis of LFA-1 and CD45 particle trajectories . . . 3.3.3 Study of simulated confined trajectories . . . . . . . . . 3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Extensions to the Hidden Markov Model: Detection of Spatial Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.1 First layer of the hidden Markov model - identifying raft domains . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.2 Second layer of the hidden Markov model - identifying ligand-enriched domains . . . . . . . . . . . . . . . . . . 4.2 Materials and methods . . . . . . . . . . . . . . . . . . . . . . . 4.2.1 Parameter optimization . . . . . . . . . . . . . . . . . . 4.2.2 Generation of simulated single particle tracks . . . . . . . 4.2.3 Estimating areas of spatial domains . . . . . . . . . . . . 4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 . . . . . . . . . . . 18 19 21 21 26 27 27 27 34 38 39 42 44 44 46 48 48 51 51 52 56 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.1 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 63 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 Appendices A Tables Containing Results from Analysis of Simulated Data . . . . 75 B Algorithms Used in the Generation of Simulated Data and Statistical Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 v List of Tables 3.1 Results from two-state HMM analysis of LFA-1 and CD45 trajectories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 A.1 Accuracy of parameter estimates and track segmentation for simulated tracks with TRM . . . . . . . . . . . . . . . . . . . . . . . 76 A.2 Accuracy of parameter estimates and track segmentation for simulated tracks with TDM . . . . . . . . . . . . . . . . . . . . . . . 77 A.3 Accuracy of parameter estimates and track segmentation for simulated tracks within thin channels . . . . . . . . . . . . . . . . . . 78 A.4 HMM analysis of simulated tracks within raft domains . . . . . . 79 A.5 HMM analysis of simulated tracks within lipid-enriched domains . 80 vi List of Figures 2.1 Schematic illustration of the cytoskeleton picket fence model . . . 2.2 Illustrating the differences between raft and non-raft regions in the 6 membrane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.3 Schematic drawing of an SPT experiment . . . . . . . . . . . . . 9 2.4 Three models of molecular movement and their resulting MSD versus time plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.5 11 MSD is a global measurement and cannot detect transient changes in mobility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 3.1 Schematic illustration of the two-state hidden Markov model . . . 22 3.2 Flowchart showing our modified Metropolis Hastings MCMC algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3 Example of the parameter optimization method for a simulated trajectory with TRM . . . . . . . . . . . . . . . . . . . . . . . . . . 3.4 30 Error in parameter estimates for simulated tracks with TRM and varying transition probabilities . . . . . . . . . . . . . . . . . . . 3.6 29 Example of the parameter optimization method for a simulated trajectory with TDM . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5 26 31 Errors in parameter estimates for simulated tracks with TDM and a range of transition probabilities . . . . . . . . . . . . . . . . . . 32 3.7 Model performance over different track lengths . . . . . . . . . . 34 3.8 Accuracy of model prediction with experimental noise added to simulated tracks . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.9 35 Results from analysis of simulated tracks with TDM and varying transition probabilities using the method presented in Arcizet et al. 36 vii List of Figures 3.10 Examples of experimental tracks of labelled CD45 and LFA-1 single particles on live T cells which have channel-like trajectories . . 38 3.11 An example of a single particle track confined within a thin channel segmented into two-state with HMM analysis . . . . . . . . . . . 39 4.1 Schematic of the HMMs for spatial domain detection . . . . . . . 45 4.2 The effects of the user-defined support size and the decay exponent on error in domain identification . . . . . . . . . . . . . . . . . . 4.3 48 Flowchart showing the parameter estimation algorithm for both layers of the HMM. . . . . . . . . . . . . . . . . . . . . . . . . . 49 4.4 Example of the method used for estimating the size of spatial domains 53 4.5 Typical results from the MCMC parameter optimization and track segmentation for a simulated trajectory with lipid rafts . . . . . . 4.6 54 Typical results of two-layered MCMC parameter optimization and track segmentation for a simulated trajectory within square ligand enriched domains . . . . . . . . . . . . . . . . . . . . . . . . . . 4.7 Layered MCMC analysis on trajectories with varying domain sizes, barrier permeability and ligand concentrations . . . . . . . . . . . 4.8 58 Error in layer two state sequences and estimated domain sizes for varying domain sizes and ligand concentrations . . . . . . . . . . 4.9 55 59 Results from the layered MCMC analysis on trajectory with nonsquare ligand enriched domains . . . . . . . . . . . . . . . . . . . 60 4.10 Track length versus error in parameter estimates for trajectories within ligand enriched domains . . . . . . . . . . . . . . . . . . . 61 viii List of Programs B.1 Algorithm for generating a two-state Markov chain for given diffusion coefficients D1 and D2 , velocities v1 and v2 , and transition rates P12 and P21 . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 B.2 Forward algorithm for calculating log likelihood of parameter values θ of a two-state HMM for a given trajectory observation O=o1 , ...oN . In layer 1, oj = (xj , yj ) and in layer 2, oj = (sj−1 , sj ). . . . . . . ˆ = B.3 Backward algorithm for identifying the most likely states Q 82 sˆ1 , ...ˆ sN of the particle for a given trajectory observation O=o1 , ...oN and estimated parameters θˆ including the transition probabilities Pˆ12 and Pˆ21 . This algorithm can be used in either layer of the HMM. 82 B.4 Algorithm for classifying states in the HMM as diffusive or directed given an observed trajectory O, estimated parameters θˆ and ˆ . . . . . . . . . . . . . . . . . . . an estimated state sequence Q. 82 B.5 MCMC likelihood-based parameter estimation algorithm for estimating the posterior distribution P (θ|O) for the model parameters θ and observation from a single particle trajectory O. This method can be used in either layer of the HMM. . . . . . . . . . . . . . . 83 ix List of Abbreviations and Symbols (xi , yi ) a particle’s displacement in the ith frame of a single particle track [Lr ] equilibrium ligand concentration in a poor region [Lr ] equilibrium ligand concentration in a rich region Dmacro macroscopic diffusion coefficient Dmicro microscopic diffusion coefficient θˆ an estimate of an arbitrary model parameter θ koff first order unbinding constant for ligand or motor binding reaction kon forward rate constant for ligand or motor binding reaction xi position of the particle after i frames i (θ oj ) log likelihood of the observation oj in frame j given the model parameters θ and that the particle is in state i L(θ oj ) likelihood of the observation oj in frame j given the model parameters θ and that the particle is in state i φs direction of advection velocity for state s Pjump probability that a particle jumps over a barrier in a simulated trajectory x List of Abbreviations and Symbols P12p probability of the particle transitioning from an unbound to a bound state in a poor region in one frame P12r probability of the particle transitioning from an unbound to a bound state in a rich region in one frame Prp probability of the particle transitioning from a rich region to a poor region in one frame τ time between frames Ok observed outcome in the kth layer of the HMM; k is one or two Qk state sequence in the kth layer of the HMM; k is one or two θk set of parameters in the kth layer of the HMM; k is one or two A user-defined maximum distance away from xi used in the calculation of Wi in layer two of the HMM α user-defined decay exponent used in the calculation of Wi in layer two of the HMM Ds diffusion coefficient of state s with units µms−1 Dnon-raft diffusion coefficient outside a raft domain Draft diffusion coefficient inside a raft domain N Number of frames in a trajectory Pij fixed probability that the particle switches from state i to j in one frame of layer 1 si a particle’s state in the ith frame in the first layer of the HMM ti a particles state in the ith frame in the second layer of the HMM vs magnitude of advection velocity for state s with units µms−1 xi List of Abbreviations and Symbols Wi weighted average of diffusive states near the particle at frame i; dependent on α and A CD45 protein tyrosine phosphatase receptor C cytoD cytochalasin D DMSO buffer with control FCS fluorescence correlation spectroscopy FRAP fluorescence recovery after photobleaching FRET fluorescence resonance energy transfer GPI glycosylphosphatidylinositol HMM hidden Markov model LFA-1 lymphocyte function-associated antigen 1 MCMC Markov chain Monte Carlo MH Metropolis Hastings MSD mean squared displacement PMA phorbol-12-myristate-13-acetate SPT single particle tracking TDM transient directed motion TRM transient reduced mobility xii Acknowledgements First, thank you to my supervisor Daniel Coombs who continually provided excellent guidance and support throughout my time at UBC. I also thank Dodo Das for all his helpful feedback and advice. Thank you to colleagues and collaborators at the University of Alberta for helpful discussions. I sincerely appreciate the welcoming, simulating and supportive community created by the Math Biology group. I thank the Mathematics department and IAM for providing amazing learning opportunities and facilities. I also thank the IAM faculty for their inspirational teaching. I would like to acknowledge financial support from the Natural Sciences and Engineering Research Council, UBC and the International Graduate Training Center in Mathematical Biology. Finally, thank you to my friends and family for all their support and encouragement. xiii Dedication For my mom, Susanne Morrison. xiv Chapter 1 Introduction The plasma membrane is the barrier between a cell’s interior and exterior environment. As such, it serves a fundamental role in many critical cell functions including cell-cell interactions, cell motility and signal transduction. These tasks are carried out through a tremendously complex network of interacting proteins and lipids that reside on or near the cell surface. Often, associations between surface molecules cannot be studied directly. However, their interactions may be interpreted through changes in their mobility over time or space. Conversely, the lateral mobility of membrane components can also have a powerful influence over how and where these associations occur. Therefore, understanding and quantifying mobility is crucial in understanding membrane function and organization. In this thesis we develop a mathematical model of a particle’s trajectory within the plasma membrane. The model encompasses switches between different states of mobility throughout the trajectory, which we attribute to binding events or changes in the particle’s spatial environment. The overall goal is to quantify these state switches within single particle tracks and, in doing so, gain important insight about the particle’s associations or spatial membrane structure. 1.1 Brief chapter summary Chapter 2 presents biological background about plasma membrane organization and the resulting molecular movement within it. We focus on heterogeneity of mobility in the membrane. Single particle tracking, a powerful fluorescence microscopy technique used to study particle movement, is also reviewed. In Chapter 3, we develop a mathematical model and statistical method for deciphering multi-state mobility in single particle trajectories. From observing a particle’s path, our method is able to distinguish between when it is bound with another 1 1.1. Brief chapter summary molecules or just freely diffusing. Depending on what type of molecule the particle associates with, its bound form can either undergo reduced diffusion or directed diffusion. Our method is able to discriminate between these two types of mobility. We confirm the accuracy of our analysis using simulated data and present the results from the analysis of two proteins on the surface of T lymphocytes. In Chapter 4 we propose two extensions to the method developed in the previous chapter. The original model can be extended to detect two classes of spatial domains in the plasma membrane. Namely, structures exhibiting properties of lipid rafts and those which are enriched in membrane components that interact with the particle of interest. We perform a rigorous analysis of simulated data to exemplify the method’s ability to quantify these domains. Chapter 5 concludes the thesis with a summary of our main results along with possible directions for future work. 2 Chapter 2 Biological Background Approximately one third of all proteins in the cell are associated with the plasma membrane and operate in a complex network in order to perform a variety of vital operations — signal transduction, cell motility, cell activation etc [44]. The structure and architecture of the membrane influences the lateral mobility of its constituents and therefore affects the molecular associations that are instrumental to its many functions. This chapter will provide a review of plasma membrane organization and the lateral mobility of proteins and lipids within it. Fluorescence microscopic techniques that observe molecular movement on the cell surface, such as fluorescence recovery after photobleaching (FRAP), fluorescence correlation spectroscopy (FCS) , fluorescence resonance energy transfer (FRET) and single particle tracking (SPT), have been valuable in the advancement of this field. FRAP and FCS are both averaging methods which observe ensemble particle movements. SPT’s greatest strength lies in its ability to resolve subpopulations that tend to get smoothed out in the ensemble methods. This technique has led to revolutionary advancements in the understanding of plasma membrane organization throughout the past decade. SPT and some of the most important experimental results from this method are reviewed in Section 2.2 of this chapter. In addition, I will present some mathematical models of lateral mobility in the membrane and how they can be applied to SPT data. 2.1 Review of plasma membrane organization and protein diffusion A pioneer model of the plasma membrane by Singer and Nicholson, called the fluid mosaic model, describes the membrane as a two-dimensional fluid wherein mem- 3 2.1. Review of plasma membrane organization and protein diffusion brane lipids and proteins diffuse freely and uniformly [67]. More recently, results from high resolution fluorescence microscopy techniques have led to new theories regarding the organization of the plasma membrane. The three major observations which contradict the fluid-mosaic model are as follows: • Molecules diffuse 5–50 times slower on live cells than on artificial membranes [34]. • Larger molecules diffuse slower than expected. According to Saffman and Delbruck, in a Singer-Nicholson membrane D ∝ ln(1/r) where r is the radius of the molecule and D is the diffusion coefficient [53]. However, SPT experiments by the Kusumi group and others show that larger molecules diffuse at a much slower rate than this relation predicts [27, 29, 41, 42, 51]. • Molecules undergo different types of motion besides pure diffusion such as confinement, anomalous diffusion and directed diffusion. Nowadays, a more widely accepted view of the membrane is that it is laterally segregated into dynamic spatial domains. Some well-known examples of large functional membrane domains are focal adhesions, synaptic junctions, coated pits, caveolae and cell-cell adhesion structures. Membrane heterogeneity can explain all three contradictions to the Singer-Nicholson model, mostly in the context of the two prevailing models of small scale membrane organization: the cytoskeleton fence and lipid rafts. These structures not only facilitate the formation of the larger domains but also effect the lateral mobility of all membrane-associated molecules. 2.1.1 Cytoskeleton picket fence model Standard fluorescence microscopy experiments will measure protein lateral diffusion after stabilizing or destabilizing the cytoskeleton. Observations from these experiments show that the cytoskeleton plays an important role in membrane particle mobility. The first group to show this correlation was the Kusumi group in their FRAP experiments with Band-3 (a erythrocyte anion channel protein): a reduction in spectrin oligomers (a cytoskeleton-associated protein) led to a decrease in the fraction of immobile Band-3 on the membrane [75]. Following this work, 4 2.1. Review of plasma membrane organization and protein diffusion they performed many SPT experiments (discussed in detail in Section 2.2) and devised the well-known picket fence model for the cytoskeleton. Their experiments indicated that proteins undergo what they termed as hop diffusion in the plasma membrane. In the short term, diffusion is confined within compartments with barriers consisting of the cortical cytoskeleton (fence) and associated proteins (pickets). These barriers temporarily confine the protein though steric hindrance and hydrophobic interactions. Through fluctuations in the barrier, the protein is able to escape and move to a neighboring compartment. Fig. 2.1 illustrates how a protein moves throughout the fence-like structure in the plasma membrae. An important feature of the cytoskeleton fence is that it is dynamic, allowing it to play a prominent part in a diverse array of cellular processes at the membrane. For example, cytoskeleton fence re-arrangement can orchestrate complex signaling processes by creating large-scale changes in protein distributions. This is crucial in the formation of the immunological synapse during lymphocyte activation [19, 74]. Also, cell polarization and motility are wholly dependent upon these dynamic cytoskeleton re-arrangements. The cytoskeleton can also act as a causeway on which cytoskeletal motors can traffic cellular components. For example, the directed movement of proteins and vesicles along actin filaments is thought to be an essential step in the release of neurotransmitters at the neural synapse [17]. Finally, the cytoskeleton has recently been shown to play a role in the formation of lipid rafts during cell signaling events (see next section) [77]. 2.1.2 Lipid rafts Cell membranes contain various classes of lipids and proteins that differ in their physico-chemical properties. The consequence of this is that the membrane contains spatial heterogeneity arising from lipid-lipid immiscibility. Some membrane phospholipids, such as sphingolipids, have long and highly saturated carbon-atom chains compared to others like glycerolphospholipids. As a result, they have stronger lateral cohesion and associate favorably with cholesterol to form tightly packed and ordered regions [33, 65]. Conversely, other regions of the membrane tend to adopt a more mobile and fluid phase. The components and organization of these two membrane regions are illustrated in Fig. 2.2. These ordered domains were 5 2.1. Review of plasma membrane organization and protein diffusion Plasma membrane cortical actin cytoskeleton actin-associated proteins trans-membrane protein protein trajectory Figure 2.1: Schematic illustration of the cytoskeleton picket fence model proposed by the Kusumi lab [34, 54]. Long-term lateral diffusion of proteins is reduced due to non-specific interactions with barriers made up of the cortical cytoskeleton (black lines) and its associated proteins (blue and orange). The protein undergoes short term confined diffusion within compartments with long term stochastic hops between compartments. first isolated using detergents such as Triton X-100 [7]. The detergent-resistant domains that were extracted also retained a set of key proteins involved in signal transduction, suggesting that these domains play a crucial role in signaling [77]. Examples of these ordered phase preferring proteins include glycosylphosphatidylinositol (GPI)-anchored proteins, heterotrimeric G proteins and Src-family tyrosine kinases such as Lyn [50]. On the basis of these observations, the raft hypothesis was developed by Simons and Ikonin which proposes that sphingolipids and cholesterol mediate the formation of ordered lipid clusters called rafts which attract specific proteins thereby mediating signal transduction [64]. Despite much evidence supporting the existence of raft domains, the size and the functions of these domains are still debated [77]. The controversy mainly arises because these domains are too small and transient to be optically resolved. However, recent advances in microscopy and imaging are now providing new insights about their properties [20, 46, 62, 70, 72]. Many single particle tracking experiments have contributed to these advances and some are discussed in detail in Sec6 2.1. Review of plasma membrane organization and protein diffusion A non-raft trans-membrane protein glycerophospholipid (unsaturated) B glycerophospholipid (saturated) sphingomyelin/ glycospingolipids GPI-anchored protein raft-associated protein cholesterol Figure 2.2: Illustrating the differences between raft (B) and non-raft (A) regions of the membrane. Rafts contain a high concentration of sphingomyelin, glycosphingolipids, cholesterol and GPI-anchored proteins. Glycerophospholipids in rafts have saturated fatty acid tails which are straighter and fit more favorably next to cholesterol compared to those with bulky unsaturated tails. Overall, rafts are more ordered and tightly packed compared to non-raft sections of the membrane. tion 2.2. Measurements of raft size range from 5 to 1000 nanometers while lifetime estimates are highly varied and range from microseconds to hours [33]. It is believed that the smaller, more transient rafts have too few proteins to actively be involved in signalling. However, upon cellular activation, rafts can coalesce into larger signaling structures. This process is facilitated through interactions with the actin cytoskeleton [33, 77]. Proteins exhibit slower diffusivity within rafts and are therefore trapped within these areas for longer than other regions of the membrane [33]. This leads to the enrichment of specific membrane constituents within raft domains which is hypothesized to promote more molecular interactions and effective signal transduction. These predictions are corroborated with theoretical study by Nicolau finding that a maximal molecular collision rate both inside and outside 7 2.2. Single particle tracking domains is achieved with a 10% coverage of the membrane by small (∼ 6 nm diameter) rafts and Draft /Dnon-raft =0.5 [43]. Both lipid rafts and the cytoskeleton fence directly affect protein mobility in the membrane through aggregation, confinement, directed motion etc. In addition, it is widely believed that the choreography of intracellular signaling following surface receptor engagement is affected by localization and mobility of key signaling proteins. Single particle tracking is a powerful technique that can be used to characterize and quantify transmembrane receptor mobility as well as interactions with cytosolic proteins. 2.2 Single particle tracking Single particle tracking is a single-molecule method that yields a direct observation of molecular motion. The protein of interest is labeled with an immunofluorescent probe or an antibody-coated bead, at a sufficiently low concentration so as to allow the visualization of single molecules or diffraction-limited clusters (see Fig. 2.3A). Using high-speed video microscopy, the particles position is then tracked over time to obtain its trajectory as the experimental output (Fig. 2.3B-C). Typical SPT experiments have a spatial resolution of tens of nanometers and a time resolution of tens of milliseconds (although experiments with microsecond resolution have been performed [24, 41, 76]). It is also common to treat cells with pharmacological agents (such as cytochalasin D, which disrupts the actin cytoskeleton) and examine resulting changes in the particle trajectories. In principle, particle trajectories contain information about the nature of molecular motion, signatures of spatiotemporal heterogeneity and the presence of multiple subpopulations, though it is a non-trivial task to extract this information in a statistically reliable fashion. The difficulty is due to the inherent noisiness of a random walk — this randomness may suggest modes of motion or patterns that are not actually present. 2.2.1 Theoretical basics Analyzing single particle tracking experiments requires an understanding of how molecules diffuse in the membrane. Brownian diffusion is the simplest form of ran- 8 2.2. Single particle tracking A IMAGE OF THE LABELED PARTICLE B PARTICLE POSITIONS ANALYZE C TRAJECTORIES Figure 2.3: Schematic drawing of SPT experiment. (A) The protein of interest is labeled with an immunofluorescent probe or an antibody-coated bead at a sufficiently low concentration so individual particles can be distinguished. The images are analyzed to give the particle positions at every time step (B) and taken together, these positions create a trajectory of the particle positions over time (C). Trajectories can then be analyzed to determine the type of particle mobility. Shown here are pure random walks (blue), directed motion (red), confined motion (green). A random walk with blinking (purple) and a trajectory with merging and splitting (black) demonstrate difficulties in image processing that may arise in these experiments. dom thermal motion, and assumes that molecules move randomly without memory of any previous positions. Therefore, a population of pure Brownian particles move in an identical manner throughout the membrane with their bulk behaviour obeying the diffusion equation ∂c = D∇2 c ∂t (2.1) where c is the concentration of particles and D the diffusion coefficient. However, in the presence of membrane domains and the underlying cortical cytoskeleton, molecular motion is not always purely diffusive and spatiotemporally homogeneous. Different types of molecular motion have been observed in the plasma membrane including • pure Brownian diffusion • directed motion: possibly through association with cytoskeletal motors • anomalous sub-diffusion: due to confinement, the presence of barriers (corrals) or binding [61] 9 2.2. Single particle tracking The standard way to remove noise or randomness from a trajectory and classify the particle’s mobility is to calculate the mean squared displacement (MSD). In SPT, particle positions are recorded in the form of a time sequence {xn = x(nτ ), yn = y(nτ ) | n = 0, 1, 2, ...N } where τ is the frame length and N is the number of steps in the trajectory. Given a single particle trajectory, the mean squared displacement curve can be calculated in the following way [48]: MSD(nτ ) = i=N −n (xi+n i=0 − xi )2 + (yi+n − yi )2 , N −n+1 n = 1, ..., N/2 (2.2) The analytical forms of the MSD versus time curves for the different modes of motion are MSD(t) =4Dt MSD(t) =4Dt + (vt)2 MSD(t) =4Dtα , α < 1 pure Brownian diffusion (2.3a) directed diffusion (2.3b) anomalous sub-diffusion (2.3c) The three modes of motion along with their corresponding MSD versus time curves are presented in Fig. 2.4. Anomalous sub-diffusion can arise in a number of ways and two examples are shown in the figure. The first, termed walking confined diffusion, is where a fast diffusing molecule is confined to a slowly diffusing domain. This is representative of diffusion within a lipid raft [12]. The second example is hop diffusion within the cytoskeletal fence. When only the first few time points on the MSD curve are used (n < 4), the resulting diffusion coefficient it is known as Dmicro . Using more data points gives an estimate for the macroscopic diffusion coefficient, Dmacro . Comparing these two estimates is another way to classify mobility. For example, in certain types of molecular movement such as walking confined diffusion and hop diffusion, Dmicro < Dmacro . While the MSD analysis is a fast and simple way to classify mobility, it can only provide a global classification of a trajectory. If the particle undergoes transient changes in mobility within an individual track, these cannot be resolved using the MSD curve. Fig. 2.5 illustrates this idea with the MSD analysis of two such trajectories. The particles are switching between free Brownian diffusion and either reduced Brownian diffusion (A-C) or directed dif10 2.2. Single particle tracking A Free diffusion B C Anomalous diffusion Directed diffusion ) walking confined diffusion hop diffusion D Mean squared displacement dif fus ion Di rec ted MSD (B n( sio p d ure A) u iff ) (C usion f f i d alous Anom (A) MSD = 4Dt (B) MSD = 4Dt + v2t2 (C) MSD = 4Dtα,α<1 time Figure 2.4: Three models of molecular movement and their resulting MSD versus time plots. (A) Brownian diffusion — MSD grows linearly in time. (B) Directed diffusion — the MSD versus time curve is quadratic. (C) Two examples of anomalous diffusion — MSD grows less than linearly in time. Anomalous diffusion can arise in different ways and we show two examples here. Walking confined diffusion occurs when a particle undergoes fast diffusion while confined within a slowly diffusing domain. Hop diffusion is confined diffusion within domains over short times with occasional hops between domains over longer time periods. 11 2.2. Single particle tracking fusion (D-F). In each case, only an average “effective” diffusion coefficient and velocity can be extracted. More rigorous statistical methods are needed to quantify this type of particle movement in single particle trajectories. In Chapter 3 we present one such method which can resolve the diffusive properties of individual states in trajectories like these. An in-depth review of other methods developed to detect heterogeneity in single particle trajectories is also presented in the next section. 2.2.2 Analysis of single particle tracks In classical SPT analysis, particle trajectories are classified on the basis of their MSD versus time plot (see section 2.2.1). The MSD as a function of time is calculated from the individual displacements [35, 48] and tracks are then sorted into various modes of motion based on fitting their MSD curve to the equation (see Eqns. 2.3a- 2.3c and [1, 61, 79]). Other MSD-based parameters, such as the relative deviation from Brownian diffusion [35], and the radius of gyration tensor [52, 56], have also been used to categorize particle trajectories. A series of theoretical studies by Saxton has examined anomalous diffusion due to confinement, the presence of obstacles, or binding, and proposed algorithms for detecting these through MSD measurements [56–60]. An analytical treatment also exists for diffusion through an infinite array of impermeable barriers, relating the macroscopic and microscopic diffusion coefficients to the average compartment size and the average residency time within a compartment [45]. The results from this study were used to test a hop diffusion model for membrane phospholipids and MHC molecules [24, 76]. Directed motion on the plasma membrane has been observed in SPT experiments for proteins such as E-cadhedrin [35], major histocompatibility complex II (MHC II) [79] and GABA receptors on nerve growth cones [5]. Bouzigues et al. present a sliding window statistical analysis which estimates velocity and diffusion coefficients in tracks where a particle switches between directed and pure diffusive modes. They found drift occurs most often in the central region of the cell rather than in the actin-rich lamellipodia, thus defining two distinct domains for GABA receptor mobility. 12 2.2. Single particle tracking D 1.0 A 0.4 y (μm) y (μm) 0.6 0.0 0.2 -0.4 -0.2 -1.0 B 0.4 -0.6 -0.2 x (μm) 2.0 -1.0 E Free diffusion Reduced diffusion -0.6 -0.2 x (μm) 2.0 6.0 2.0 6.0 Free diffusion Directed diffusion 1.0 y (μm) y (μm) 0.6 0.0 0.2 -0.4 -0.2 -1.0 -0.2 x (μm) -1.0 2.0 F Free diffusion Reduced diffusion Entire trajectory 0.8 0.4 0 0 -0.6 1.0 -0.2 x (μm) Directed diffusion Free diffusion Entire trajectory 8.0 MSD (μm2) 1.2 -0.6 MSD (μm2) C 1.6 6.0 4.0 2.0 0.4 0.8 1.2 Time (s) 1.6 2.0 0 0 0.5 Time (s) 1.0 1.5 Figure 2.5: MSD is a global measurement and cannot detect transient changes in mobility. (A, B) trajectory with transient reduced mobility — the particle switches between fast (red) and slow (blue) diffusive states. (C) The resulting MSD curve is plotted in black where the slope is proportional to a weighted average of the slow and fast diffusion coefficients. More sophisticated statistical methods are needed to extract the MSD curves for each state shown in red and blue. (D, E) Trajectory with transient directed motion — the particle switches between diffusive (blue) and directed (red) motion. (F) The resulting MSD curve is plotted in black. As in C, only a weighted average of the parameters defining the particle’s movement can be extracted from this curve. 13 2.2. Single particle tracking MSD-based classifications assume a single mode of motion within each trajectory, and as such, are suitable for identifying the global behaviour of all trajectories in an experiment, or classifying trajectories into subpopulations exhibiting different characteristics. A different class of statistical analysis exists for identifying transient confinement [39, 66], or transient directed motion [3, 5] within individual trajectories. Methods used to detect these transient events are typically based on a sliding average technique: MSD is calculated for short windows along the trajectory to identify deviations from Brownian diffusion within segments of an individual track. An issue with these methods is their need for a judicious choice of input parameters such as the length of the sliding window, and the assumed size of confinement regions. Changes in these parameter values could alter the results significantly. In Chapters 3 and 4, we present a novel technique to identify transient changes in directed motion and confinement within single trajectories that requires fewer user-supplied parameters and compare it to the previous sliding window techniques. Our method is based on a hidden Markov model developed to identify transient periods of reduced mobility in single particle tracks of the integrin LFA-1 and the phosphatase CD45 on the surface of T cells [8, 11]. 2.2.3 Models of plasma membrane domains studied with SPT The analysis of particle trajectories with anomalous diffusion motivated the study of three main classes of membrane domain models: • Cytoskeleton picket fence model • Lipid rafts • Protein-protein interactions and clustering Cytoskeleton picket fence A membrane skeleton fence model was first proposed by Sako and Kusumi after performing SPT on the α2-macroglobulin receptor [54]. They observed short-term confined diffusion within a sub-micron sized compartment and long term hop diffusion between compartments (see Fig. 2.1). Partial destruction of the cytoskeleton decreased the confined diffusion mode and increased the simple diffusion. This model is further evidenced by SPT experiments in which 14 2.2. Single particle tracking the residency times within corrals decrease when the cytoplasmic side of the protein is removed [37, 55]. More recently, two impressive experiments in which multiple proteins are labelled simultaneously show that both Fc RI receptors on the surface of mast cells and B cell receptors on B cells tend not to cross GFP-labelled cortical actin cytoskeleton barriers [2, 74]. Hop diffusion has been observed in the trajectories of many transmembrane proteins such as E-Cadhedrin [55], transferrin receptor [54], Band-3 [37, 73] and MHC II [76]. The model was revised after SPT experiments with the unsaturated phospholipid DOPE were performed by the Kusumi lab. Labelled DOPE was only tracked on the outer leaflet of the PM and thus had no cytoplasmic domain to directly interact with the cytoskeleton fence. Despite this, hop diffusion was observed in the trajectories [24, 41]. This led to the hypothesis that there are transmembrane anchor protein pickets bound to the skeleton fence which block the free diffusion of lipids and proteins via steric hindrance and hydrodynamic friction effects. Monte Carlo simulations predicted that the skeleton fence must be 20 − 30% covered by these picket proteins [24]. There are three major consequences of the cytoskeleton picket fence model: • It explains why diffusion coefficients measured in liposomes are much higher than in live cells. • Oligomers — formed due to receptor crosslinking, or through other molecular associations — have a much lower probability of crossing fence barriers due to their size, explaining the discrepancy between what is predicted by the Stokes-Einstein relation and what is observed experimentally. [34]. • It explains why many experiments demonstrate that Dmicro Dmacro . Pro- tein diffusion is unhindered over the short timescales used to measure Dmicro . But, over the longer timescales used to measure Dmacro , the protein feels the effects of the boundary as it is temporarily confined within each domain [24]. . Lipid rafts Lipid rafts also play a major role in modulating the mobility of mem- brane components. It is conjectured that lipid rafts may increase clustering of 15 2.2. Single particle tracking raft-philic proteins during signaling [13, 33]. Because lipid rafts are so small, (≈ 10 − 200 nm) [31], SPT is an effective way to study them due to its high resolution. One important observation achieved using SPT is that raft proteins experience a reduction in their microscopic diffusivity (Dmicro ) by a factor of 2 to 5 which is attributed to the viscosity within the domains [12, 46]. This is one major difference between raft domains and the cytoskeleton fence — in hop diffusion, the proteins exhibit the same Dmicro as if they were freely diffusing [33]. To investigate the importance of rafts, it is common to deplete cholesterol or sphingolipids to see how this influences the diffusivity of single molecules. Jacobson and colleagues used this protocol in their experiments with the raft-associated transmembrane proteins Thy-1 and GM1. Depletion led to a decrease in the abundance of the ≈ 200 nanometer confinement zones and lower confinement times, yet it had no effect on the diffusion of a non-raft associated protein [16, 63]. However, interpretations of these experiments remain controversial since cholesterol depletion affects the distribution of PIP with unknown effects on the cortical actin cytoskeleton. It has also been suggested that the small size and short lifetimes of lipid domains results in inadequate detection by SPT [15]. Protein-protein interactions and confinement Protein clusters are important for receptor signaling and it is believed that cluster formation is facilitated by lipid rafts and the cytoskeleton fence. However, specialized protein-protein interactions provide the specificity required for signaling within these micro-domains. Destainville and colleagues show that under pairwise protein interaction, the smallscale individual protein diffusion coefficient (Dmicro ) is inversely proportional to the concentration of proteins within a cluster and thus is proportional to the area of the cluster (Dmicro ∝ area). This relationship was observed in single particle tracks of the µ–opioid receptor (a G-protein coupled receptor) on NRK cells [12]. After thorough statistical analysis, they concluded that confinement by lipid rafts or the cytoskeleton cannot explain the observed area dependence [39]. Instead, they propose that the receptors are undergoing “walking confined diffusion” described by short-term confined diffusion within a protein cluster and long-term simple diffusion of the entire cluster. Stable protein aggregates have been observed using SPT on other cell types as well — activated T cells being one example (discussed 16 2.2. Single particle tracking below) [10, 18]. Their existence is also supported by numerical simulations by Destainville [14]. Although the role and biogenesis of many protein clusters remains as an open question, SPT has allowed for rapid advancement of knowledge in this area. 2.2.4 Consequences for T cell signaling SPT has been an invaluable tool in the study of T cell signaling. During T cell activation, a coordinated reorganization of the cytoskeleton and signaling molecules forms a highly-concentrated signaling region (the immunological synapse) containing clusters of T cell receptors (TCR) [69]. Using SPT, cytoskeleton fences, lipid rafts and protein-protein interactions have all been found to play important roles in this process. For example, sophisticated statistical analysis of single particle tracks of CD45 and LFA-1 illustrates the importance of the actin cytoskeleton in cellular activation. Protein association with actin leads to a reduced mobility state and therefore the dynamic re-organization of the cytoskeleton during activation has direct consequences on the lateral mobility of these proteins [8, 11]. A critical step in T cell activation is the recruitment of signaling molecules into the core of the immunological synapse. The recruitment of one particular protein, Lck (a Src family kinase), was observed through SPT — and both lipid rafts and the cytoskeleton fence were required [30]. It is known that the actin cytoskeleton is thought to be concentrated towards the core of the synapse, leading to slower diffusion and thus longer residency time around TCR clusters. Researchers hypothesize that this effect is enhanced by Lcks association with lipid rafts [30]. Once at the centre, further single particle experiments combined with FRAP and fluorescence imaging show that Lck preferentially associates with some proteins (CD2 and LAT) and not with others (CD45). These interactions are independent of the cytoskeleton and lipid rafts [18]. It is hypothesized that these protein-protein interactions concentrate or exclude specific proteins in order to facilitate effective signaling. This is one example of how sophisticated analysis of single particle trajectories has led to a better understanding of the complex mechanisms involved in a large-scale cell operation. 17 Chapter 3 Deciphering Two-State Mobility in Single Particle Trajectories In this chapter we present a method to decipher and quantify two-state mobility in single particle trajectories. Given a single particle track, this method can identify each state as either pure Brownian diffusion or directed diffusion and estimate the parameters involved: diffusion coefficients, velocities and state transition rates. Therefore, a trajectory can be classified in three ways with our analysis: • both states are pure Brownian diffusion • one state is pure Brownian diffusion while the other is directed diffusion • both states are directed diffusion Our method performs well in each case, however the first two are the most biologically relevant so we focus on these two here. We call the first classification transient reduced mobility (TRM) and the second transient directed motion (TDM). Transitions between states could arise from binding and unbinding events between the tagged molecule and other cellular proteins or cytoskeletal motors (in the case of transient drift). This analysis is novel in its ability to classify the particle’s mobility in this way - previous methods must assume one of these three types before analyzing trajectories [3, 5, 11]. Below, we compare our results to the statistical method by Arcizet et al. which detects TDM. This work is an extension of the analysis by Das et al. [11], where only TRM is considered. Tracks are modeled as the observable outcome of a two-state hidden Markov model (HMM) (see Section 3.1) and parameters are estimated via a Markov Chain Monte Carlo (MCMC) likelihood maximization algorithm (see Section 3.2.1). Discrimination between TRM and TDM models is performed simultaneously with parameter estimation during 18 3.1. Theory - hidden Markov model of single particle tracks the MCMC algorithm. We carry out a rigorous analysis on simulated data and present results from two experimental case studies involving lymphocyte functionassociated antigen 1 (LFA-1) and protein tyrosine phosphatase receptor C (CD45) on the surface of T lymphocytes. Both sets of experiments have been analyzed before using MSD [9] and our earlier HMM method ([8, 11]), which revealed evidence for two-state behavior. Here, we confirm that this two-state behavior is likely due to transient protein-protein interactions and not localized directed motion. Further, we discuss results from simulations in which particles undergo constrained lateral movement. 3.1 Theory - hidden Markov model of single particle tracks We model a single particle track for a transmembrane protein which is dynamically switching between two states of lateral mobility (classified as either directed motion or pure Brownian diffusion). This is an extension of the model by Das et al. which considered only purely diffusive states [11]. We assume that the labelled protein switches between states of mobility through dynamic associations with a homogeneously distributed binding partner in the cytosol. This binding is represented by the bimolecular reaction kon.true GGGGGGGGGGB P +SF GG C. koff (3.1) Here, S is the cytosolic substrate, P is the free form of the protein (undergoing pure diffusion), and C is the bound form (undergoing either reduced pure diffusion or directed motion if S is a cytoskeletal motor). The kinetics of this interaction are characterized by the bimolecular forward rate constant, kon.true , and a first order unbinding constant, koff . If we assume a homogeneous spatial distribution of S, at equilibrium the reaction is effectively first order with kon = kon.true · [S]eq , where [S]eq is the equilibrium concentration of substrate. With this assumption, state transitions are modeled as unimolecular reactions with first order kinetics. We 19 3.1. Theory - hidden Markov model of single particle tracks summarize the model as follows: kon GGGGGGGB {D1 , v1 , φ1 } F GG {D2 , v2 , φ2 } koff (3.2) The particle switches between the two states with rates kon and koff . In either state, si = 1, 2, the particle has a diffusion coefficient, Dsi , advection velocity with magnitude vsi and direction φsi . For notational simplicity vsi will be denoted as vsi . The magnitude of advection velocity stays constant throughout the trajectory, however the direction φsi is not unique and may change each time the particle enters state si . If vsi = 0 for i = 1, 2 then the reaction reduces to the TRM model presented in [11]. Finally, we make the assumption that the particle is imaged instantaneously and state transitions occur only at the acquisition time, implying that the particle is entirely in one state during successive image frames. This assumption is valid for transition rates much larger than the frame rate. For a constant time between frames, τ , we can translate the unimolecular reaction into a set of ordinary differential equations dPs1 = −kon Ps1 + koff Ps2 dt dPs2 = kon Ps2 − koff Ps2 dt 1 = Ps1 + Ps2 (3.3a) (3.3b) (3.3c) where Ps1 and Ps2 represent the proportion of protein in states 1 and 2 respectively. Solving this system with the initial condition Ps1 (0) = 1, Ps2 (0) = 0 for Ps2 (τ ) leads to an expression for the fixed probability, P12 , for the particle to switch from state 1 to 2 between successive frames: P12 = kon 1 − e−(kon +koff )τ . kon + koff (3.4) Similarly, solving for Ps1 (τ ) with the initial condition Ps1 = 0, Ps2 = 1 gives P21 = koff 1 − e−(kon +koff )τ . kon + koff (3.5) 20 3.2. Materials and methods Therefore, we have the following two-by-two transition probability matrix for our hidden Markov model of the single particle trajectory in terms of the unimolecular kon and koff rates from Eq. 3.2: P11 = 1 − P12 P21 = koff kon +koff 1 − e−(kon +koff )τ P12 = kon kon +koff 1 − e−(kon +koff )τ P22 = 1 − P21 (3.6) where Pij is the fixed probability that the particle switches from state i to j in time τ. The state sequence of the particle during an SPT observation is regarded as a two-state hidden Markov process. A displacement (xi , yi ) in the particle track is the outcome of Brownian or drift diffusion with parameters (diffusion coefficients and velocity) corresponding to the particle state at that interval. Therefore the trajectory, represented as O1 = (x1 , y1 ) · (x2 , y2 ) · · · (xN , yN ) is the observed outcome of a two-state HMM (the added 1 is for notational convenience in the next chapter). The particle’s state sequence Q1 = s1 , s2 , ..., sN (si = 1,2), which specifies the distribution from which the particle displacement is completed at each step, is hidden from the observer. See Fig. 3.1 for a schematic representation of the two-state hidden Markov chain. We developed a likelihood-based approach to estimate model parameters, infer the underlying state sequence and choose the type of lateral diffusion for each state. 3.2 3.2.1 Materials and methods Statistical analysis of particle trajectories For a single particle track, we observe a sequence of individual displacements O1 which are taken to be the outcome of a two-state hidden Markov model with the following set of parameters: θ1 = {D1 , D2 , P12 , P21 , v1 , v2 , φ11 , ..., φ1n1 , φ21 , ..., φ2n2 }. We assume that the particle has the same magnitude of velocity each time it enters state si , however the angle of advection may change. These angles are given by φsi 1 , ...φsi ni where ni is the number of times the particle is in state si during a single trajectory. We use a Bayesian statistical method known as Markov chain 21 3.2. Materials and methods Transient reduced mobility (TRM): Transient directed motion (TDM): OBSERVED HIDDEN (fast) D1(slow) D2 D1(slow) D D ,V2 D2,V2 (no drift) D1 2 (no drift) 1 Figure 3.1: Schematic illustration of the two-state hidden Markov model for the two tested scenarios: TRM and TDM. The states are hidden from the observer and must be inferred from the HMM analysis using MCMC. Monte Carlo (MCMC) to estimate the HMM parameters θ1 . There are three main components involved in the analysis: calculating a likelihood function, segmenting a trajectory into its most likely state sequence, and classifying each state as directed or diffusive. Parameter estimation by Markov chain Monte Carlo Given our observed trajectory, the objective is to measure the posterior probability given by Bayes’ Theorem: P (θ1 |O1 ) = P (θ1 )P (O1 |θ1 ) P (O1 ) (3.7) where P (θ1 ) is the prior distribution for the set of parameters θ1 , P (O1 |θ1 ) is proportional to the likelihood function (see Section 3.2.1 below) and P (O1 ) is the normalizing constant. It is difficult to calculate the denominator in Eq. 3.7, however an advantageous statistical MCMC algorithm called Metropolis-Hastings (MH) can generate a sample from our target distribution P (θ1 |O1 ) without know22 3.2. Materials and methods ing this constant of proportionality. The algorithm iteratively explores the parameter space through a succession of small displacements along each parameter axis. For each displacement to a new set of parameters θA , the log of the numerator in Eq. 3.7 is calculated. For all parameters, we use a uniform distribution over a suitable range of positive values for the prior P (θ1 ). If the likelihood increases, θA is automatically accepted. Otherwise the parameters are accepted with a probability equal to the fractional change in the numerator after the proposed move. We have also adapted the MH algorithm to simultaneously classify the particle’s mobility; If a new parameter set is accepted while the algorithm is sampling the posterior distribution, we use a statistical hypothesis test described below to identify each state as either directed or purely diffusive. Fig. 3.2 and Program B.5 summarize the steps of the algorithm. Typically MCMC runs were 1−2×105 steps long with an initial burn-in phase during which the MCMC trajectories approach equilibrium. The size of displacements along parameter axes were adaptively tuned to achieve the conventional acceptance rate of approximately 40% after burn-in [25]. We report the mode of the resulting distribution as our parameter estimate, θˆ1 . Likelihood calculation The likelihood function is a function L of the parameters of a statistical model (θ1 in our case) that plays a key role in statistical inference. Given an observed single particle trajectory O1 , each step of the MCMC algorithm described above requires an expression for the likelihood function L ∝ P (θ1 |O1 ) for the set of model parameters θ1 . First, we derive the likelihood function for a single particle displacement by solving the advection diffusion equation with the initial condition being a delta function at the origin. The solution to this partial differential equation is the probability density of the particle’s position after time t. Therefore, the solution at time τ , P ((x, y), τ |Dsi , vsi , φsi ), is proportional to the likelihood of a diffusion coefficient Dsi and advection velocity of magnitude vsi and angle φsi given an individual displacement (xj , yj ). 23 3.2. Materials and methods Li (Dsi , vsi , φsi |(xj , yj )) ∝ P ((xj , yj )|Dsi , vsi , φsi ) = −(xj − vi cos(φsi )τ )2 − (yj − vsi sin(φsi )τ )2 1 exp 4πDsi τ 4Dsi τ (3.8) and the corresponding log likelihood is i (xj , yj ) ∝− (xj − vsi cos(φsi )τ )2 + (yj − vsi sin(φsi )τ )2 − log(Dsi τ ) (3.9) 4Dsi τ The proportionality in Eq. 3.8 arises because the likelihood function is only defined up to an arbitrary multiplicative constant. Consequently, the log likelihood function is only defined to an arbitrary additive constant. In Eq. 3.9 we only retain terms with explicit dependence on model parameters. For an entire track O1 with state sequence Q1 , the likelihood for a given parameter set θ1 is L(θ1 |O1 ) ∝ P [O1 |θ1 ] = P [O1 |Q1 , θ]P [Q1 |θ1 ]. (3.10) all Q For a 2-state HMM, this likelihood equation involves N 2N +1 calculations. To make this more efficient we use the forward-backward algorithm instead, which requires only 4N calculations to compute Eq. 3.10 (see Program B.2) [49]. Track segmentation For a particular θ1 and a set of displacements from a single particle track O1 , we ˆ 1 = s1 , s2 ...sN for the Markov chain calculate the most likely state sequence, Q using the backward algorithm given in Program B.3 [49]. Classifying states as diffusive or directed For a given set of estimated parameters, θˆ1 , we can segment the particle track using ˆ 1 as described above. Next we ask whether each the most likely state sequence Q segment (i.e. a series of displacements during which the particle state does not 24 3.2. Materials and methods change) contains a non-zero advection component. Suppose that throughout the ˆ 1 . Each entire track the particle enters state si a total of ni times according to Q time the protein is in state si , we calculate the average distance the particle moves along each axis; x j and y j , j = 1, 2, ..., ni . We take our null hypothesis to be that the particle is undergoing Brownian diffusion and the alternative hypothesis to be that the particle’s lateral diffusion contains a non-zero advection component. If the null hypothesis is true, x j and y j are normally distributed with mean 0 √ and standard deviation 2Dsi τ / m, where m is the number of steps the particle spends in state si . Therefore, we have 2ni independent z-test statistics for state si , which we analyze using Fisher’s method for multiple comparisons [22]. Briefly, if k independent tests are performed and the p-value for test j is pj then the sum −2 k j=1 ln pj approximately follows the χ22k distribution. If the p-value is lower than 0.01, then we reject the null hypothesis and set the magnitude of the advection velocity to be consistent over the entire trajectory: 1 vsi = ni ni x 2 j + y 2j . (3.11) j=1 The angle of advection each time the particle is in state si is defined as φij = arctan y x j , j = 1, 2, ..., ni . (3.12) j By defining the advection angle seperately for each segment j, we account for possible changes in the particle’s direction. The p-value used in Fisher’s method is the only user input parameter contained in our analysis and values ranging from 0.01 to 0.10 worked well with simulated data. For each segment of the trajectory in which a particle was predicted to be exclusively in one state, analysis of simulated data showed that the first and last steps were the most likely to be classified incorrectly. For this reason, we exclude the outer two steps in each hypothesis test. For some parameter values this increases the accuracy of model prediction in the analysis of simulated data (data not shown). This method of model classification is included in Program B.4 and integrated into the MCMC analysis in Program B.5. 25 3.2. Materials and methods initialize parameters θ0={D1, D2, p12, p21, |v1|=0, |v2|=0} initially classify both states as purely diffusive calculate likelihood for θ0 perturb parameters θA={D1, D2, p12, p21, |v1|, |v2|} only perturb velocities if state is classified as directed calculate likelihood for θA use current mobility classification for each state accept new parameters according to Metropolis Hastings if θA is accepted: get most likely state sequence with backward algorithm test for directed motion with Fisher’s method repeat ~105 times Maximum likelihood estimate: Histogram of accepted parameters after burn-in is the estimate for P(θ|O). The mode of P(θ|O) gives maximum likelihood parameter estimate. Figure 3.2: Flowchart showing our modified Metropolis Hastings MCMC algorithm. 3.2.2 Generation of simulated single particle tracks Simulated trajectories were generated to test the accuracy of our method. Program B.1 contains a detailed description of this process. We simulated two-state trajectories consisting of N displacements, (x1 , y1 ), ..., (xN , yN ), with each occurring over a constant time interval τ . We first generate a 2-state Markov chain with a transition matrix as described in Eq. 3.6 to generate the state sequence of the particle, Q1 . The particle displacements along the x and y axes were drawn randomly from a normal distribution with variance 2Dsi τ and mean vsi cos φsi τ or vsi sin φsi τ respectively. Each time a particle entered a state with positive advection velocity, we held the magnitude vsi fixed, but the direction of velocity was 26 3.3. Results drawn from a uniform angle distribution. The direction of advection was then held constant until the next state transition. In trajectories where a particle was confined within a barrier, a measure for barrier permeability, Pjump , was set for the trajectory. If a drawn displacement required the particle to cross a barrier, it crossed with probability Pjump and reflected according to Snell’s law with probability 1 − Pjump . 3.2.3 LFA-1 and CD45 labeling and single particle trajectories Experimental LFA-1 and CD45 trajectories are from Cairo et al. [8, 9]. Briefly, 1 micron beads were labeled with a Fab fragment of an LFA-1 (TS1/18) or CD45 (HI30) binding antibody. The beads were then blocked to prevent non-specific binding, and Jurkat T cells (clone E6.1, ATCC, Manassas, VA, USA) were labeled with beads and observed using video microscopy. Cells were treated with HBSS buffer containing either a vehicle control (DMSO), phorbol-12-myristate13-acetate (PMA) or cytochalasin D (cytoD). Trajectories were collected on live cells at 1000 FPS (1 ms) and converted to trajectories using Metamorph (Universal Imaging, Downington, PA, USA). 3.3 3.3.1 Results Method verification with simulated data We generated simulated tracks, as described in the Materials and Methods and Program B.1, to test the robustness of our HMM analysis over different track lengths, changes in model parameters, the presence of spatial membrane structures and the addition of experimental noise. Two representative simulated tracks and their subsequent HMM analyses are shown in Figs. 3.3 and 3.4. The two examples depict the two models considered here, namely TDM and TRM (Fig. 3.1). Given a single particle track, we choose an initial guess for the parameter set θ0 = {D1 , D2 , P12 , P21 , v1 = 0, v2 = 0} assuming no drift velocity in either state. We then calculated the log-likelihood L(θ0 |O1 ) using the forward algorithm and Eqs. 3.9 and 3.10. We use a modified MH MCMC algorithm to sample the posterior distribution of the parameters θ1 starting with uniform priors over a sensible physiological parameter range (see Program B.5) [47]. While MCMC is computa27 3.3. Results tionally less efficient than gradient-based schemes, it is less liable to be trapped in local maxima due to its stochastic nature. Secondly, by sampling the log-likelihood landscape around the maxima, the algorithm provides an estimate for the full posterior distribution (see Eq. 3.7). This gives a more complete measure of uncertainty compared to gradient-based maximization methods. We altered the usual rejection scheme to include the model selection process (see Fig. 3.2). In this way, our modified MCMC algorithm chooses the most likely movement model (TRM or TDM), while simultaneously finding parameter estimates which maximize the likelihood function. Figs. 3.3 and 3.4, B and C, show the MCMC trajectories. There is an initial burn-in period of approximately 1 × 104 iterations (shaded sections) where trajectories approach equilibrium. After burn-in, trajectories settle close to the correct parameter values with small stochastic fluctuations. Note that a positive velocity is predicted for state 1 after iteration 1 × 104 in Fig. 3.4, which indicates that the TDM model is deemed more likely than TRM here. The distribution of these accepted values for each parameter are plotted in parts D and E. These histograms represent the estimated posterior distribution L(θ|O1 ). We take the mode ˆ Relative errors in parameter estiof this distribution as our parameter estimate θ. mates were less than 15%. Errors are typically larger for the estimated transition rates compared to the diffusion coefficients and drift velocity since the number of state switches is much smaller than the track length. We tested the MCMC parameter optimization and model selection scheme for a range of parameter values and conditions. Figs. 3.5 and 3.6 along with Tables A.1 and A.2 report results from simulations where P12 and P21 vary uniformly over the range [10−2.5 , 10−0.5 ] for the TRM and TDM models respectively. For each set of parameter values we simulate 10 tracks and report the mean from all individual analyses. In most cases, parameter estimates are close to their true values, with relative errors typically less than 5%. Figs. 3.5 A and 3.6 A show the accuracy of the predicted state sequence for each set of simulated experiments. We see a general trend of decreasing accuracy with increasing values of P12 and P21 , however, the error is always less than 5%. The plots in part B show the error in the estimated diffusion coefficient for state 1 Dˆ1 . In addition, Fig. 3.6 C shows the relative error in the estimated value of v1 , vˆ1 . No equivalent relative error plot is shown for the TRM trajectories because estimated velocities were always zero. 28 3.3. Results True State Sequence D A 0.00 y (μm) −0.02 −0.04 −0.06 −0.08 −0.10 −0.12 Estimated State Sequence (98.4% accurate) State 1 - slow State 2 - fast −0.02 0.00 0.02 0.04 0.06 0.08 0.1 0.12 0 200 400 600 800 1000 x (μm) B -2 10 D2 D1 D2 -3 10 D1 frequency E D1 RE: 3.1% 2000 D2 RE: 3.2% 1000 10-4 0 10,000 20,000 30,000 40,000 0 C 10 0 10-3 10-4 10-2 diffusion coefficients: D1, D2(μm2s-1) MCMC iterations F 2000 frequency p12 p21 p21 10-1 p12 p12 RE: 11.4% p21 RE: 12.2% 1000 10-2 0 10,000 20,000 30,000 MCMC iterations 40,000 0 0 0.05 0.10 0.15 State transition probabilities: p12 and p21 Figure 3.3: Example of parameter optimization method on a simulated TRM trajectory (A) labeled in red (reduced) and blue (free). The track has the following parameters: N = 4000, D1 = 10−4 µm2 /s, D2 = 10−2 µm2 /s, P12 = 0.03, P21 = 0.10, v1 = v2 = 0µm/s. (B, C) The MCMC trajectories begin with an initial random guess and converge to the likelihood maxima after an initial burn-in phase shaded in grey. (D) 1000 frames of the actual and estimated state sequence where black represents state 1 and white represents state 2. (E, F) Histogram of parameter values from the MCMC trajectory where black dashed vertical lines are the true parameter values. Relative errors were less than 2% for diffusion coefficients and less than 13% for transition probabilities. v1 and v2 were predicted to be exactly zero using Fisher’s method. 29 3.3. Results A 0.4 True State Sequence D y (μm) 0.3 0.1 0.0 −0.3 B Estimated State Sequence (98.2% accurate) 0.2 State 1 - directed State 2 - diffusion −0.2 −0.1 0.0 0.1 10 0 0.2 x (μm) 200 E 3000 0 frequency v1 D2 D1 10-2 400 600 D1 RE: 1.6% 800 1000 D2 RE: 0.07% v1 RE: 1.4% 2000 1000 10-4 0 10,000 20,000 30,000 0 40,000 10-4 MCMC iterations 0 10-2 10-1 2 -1 100 -1 D1, D2(μm s ) and v1(μms ) p12 p21 F 10-1 p12 RE: 1.6% 2000 frequency C 10 10-3 p21 RE: 15% 1000 10-2 0 10,000 20,000 30,000 MCMC iterations 40,000 0 0 0.05 0.10 0.15 p12 and p21 Figure 3.4: Example of parameter optimization method on a simulated TDM trajectory (A) labeled in red (directed) and blue (diffusive). Parameters are: N = 4000, D1 = 10−4 µm2 /s, D2 = 10−2 µm2 /s, P12 = 0.03, P21 = 0.10, v1 = 10µm/s, v2 = 0. (B, C) MCMC trajectories begin with an initial guess and converge to the likelihood maxima after the burn-in phase (grey). (D) 1000 frames of the actual and estimated state sequence where black represents state 1 and white represents state 2. (E, F) Histogram of parameter values from the MCMC trajectory; black dashed vertical lines are the true parameter values. Relative errors were less than 2% for diffusion coefficients and velocity and less than 15% for transition probabilities. v2 was predicted to be exactly zero using Fisher’s method. 30 3.3. Results A 0.03 B 0.1 0.1 0.1 0.08 0.02 0.32 0.06 0.32 p21 0.01 0.01 p21 0.04 0.01 0.02 0.0032 0.0032 0 0.0032 0.01 p120.032 0.1 0.0032 0.01 0.032 p12 0.1 0 Figure 3.5: Error in parameter estimates for simulated tracks with TRM and varying transition probabilities (from 0.0032 to 0.1). Ten tracks were simulated for each set of transition probabilities. State 1 was the slow bound state with D1 = 10−4 µm2 /s, and state 2 was the freely diffusing state with D2 = 10−2 µm2 /s. (A) Error in the predicted state sequence (< 5%). (B) Percent relative error of the predicted diffusion coefficient for state 1 (Dˆ1 ) averaged over the 10 simulated tracks. The biggest errors occur when P12 is large and P21 is small but they are still less than 10%. Analysis indicated TRM over TDM as the more favorable model for all tracks. 31 3.3. Results A 0.035 B 0.15 0.1 0.1 0.025 0.1 0.32 0.32 p21 0.015 p21 0.01 0.01 0.05 0.005 0.0032 0.0032 0 0.0032 0.01 C p 0.032 12 0.1 0.0032 0.01 0.05 0.1 0.04 0.32 0.03 p21 0.032 p12 0.1 0.02 0.01 0.01 0.0032 0 0.0032 0.01 p120.032 0.1 Figure 3.6: Errors in parameter estimates for simulated tracks with TDM and a range of transition probabilities (from 0.0032 to 0.1). Ten tracks were simulated for each set of transition probabilities. State 1 had drift diffusion with D1 = 10−4 µm2 s−1 and v1 = 1µms−1 , and state 2 was the freely diffusing state with D2 = 10−2 µm2 s−1 . (A) Error in the predicted state sequence (< 5%). (B) Relative error of the estimated diffusion coefficient for state 1 (Dˆ1 ) averaged over the 10 simulated tracks. The largest errors occur when P12 is large and P21 is small. (C) Relative error of the estimated drift velocity for state 1 (vˆ1 ) averaged over the 10 tracks. Our analysis indicated TDM over TRM as the more favorable model for all tracks. 32 3.3. Results For all combinations of transition rates we tested, our algorithm discriminates with 100% accuracy between TDM and TRM. When estimating parameters for state 1, errors are largest when P12 /P21 1. Conversely, errors in parameter estimates for state 2 are largest when P21 /P12 1 (see Tables A.1 and A.2). In these two cases, the larger error is likely due to the fact that the particle is spending a very short amount of time in one of the states thus providing fewer data points to estimate from. Overall, however, errors in the estimated diffusion coefficients were less than 20% (and usually less than 10%). Relative errors in the estimated drift velocity were always less than 5%. According to these results, our model is applicable for a wide range of reasonable transition rates, although one should be wary of the accuracy of parameter estimates when one transition rate is predicted to be much larger than the other. We also varied velocities and diffusion coefficients in our simulations. HMM analysis produced accurate parameter estimates except when both of the diffusion coefficients and advection velocities differ by less than a factor of 2. In this case, the two states are indistinguishable and the estimates for P12 and P21 in the MCMC algorithm do not converge during the burn-in phase. Results from simulations where diffusion coefficients were varied are tabulated in Tables A.1 and A.2. Track length and experimental noise were also considered in our analysis. As expected, relative errors and the spread of posterior distributions decrease with increasing track length (see Fig. 3.7). With experimental noise added to tracks, parameter estimates and model selection were accurate. Our analysis is not reliable if the noise exceeds diffusion in its contribution to the expected particle displacement (see Fig. 3.8). Finally, we compared our results from analysis of simulated tracks with those using an alternative sliding window technique presented in Arcizet et al. [3]. Only trajectories with transient drift were analyzed since the Arcizet method does not consider TRM. We analyzed the same trajectories as those used for Fig. 3.6. This method involves the use of three carefully tuned user-input parameters (the window length and two threshold values), whereas our method does not. For the model parameters tested, we attempted to use the set of user-input parameters which gave the most accurate estimates. Parameter and state sequence estimates from the alternative analysis are plotted in Fig. 3.9. Relative errors (< 25% for the estimated 33 3.3. Results A 60 20 0 0 2000 3000 1000 Length of trajectory D1 D2 p12 p21 v1 40 CV (%) 40 RE(%) B D1 D2 p12 p21 v1 4000 20 0 0 2000 3000 1000 Length of trajectory 4000 Figure 3.7: To examine the effect of trajectory length on the accuracy and variability of maximum likelihood parameter estimates, we simulated 10 independent particle trajectories for 100, 500, 1000 and 4000 frames. (A) Relative error of parameter estimates averaged over the 10 frames for each trajectory length. (B) The coefficient of variation (CV), which measures the spread of the estimated posterior distribution, is plotted for each trajectory length. In all trajectories, particles were undergoing TDM with D1 = 10−4 µm2 /s, D2 = 10−2 µm2 /s, v1 = 1µm/s, v2 = 0, P12 = 0.05 and P21 = 0.01. velocity vˆ1 and < 60% for diffusion coefficient Dˆ1 ) are greater than those obtained from our HMM analysis. 3.3.2 Analysis of LFA-1 and CD45 particle trajectories We analyzed a set of experimental SPT data for the T cell integrin, LFA-1. LFA-1 is critical for lymphocyte adhesion and activation and has previously been studied using SPT [9, 11, 32]. In these studies, LFA-1 lateral mobility was shown to be highly dependent on interactions with the cytoskeleton. Analysis of experimental trajectories revealed two separate populations of the protein - one freely diffusing state and one with reduced mobility caused by cytoskeletal interactions. In our previous analysis, a two-state HMM revealed switching behavior between these states. However, this model did not consider drift diffusion. To rule out directed motion in either state, we extended the analysis of these LFA-1 trajectories using the more comprehensive method presented here. LFA-1 was labelled with beads coated with either ICAM-1 or the antibody TS-1/18 while cells were treated with cytocholasin D, activated with PMA (phorbol 12-myristate 13-acetate) or untreated. In all LFA34 fraction of velocity estimates with relative error <10% 3.3. Results 1 0.8 0.6 0.4 0.2 2D2τ 2D1τ 0 10-4 10-3 10-2 SD of noise added Figure 3.8: Accuracy of model prediction with experimental noise added to simulated tracks. 20 simulated TDM trajectories over a range of noise values were analyzed. We plot the frequency of estimated directed velocities (vˆ1 ) with a relative error of less than than 10% against the standard deviation of noise added to the trajectories. The dashed lines correspond to the standard deviation of the Gaussian for the particle’s diffusivity in state si ( 2Dsi τ ). In all trajectories, particles were undergoing TDM with D1 = 10−4 µm2 /s, D2 = 10−2 µm2 /s, v1 = 1µm/s, v2 = 0, P12 = 0.032 and P21 = 0.032. 35 3.3. Results A 0.4 0.1 B 0.1 0.6 0.32 0.4 C 0.1 0.2 0.32 0.15 0.3 0.32 p21 0.2 0.01 p21 0.01 0.2 0.1 0.0032 0.01 0.032 p12 0.1 0.1 0.01 0.05 0.0032 0.0032 0 0.0032 p21 0 0 0.0032 0.01 0.032 p12 0.1 0.0032 0.01 0.032 p12 0.1 Figure 3.9: Results from analysis of simulated tracks with TDM and varying transition probabilities (from 0.0032 to 0.1) using the method presented in Arcizet et al. [3]. 10 tracks were simulated for each set of transition probabilities. State 1 had drift diffusion with D1 = 10−4 µm2 s−1 and v1 = 1µms−1 , and state 2 was the freely diffusing state with D2 = 10−2 µm2 s−1 . These are the same parameters that were used in Fig. 3.6. (A) Error in the estimated state sequence. (B) Relative percent error of the estimated diffusion coefficient for state 1 (Dˆ1 ) averaged over the 10 simulated tracks. (C) Relative percent error of the estimated advection velocity. 1 particle tracks examined, the TRM model was favored over the TDM model (see Table 3.1). Our maximum likelihood estimates for diffusion coefficients and reaction rates agree with the previous results tabulated in Das et al. [11]. This is expected as the 2-state TRM model used in this previous work is a special case of the HMM we consider here. CD45 is a receptor-like protein tyrosine phosphatase (RPTP) that is a critical component in the activation of T cells [28]. Previous analysis of CD45 trajectories were consistent with TRM as a result of cytoskeletal contacts that are enhanced during cellular activation [8]. Evidence for this argument is strengthened with the results from TRM HMM analysis, where the number of steps classified as slowly diffusing is enhanced in cells activated with PMA and reduced in cells with cytoskeletal disruption. Since the model used in Cairo et al. did not consider drift, we analyzed CD45 tracks again using the new HMM method. As in the LFA-1 experiments, T cells were treated with cytoD, activated with PMA or untreated. The TRM model was again more favourable than the TDM model in all CD45 trajectories examined, confirming previous results (see Table 3.1). 36 3.3. Results Table 3.1: Results from two-state HMM analysis of LFA-1 and CD45 trajectories Protein LFA-1 LFA-1 LFA-1 LFA-1 LFA-1 LFA-1 CD45 CD45 CD45 Label TS1/18 ICAM-1 TS1/18 ICAM-1 TS1/18 ICAM-1 HI30 HI30 HI30 Treatment untreated untreated PMA PMA cytoD cytoD untreated PMA cytoD n 75 38 39 24 36 48 68 31 55 Model selected TRM TRM TRM TRM TRM TRM TRM TRM TRM Mean p-value 0.8701 0.8835 0.8235 0.8964 0.6226 0.6937 0.7747 0.8411 0.8853 0.6606 0.8334 0.7681 0.8546 0.8558 0.8842 0.9472 0.6959 0.7689 Results from a two-state HMM applied to an ensemble of n independent LFA-1 or CD45 trajectories. Proteins were labeled with beads coated with TS1/18, ICAM-1 or HI30 and cells underwent various treatments as indicated. We report the model (TRM or TDM) determined to be most likely by our HMM analysis. The indicated p-value is the probability of observing a given trajectory assuming that there is no directional component in state 1 or 2 respectively. The mean of n p-values for each state and experimental condition is reported in the table. Because the TRM model is most likely, the remaining parameter estimates (Dˆ1 , Dˆ2 , Pˆ12 and Pˆ21 ) are the same as those reported previously [8, 11]. 37 3.3. Results A B C D 0.1 μm Figure 3.10: Examples of experimental tracks of labelled CD45 and LFA-1 single particles on live T cells which have channel-like trajectories. (A) LFA-1 on control cell (B) LFA-1 on activated cell (treated with PMA). (C) CD45 on control cell (D) CD45 on activated cell (treated with PMA). 3.3.3 Study of simulated confined trajectories Many of the CD45 and LFA-1 particle trajectories qualitatively looked directed or as though the particle was diffusing through a thin channel (see Fig. 3.10). In order to determine if our algorithm erroneously predicts directed motion in a channel, we simulated particle tracks within thin channels and analyzed them with the 2-state model proposed here. An example track is shown in Fig. 3.11. Multiple simulations were performed over a range of transition probabilities (0.01 to 0.1) and ratios of diffusion coefficients (D1 /D2 from 2 to 100). Channels were infinitely long and had a width of 0.005 µm, which is only approximately twice the expected individual step length for the particle in the fast diffusive state. None of the tracks were falsely classified as undergoing TDM with our analysis (see Table A.3). Therefore, snake-like trajectories (like those seen in the LFA-1 and CD45 data) classified with no drift component using our analysis, could arise from Brownian diffusion within thin channels. 38 3.4. Discussion 0.02 state 1 - fast state 2 - slow y (μm) 0.01 0 -0.01 -0.02 -0.04 -0.02 0 0.02 0.04 0.06 x (μm) Figure 3.11: An example of a single particle track confined within a thin channel segmented into two-state with HMM analysis. The particle is undergoing TRM within a thin channel (0.005µm) which was approximately the expected step length of the fast diffusive state. The boundaries of the channel were reflective. No transient directed motion was predicted (see Table A.3 for all results). 3.4 Discussion We have presented a HMM for particle tracks along with a method for parameter optimization. Our method can accurately discriminate between trajectories with TRM and trajectories with TDM, while producing good estimates for model parameters, as confirmed by simulated data. Analysis of LFA-1 trajectories indicates that a TRM model is appropriate for this protein, confirming the results previously reported [8, 9, 11]. Analogous conclusions were drawn from the analysis of CD45 trajectories in that the TRM model was shown to be more likely than TDM [8]. Trajectories of LFA-1 and CD45 which qualitatively appear to be directed initially motivated the application of our HMM method, but no drift was detected. However, Brownian diffusion within thin channels could also produce trajectories with these snake-like shapes and we demonstrated with simulated data that our method does not falsely predict a non-zero drift component under these conditions. A number of methods have been proposed to identify single molecule tracks that deviate from homogenous Brownian diffusion. Some of these are global methods that classify tracks based on their statistical fit to models of lateral mobility 39 3.4. Discussion such as drift diffusion or confined diffusion [21, 35, 56]. However, these methods do not consider transient changes in mobility that occur within a single trajectory. Some previous methods address these transient changes in lateral mobility arising from confinement [39, 66], binding [11, 38] or drift [3, 5]. The method presented in this chapter does not explicitly consider confinement, but it is the first to combine multiple diffusion models in a statistically robust fashion. The analysis we present in this chapter is local in that the particle displacements are not averaged over multiple steps as they are in MSD calculations. The likelihood calculations consider each step individually. With this type of analysis, effects due to confinement cannot be detected because it takes more than one individual step for the particle to sense the presence of compartment boundaries [41, 57]. Methods such as [66] or [39], where displacements are averaged over a larger number of steps, are needed to detect these types of confinement zones. On the other hand, if a protein exhibits different diffusivity within confinement zones, then the location and size of these zones can be estimated through an inspection of the predicted state sequence from the HMM analysis. Results from this type of analysis could be useful in experiments in which some other membrane components are labelled [2, 18, 74]. For example, one could use the predicted state sequence from the HMM analysis to show that one state is more likely in proximity to a labelled component. In the next chapter, we extend the HMM model in order to detect and quantify these types of spatial domains. Finally, the analysis we present here requires a low number of user input parameters. The only user-selected parameter is the significance level for the p-value used in the hypothesis tests. Similar methods average displacements over a sliding window to detect transient changes [3, 5, 39, 66]. This approach often requires at least two user input parameters, namely the window length and a threshold condition. The values of these parameters can greatly impact the outcome of the analysis. Moreover, these techniques cannot produce accurate segmentations of the beginning and end of particle tracks since there are fewer segments covering these steps. The hidden Markov model is a practical and easily adaptable tool. In regards to molecular diffusion, it has the capacity to be used in many different systems in which lateral mobility of particles changes with time. The method can also be developed further to include more states and different modes of motion. Our model 40 3.4. Discussion should prove useful for analyzing many experimental trajectories and extracting valuable information regarding the spatiotemporal dynamics of receptor mobility. 41 Chapter 4 Extensions to the Hidden Markov Model: Detection of Spatial Domains Cell membranes are known to be heterogeneous structures, featuring distinct domains on the scale of tens to hundreds of nanometers [34]. The lateral segregation helps organize the complex system of interacting proteins and lipids on the cell surface. Recent fluorescence microscopy experiments indicate that many of these domains are characterized by the enrichment or exclusion of membrane constituents or reduced lateral diffusivity of components within them [12, 16, 18, 46, 71] (see Chapter 2). Both of these features increase the number of protein-protein interactions that occur on the cell surface and in doing so, facilitate more effective signaling [33, 43, 44, 74]. Therefore, characterizing and quantifying spatial domains and the lateral mobility of biomolecules within them will provide valuable insight into a multitude of important membrane functions. Using SPT to accomplish this is preferable compared to ensemble averaging methods such as FRAP which tend to smooth out spatial heterogeneity in particle trajectories. In this chapter we extend our HMM model and MCMC statistical analysis to create a novel method for detecting and quantifying membrane spatial domains in single particle trajectories. We can detect spatial domains of two types: • Domains wherein proteins or lipids exhibit a reduced microscopic diffusion coefficient. SPT experiments illustrate that lipid rafts exhibit this property - localized diffusion coefficients of raft-philic proteins are estimated to be reduced by ∼ 2 − 5 times within rafts [16, 46]. • Domains with enriched ligand concentration. There have been multiple ob42 Chapter 4. Extensions to the Hidden Markov Model: Detection of Spatial Domains servations of protein-islands containing enriched concentrations of signaling molecules. For example, microdomains on T cells are enriched in LAT, Lck and CD2 which are necessary in initiating antigen recognition [18]. Also, Gαi2 (a heterotrimeric G protein) and Lyn (a Src family tyrosine kinase) exhibit transient recruitment to clusters of CD59 in order to activate Lyn on the surface of human epithelial cells [71]. For concreteness, any domain with the first property will be referred to as a raft and those of the second type will be called ligand-enriched domains. Detection of the raft-type domains is a direct application of the method presented in Chapter 3. Like before, we seek to describe a protein that is undergoing TRM by switching between fast and slow Brownian diffusion. However in this case, state switching occurs in space and not time - previously, reduced diffusivity was initiated upon ligand binding and now this switch in mobility occurs when the particle enters a spatial raft-type domain. Using simulated data, we show that our method can accurately segment trajectories into fractions that are inside or outside rafts. We can also estimate diffusivity both outside and inside rafts as well as the size of each domain. In order to detect ligand-enriched domains, we add a second layer to the HMM. Recent SPT experiments have shown that lateral mobility of proteins can be highly dependent on their interaction with heterogeneously distributed proteins or lipids. In our model, a non-uniform distribution of ligands causes spatial heterogeneity in the rates of switching between different mobility types - in a domain rich in ligands, the particle will switch to the reduced (bound) diffusive state more often than in a domain which is poor in ligands. Therefore, the aim of the second layer is to identify spatial domains which have either high or low concentrations of these ligands. Tracks are once again modeled as the outcome of a two-state HMM and we estimate parameters using MCMC. In the second layer, the two states represent the location of the particle - whether it is in a ligand-rich region or ligand-poor region. In this chapter, we present the results from an in-depth analysis of simulated data. We have not yet applied these methods to experimental data. 43 4.1. Theory 4.1 Theory Here we introduce an extension of the HMM presented in Chapter 3 to detect raft domains and a novel layered HMM to detect ligand-enriched domains. As before, the models describe a protein that is switching between different states of mobility within an individual trajectory. Previously, these state switches were not dependent on the particle’s position since we assumed they were initiated through binding with a homogeneously distributed ligand. Now we relax this assumption to allow for spatially-dependent state switches. The first type of spatially-dependent switching occurs in the presence of raft type domains wherein the particle undergoes reduced diffusion each time it enters one of these regions. We use the original HMM model and statistical analysis to locate and quantify these domains. However, the transition probabilities (P12 and P21 ) can no longer be interpreted using molecular reaction rates of particle-ligand binding; instead they depend on the particle’s diffusivity and the size, distribution and barrier permeability of the rafts. To account for spatially dependent switching in the presence of ligand-enriched domains, we add a second layer to our original HMM. With the layered model, we can detect areas of rich and poor ligand concentration; these areas create spatial heterogeneity in the transition rates between the slow (bound) and fast (unbound) diffusive states. We optimize the model parameters for both layers sequentially using a modified Markov chain Monte Carlo (MCMC) optimization algorithm (see Section 4.2 for more detail on the statistical analysis). 4.1.1 First layer of the hidden Markov model - identifying raft domains In the first layer, the goal is to classify the mobility of the particle at each point along the trajectory. We summarize the model as follows: kon GGGGGGGB {D1 } F GG {D2 } koff (4.1) The particle switches between the two states by entering and leaving rafts with rates kon and koff . In either state, si = 1, 2, the particle has a diffusion coefficient, Dsi . 44 4.1. Theory A B LAYER 1 LAYER 2 Slow Fast OBSERVED HIDDEN Slow Fast Spatial info. OBSERVED HIDDEN Slow Fast Rich Poor raft domain Figure 4.1: Schematic of the HMMs for spatial domain detection. In an HMM, the states are hidden from the observer (grey background) and must be inferred from the observation (white background) using a statistical method. (A) Detection of raft-like domains. Only one layer of the HMM is needed. (B) Detection of ligandenriched domains (two layers). Left: Layer one (mobility layer). Here the goal is to characterize the mobility at each step of the trajectory. Right: Layer two (spatial layer). The goal in this layer is to segment the trajectories into areas with rich and poor ligand concentrations. Testing for drift in either state (inside or outside the lipid rafts) is not included in the analysis presented here but would be a straightforward extension to this model (see Chapter 3). Additionally, we make the assumption that the particle is imaged instantaneously and state transitions occur only at the acquisition time, implying that the particle is entirely in one state during successive image frames. This assumption is valid for transition rates much larger than the frame rate, i.e. when rafts are not too small or too close together. With this assumption, we arrive at our previously derived two-by-two transition probability matrix (Eq. 3.6). As before, the state sequence of the particle (whether it is inside or outside 45 4.1. Theory a domain) during an SPT observation is regarded as a two-state hidden Markov process. A displacement (xi , yi ) in the particle track is the outcome of Brownian diffusion with parameters corresponding to the particle’s position at that interval in time. So again, the trajectory O1 (a one is added to indicate the first layer) is the observed outcome of our two-state HMM and the particle’s state sequence Q1 is hidden from the observer. See Fig. 4.1A for a schematic representation of this model. We use the same likelihood-based approach as before to estimate model parameters and infer the underlying state sequence. Regardless of whether we are investigating particle diffusion within lipid raft domains or a receptor-ligand reaction causing TRM (as in the previous chapter), we can use the same statistical analysis. 4.1.2 Second layer of the hidden Markov model - identifying ligand-enriched domains The layered HMM models the trajectory of a protein that is switching between different states of lateral mobility where the concentration of ligand is not homogeneous in space. We assume that there are two types of spatial domains - those with rich ligand concentration and those with poor ligand concentration - although an extension to more domain types is possible. In the first layer, we estimate diffusion coefficients and the state sequence Qˆ1 using the method presented in Chapter 3. The goal in the second layer is to determine where these rich and poor regions are located along the trajectory. In this layer, the particle’s state sequence Q2 = t1 , t2 , ..., tN (ti = r, p) specifies whether the particle is in a region of the membrane rich or poor in ligand and is hidden from the observer. Fig. 4.1B is a schematic representation of the layered HMM. In the work presented here, the particle is assumed to be undergoing TRM, however the model can easily be extended to include drift. Since kon is proportional to the equilibrium concentration of ligand, P12r ≈ ([Lr ]/[Lp ]) P12p and P21r ≈ P21p if the time between frames is small. P12r and P12p are the probabilities of transitioning from the unbound fast diffusive state to the bound state in a rich and poor region respectively, while [Lr ] and [Lp ] are the equilibrium ligand concentrations in the rich and poor regions. Therefore, when the particle is in a rich region, it is more likely to be in the bound state (assumed 46 4.1. Theory to be s2 ) in layer one of the model since P12r > P12p . In fact, diffusion within lipid rafts can be thought of as a limiting case of this scenario where P12r = 1 and P12p = 0 leading to the particle always being in the bound state within the domains. Additionally, the estimate for P12 obtained after applying the first layer of the algorithm to a trajectory where a particle diffusing in these spatial domains is a weighted average of the transition rates in each region: Pˆ12 = fr P12r +(1−fr )P12p . fr is the fraction of the membrane covered by regions that are rich in ligand. As in the first layer, state transitions are modeled as a unimolecular reaction: k+ GGGGGGB {[Lp ], P12p } F GG {[Lr ], P12r } k− (4.2) The particle diffuses between rich and poor regions at rates, k+ and k− , that depend on the size of the domain, the diffusion coefficients and the permeability of the barriers between domains. These rates can be translated to state transition probabilities in the Markov model in the same way as in Eq. 3.6 to get Prp and Ppr . This assumption is valid for transition rates are much greater than the frame rate and when spatial domains are distributed uniformly. Just as in layer one, the state sequence describing the ligand concentration at each step is regarded as a two-state hidden Markov process. The observed outcome of the HMM at each step i of the trajectory is the diffusive state at position i (sˆi estimated in layer 1), as well as a weighted average of the diffusive states of the particle positions surrounding each step. Without the spatial information, the algorithm would erroneously choose the optimal state sequence such that all fast diffusive states occur in poor regions and all slow diffusive states occur in fast regions (like in raft domains). The weighted average of diffusive states near xi is calculated as Wi Wi = 1 ni d(xi , xj )−α (sˆj − 1) + 1 (4.3) d(xi ,xj )<A where d(xi , xj ) is the distance between particle positions xi and xj and ni is the number of these positions within a user-defined distance, A, from the point of interest, xi . Both the decay exponent, α, and the support size, A, are user input 47 4.2. Materials and methods parameters. With simulated data, results depend weakly on A with lower support sizes being optimal. We found that the best choice for α is one but this parameter does not strongly influence results if it is between one and two (see Fig. 4.2). 0.40 2.5 Decay exponent 0.35 2.0 0.30 1.5 0.25 1.0 0.20 0.15 0.5 0.10 0.25 0.125 0.25 0.5 1.0 2.0 Support (fraction of domain size) Figure 4.2: The effects of the user-defined support size and the decay exponent on error in domain identification. The mean errors in the estimated state sequence for layer 2 Qˆ2 are plotted for varying values of A and α. Ten tracks were simulated for each set of parameters. Other parameters: D1 = 0.2µm2 s−1 , D2 = 0.01µm2 s−1 , [Lr ]/[Lp ] = 20, Pjump = 0.05, domain area= 1µm2 . 4.2 4.2.1 Materials and methods Parameter optimization Likelihood calculation - lipid rafts For a single particle track, we observe a sequence of individual displacements O1 which are taken to be the outcome of a two-state hidden Markov model with the following set of parameters: θ = {D1 , D2 , P12 , P21 }. The log-likelihood of a 48 4.2. Materials and methods LAYER 1 (MOBILITY) initialize parameters θ0={D1, D2, P12, P21, |v1|=0, |v2|=0} and calculate likelihood for θ0 perturb parameters θA only perturb velocities if state is classified as directed calculate likelihood for θA use appropriate model (TRM or TDM) and θA according to Metropolis Hastings if θA is accepted: calculate most likely state sequence and test for directed motion repeat ~105 times parameter estimates Histogram of accepted parameters after burn-in is the estimate for P(θ1|O1). LAYER 2 (SPATIAL) calculate observation (O2 ) from layer 1 calculate most likely state sequence and the weighted average of surrounding states MCMC to estimate parameters θA={P12R, P12p, QRP, QPR}, calculate likelihood and accept according to M.H. parameter estimates Histogram of accepted parameters after burn-in is the estimate for P(θ2|O2). most likely state sequence calcualte most likely state sequence to identify rich and poor spatial regions Figure 4.3: Flowchart showing the parameter estimation algorithm for both layers of the HMM. diffusion coefficient Dsi for an individual displacement (xj , yj ) is x2j + yj2 − log(Dsi τ ) i (xj , yj ) ∝ − 4Dsi τ (4.4) The log-likelihood function is only defined to an arbitrary additive constant so we only retain terms with explicit dependence on model parameters. Again, velocity is neglected here but if drift were to be considered, the likelihood equation would be the same as Eq. 3.9. Likelihood calculation - layer 2 In the second layer of the HMM, we observe a state sequence from layer 1 (Qˆ1 ) as well as the weighted average of states surrounding each point (see Eq. 4.3). This observation is taken to be the outcome of a two-state HMM with the following set of parameters: θ = (P12p , P12r , P21r , P21p , Prp , Ppr ), where [Lr ]/[Lp ] ≈ P12r /P12p and P21r = P21p ≈ Pˆ21 . The likelihood of a transition rate P12t for a transition i from sj to sj+1 in layer 1 is P (particle is in state sj for layer 1 given the layer 2 49 4.2. Materials and methods state ti = r, p)× P (particle transitions from state sj to sj+1 in layer 1 given the layer 2 state ti ) × P (Wi is observed given that the particle is in layer 2 state ti ). From Eqn. 4.3, there are ni positions that we consider surrounding the particle and (Wi − 1)ni of them are estimated to be in state 2 of layer 1. Therefore, (Wi − 1)ni follows a binomial distribution B(n, p) with parameters n = ni and p = P12ti /(Pˆ21 + P12ti ) (the probability that the particle lies in state 2 (of layer 1) given that it is in state ti of layer 2). We can therefore write the full expression for the likelihood function as follows: ˆ21 Psj sj+1 ti f (Wi − 1)ni ; ni , Pˆ P+P 21 12ti Li (sj−1 , sj ) = P12ti Psj sj+1 ti f (Wi − 1)ni ; ni , ˆ P21 +P12ti P12ti Pˆ21 +P12ti if sj = 1 P12ti Pˆ21 +P12ti if sj = 2 ti = r, p , (4.5) where f (k; n, p) is the probability mass function of the binomial distribution B(n,p). For an entire track O with state sequence Q (for either layer), the likelihood for a given parameter set θ is given by Eq. 3.10. We use the forward algorithm to compute this likelihood value in each layer of our HMM (see Program B.2) [49]. Track segmentation For a particular θ and a set of displacements from a single particle track O, we ˆ = sˆ1 , sˆ2 ...sˆN for the calculate the most likely state sequence in either layer, Q Markov chain using the backward algorithm (see Program B.3)[49]. Parameter estimation Given our observation for the HMM, the objective is to estimate the posterior probability (Eq. 3.7). We use the same MH MCMC algorithm as in the previous Chapter to estimate parameters for both applications (see Program B.5). In the layered model, MCMC has to be performed twice - once for each layer. Fig 4.3 summarizes the steps of each layer in the MCMC algorithm. Again, the size of displacements along parameter axes were adaptively tuned to achieve the conventional acceptance rate of approximately 40% after burn-in [25] and we report the mode of the result50 4.2. Materials and methods ing distribution as our parameter estimate. 4.2.2 Generation of simulated single particle tracks Simulated trajectories were generated to test the accuracy of our method. We simulated two-state trajectories consisting of N displacements, (x1 , y1 ), ..., (xN , yN ), with each occurring over a constant time interval τ . We first generate a 2-state Markov chain with a transition matrix as described in Eq. 3.6 to generate the state sequence describing the particle’s mobility, Q1 . The particle displacements along the x and y axes were drawn randomly from a normal distribution with variance 2Dsi τ and mean 0. Program B.1 describes this process in detail. Boundaries of lipid raft or ligand-enriched domains were first defined on the membrane. For partially impermeable boundaries, a probability of jumping over them was defined (Pjump ). If the particle failed to jump over the boundary, it was reflected. In the presence of ligand-enriched domains, every time the particle entered a new spatial domain, a new state sequence describing its mobility was generated according to P12r or P12p and P21 . 4.2.3 Estimating areas of spatial domains We estimated the areas of the spatial domains using the final segmented trajectory (an example of this process is shown in Fig. 4.4). We defined a grid on the x and y axes which divided the plane into bins. If the particle had entered a bin during its trajectory then that bin was classified as being in either a rich or poor domain (or inside or outside a lipid raft) based on the particle’s determined state inside the bin. This gave us an image of the x-y plane with labelled areas indicating where the particle was inside a domain (Fig. 4.4A-B). The grid size does affect the results, so the user must use discretion in this step. The best grid size was usually approximately √ one fourth of the expected step size for the particle ( 4Dτ /4). We then identified and labeled domains with MATLABs bwlabel function and filled in any holes in the binned domains using the imclose function (Fig. 4.4B-C). Finally, areas near the edge or those which seemed to be small artifacts are removed (Fig. 4.4C-D). Assuming domains were symmetric and square, we measured the longest distance between two points in each domain and took that to be an estimate of the diagonal 51 4.3. Results length of the square (Fig. 4.4D). Assuming the domains were circular gave similar results. This method does lack some stringency due to the user-defined grid size. However, a more rigorous method would be difficult to derive since the particle does not fully explore all domains in most trajectories. In general, once the domain image is acquired, one can potentially quantify the size or shape of the domains in a way deemed appropriate for the experiment at hand. For example, in experiments (such as [2, 18, 74]) where other membrane components are labeled in addition to the single particles, one could look for a correlation between the single particle domains and domains identified in the fluorescent image. 4.3 Results We used simulated trajectories to test both applications of our hidden Markov analysis. In the first layer, used here to identify lipid rafts, we demonstrated in Chapter 3 that our statistical analysis is accurate under many different simulated conditions (track lengths, experimental noise, diffusivity, transition rates etc.). Therefore, to avoid repetition, we only present results deemed biologically relevant for trajectories in the presence of raft-type domains. The ratio of free protein diffusivity to that of proteins within lipid rafts (Dnon-raft /Draft ) has been observed experimentally to be ∼ 2 − 5 [16, 46]. We therefore tested Dfree /Draft ratios of 2 through 10 in our simulations. In addition, we set the radius of the raft in the simulations domains to be on the order of 10 nm in order to match recent experimental measurements [13, 20, 36]. An example of a typical simulated track within circular lipid rafts is shown in Fig. 4.5A. Part B of the figure shows our estimated state segmentation into parts of the trajectory deemed inside (red) and outside (blue) a raft. In this case, the estimated state sequence and raft size contain 3% relative error. Results from our statistical analysis of trajectories with varying Dnon-raft /Draft are displayed in Table A.4. State sequences have less than 3% error and raft sizes are estimated to within 25% of the correct values. We also used simulated trajectories to test the second layer of our hidden Markov analysis. As in the first layer, tracks were tested in a variety of conditions. An example of our spatial multi-layered HMM analysis is shown in Fig. 4.6. Here, the membrane is divided up into square domains, half being ligand rich and 52 4.3. Results A B C D measure remove fill in 0.2 μm 0.2 μm 0.2 μm 0.2 μm Figure 4.4: Example of the method used for estimating the size of spatial domains with one simulated particle track consisting of 4,000 frames within rafts (same trajectory as in Fig. 4.5). The Trajectory contains the following parameters: area of spatial domains = 0.0314µm2 , Pjump = 0.02, D1 = 3−1 µm2 s−1 and D2 = 1−1 µm2 s−1 . (A) The simulated trajectory with lipid raft like spatial domains (circles) that is labeled red (inside rafts) and blue (outside rafts) based on the MCMC parameter estimation and estimated state sequence sˆ. We first define a grid of size 0.005µm on the x and y axes which divided the plane into bins and classified each bin as being inside or outside a raft (based on sˆ) (B). Each object is labeled using MATLAB’s bwlabel and filled in using imclose (B-C). All small domains that appear to be small artifacts are then removed from the image (C-D). Finally, we measure the longest distance between two points in each of the three remaining domains, which provides us with three estimates of the radii. In this example, the estimated domain size has 3% error. half being ligand poor. We do not assume anything about the spatial structure in our analysis. First, we use the layer 1 HMM analysis to segment the trajectory into the most likely sequence of fast and slow diffusive states (Fig. 4.6B). Given this segmentation, we use MCMC to estimate the posterior distributions for parameters in the second layer of our model as described in the Materials and Methods. The final estimated sequence of rich and poor regions is shown in Fig. 4.6C. In this case, our estimate is 95% accurate and qualitatively matches the shape of the membrane domains. In addition, the area of ligand-enriched domain is estimated at 10% off 53 4.3. Results A 0.5 μm Dinside = 0.1 μm2s-1 Doutside = 0.3 μm2s-1 Domain size: 0.031 μm2 0.5 μm Estimated Dinside = 0.081 μm2s-1 Estimated Doutside = 0.297 μm2s-1 Estimated Domain size: 0.032 μm2 B Figure 4.5: Typical results from the MCMC parameter optimization and track segmentation with one simulated particle track consisting of 4,000 frames with lipid rafts. The trajectory contains the following parameters: area of spatial domains = 0.0314µm2 , Pjump = 0.02, D1 = 0.3µm2 s−1 and D2 = 0.1µm2 s−1 . (A) The simulated trajectory with lipid raft like spatial domains (grey). (B) Layer 1 (mobility) segmentation into fast (unbound) and slow (bound) diffusive states. The estimated state sequence (shown in red and blue) and the estimated domain size each have 3% error. of the true value. Fig. 4.7 reports the relative error in the estimated spatial state sequence and domain size for varying domain sizes, ratio of ligand concentrations in rich and poor regions and barrier permeability. Errors in the estimated state sequence and domain 54 4.3. Results A 3 Rich Poor y(µm) 2 1 0 −1 Domain size: 1.0 µm2 −1 B 1 2 x(µm) 3 Accuracy: 99.8% C Fast Slow 3 Accuracy: 95% Rich Poor 2 y(µm) 2 y(µm) 0 1 1 0 0 −1 −1 Estimated domain size: 1.1 µm2 −1 0 1 x(µm) 2 −1 0 1 2 x(µm) Figure 4.6: Typical results of two-layered MCMC parameter optimization and track segmentation with one simulated particle track consisting of 50,000 frames and the following parameters: [Lr ]/[Lp ] = 20, area of spatial domains = 1µm2 , Pjump = 0.05, D1 = 0.2µm2 s−1 , D2 = 10−2 µm2 s−1 , P12r = 0.1 and P21p = 0.005. (A) The simulated trajectory with rich (grey) and poor (white) spatial domains. (B) Layer 1 (mobility) segmentation into fast (unbound) and slow (bound) diffusive states. The estimated state sequence is 99.8% accurate. (C) Layer 2 (spatial) segmentation into rich and poor ligand regions. The estimated state sequence is 95% accurate and the estimate of domain size has 10% error. 55 4.4. Discussion areas are usually near 10 − 15% and 30 − 50% respectively. Error increases as the ligand concentration in the rich and poor regions become similar (Fig. 4.7A-B). The error in the estimated state sequence hass a weak dependence on the area of spatial domains and barrier permeability, however, the area estimates are more accurate for smaller domains and for domains with less permeable barriers (Fig. 4.7 C-F). The explanation is intuitive: with smaller domains, the particle visits more domains providing more data points from which to make an estimation and with stronger barriers, particles cover more area inside each domain. Results from this analysis are reported in more detail in Table A.5. We also plot the same error values for trajectories where we vary both the domain size and ratio of ligand concentrations simultaneously (Fig. 4.8). Again, these plots show that the reliability of our analysis is weakly dependent on domain size but it increases significantly as the difference between ligand concentrations increases. Other domain shapes were tested in addition to squares with similar results. An example of our analysis on one of these trajectories is shown in Fig. 4.9. Different track lengths were also tested and relative errors are reported in Fig. 4.10. Relatively long track lengths are needed for accurate spatial domain detection. If tracks are too short the particle does not make enough transitions between states of mobility to accurately distinguish between rich and poor domains. One way to get around this with experimental data would be to use all of the trajectories on each cell simultaneously in the spatial analysis. 4.4 Discussion We have presented two extensions to our HMM to create a layered model that characterizes raft-like and ligand-enriched domains. With the first layer, we can accurately segment trajectories into slow and fast diffusive modes thereby identifying the locations and sizes of lipid raft domains. As confirmed using simulated data, our method is accurate within parameter regimes that are consistent with recent experiments. In the second layer, the algorithm accurately detects spatial domains wherein the particle is switching between its two states of mobility at different rates. These domains correspond to areas on the membrane that are rich (or poor) in associating ligands or motor proteins. We demonstrated that the method’s 56 4.4. Discussion accuracy is robust to changes in domain shape, size and barrier permeability in simulated trajectories. The plasma membrane has a heterogeneous structure and consequently, its components exhibit spatial heterogeneity in their mobility [34, 44]. Single particle tracking has been valuable technique for identifying spatial domains in the past because spatial heterogeneity is not averaged out over many trajectories as it is in ensemble methods like FRAP and FCS. One notable example is the method Jacobson’s group has developed to detect transient confinement zones resembling lipid rafts [66]. However, by assuming a constant rate of diffusion throughout the trajectory, they neglect an important feature of lipid rafts. Our method is novel in its ability to detect spatial domains through changes in a particle’s local diffusivity. Our method is also the first to infer regions of high ligand concentration through analysis of single particle tracks. Some recent exciting SPT experiments have included a secondary fluorescent labeling of a protein assumed to be interacting with the labelled particle in some way [2, 18, 74]. In these experiments, investigators attempt to link the particle’s diffusivity to its proximity with the secondary protein. Our layered HMM analysis on trajectories such as these could be valuable in testing hypotheses regarding if and how these membrane components are interacting. In summary, our method should prove useful in characterizing and quantifying a variety of spatial domains as well as the lateral mobility of their components. This is a valuable endeavor given that these domains are poorly understood despite their ubiquity and importance in many plasma membrane functions. 57 A layer 1 (mobility) layer 2 (spatial) relative error 0.25 0.2 0.15 0.1 0.05 02 0.8 10-1 100 0.6 0.4 0.2 02 D1 relative error 0.35 0.3 0.25 0.2 0.15 0.1 0.05 0 -2 10 1 10 14 18 2226 [LR]/[LP] C relative error 6 B relative error 4.4. Discussion 101 0.6 0.4 0.2 0 10-2 0.1 0.05 0 10-2 F 0.8 0.7 relative error relative error E 10-1 Pjump 100 10 14 18 2226 0.8 2 Area (µm ) 6 [LR]/[LP] 0.6 0.5 0.4 0.3 0.2 0.1 0 -2 10 10-1 100 101 Area (µm2) 10-1 10-2 Pjump Figure 4.7: Errors in the estimated state sequences (A,C,E) and domain area (B,D,F) with 90% confidence intervals obtained by analyzing simulated tracks with TRM in rich and poor ligand regions. Each pair of figures contains results from 20 independent simulations, each with a track length of 5 × 104 . In (A,C,E), red represents the error in the estimated layer 1 sequence and blue is the error for layer 2. For all simulations unless indicated, [Lr ]/[Lp ] = 20, the area of spatial domains is 1µm2 , Pjump = 0.05, D1 = 0.2µm2 s−1 and D2 = 10−2 µm2 s−1 . (A,B) Error in spatial domain detection increases as [Lr ] approaches [Lp ]. (C,D) The domain size has little effect on estimated state sequence, however accuracy of domain size estimates increases as domains become smaller. With smaller regions, the particle visits more domains so there are more data points from which to make an estimation. (E,F) Barrier permeability has little effect on the error of the predicted state sequence. Area estimates become more accurate with stronger barriers because particles cover more area inside each domain. 58 4.4. Discussion B 0.40 0.35 0.30 0.25 0.20 0.15 0.10 0.05 0.00 2.72 4.06 6.05 9.03 [LR]/[LP] 13.46 20.09 1.0 0.02 0.05 0.14 0.37 1.00 2.72 7.39 0.45 Area (µm2) Area (µm2) 0.50 0.02 0.05 0.14 0.37 1.00 2.72 7.39 A 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 2.72 4.06 6.05 9.03 13.46 20.09 0.0 [LR]/[LP] Figure 4.8: Errors in the estimated layer 2 state sequence and domain size obtained by analyzing simulated tracks with TRM in rich and poor ligand regions. For each plot, the mean error is plotted for 20 independent simulations. For all simulations, Pjump = 0.05, D1 = 0.2µm2 s−1 and D2 = 10−2 µm2 s−1 . (A) Error in the estimated state sequence is largely independent of domain size but it increases as [Lr ] approaches [Lp ]. (B) As with the state sequence estimate, error in the estimated domain size increases as [Lr ] approaches [Lp ]. We also obtain more accurate area estimates for smaller domains because the particle is then able to visit more regions during a single trajectory. 59 4.4. Discussion A3 Rich Poor 2 y(µm) 1 0 −1 −2 −3 −4 −4 −3 −2 −1 0 1 2 3 x(µm) B3 Fast Slow 2 C3 Accuracy: 99.7% 1 y(µm) 1 y(µm) Accuracy: 89.5% Rich Poor 2 0 0 −1 −1 −2 −2 −3 −3 −4 −4 −4 −3 −2 −1 x(µm) 0 1 2 3 −4 −3 −2 −1 0 1 2 3 x(µm) Figure 4.9: Typical results of two-layered MCMC parameter optimization and track segmentation with one simulated particle track consisting of 50,000 frames and the following parameters: [Lr ]/[Lp ] = 20, area of spatial domains ≈ 1µm2 , Pjump = 0.05, D1 = 0.2µm2 s−1 , D2 = 10−2 µm2 s−1 , P12r = 0.1 and P21p = 0.005. (A) The simulated trajectory with rich (grey) and poor (white) spatial domains. (B) Layer 1 (mobility) segmentation into fast (unbound) and slow (bound) diffusive states. The estimated state sequence is 99.7% accurate. (C) Layer 2 (spatial) segmentation into rich and poor ligand regions. The estimated state sequence is 89.5% accurate. 60 4.4. Discussion B 0.6 D1 D2 P12 P21 v1 0.4 0.2 00 1.0 layer 2 (spatial) layer 1 (mobility) 0.8 relative error relative error A 0.6 0.4 0.2 0 1000 2000 3000 Track length 4000 103 104 Track length 105 Figure 4.10: Track length versus error in parameter estimates for simulated trajectories within ligand enriched domains. Error and the 90% confidence interval for the estimated state sequences for layer 1 (red) and layer 2 (blue). 20 independent particle trajectories were analyzed for each track length. In all trajectories, particles were undergoing TRM with the following parameters: [Lr ]/[Lp ] = 20, area of spatial domains = 1µm2 , Pjump = 0.05, D1 = 2−1 µm2 s−1 , D2 = 10−2 µm2 s−1 , P12r = 0.1 and P21p = 0.005. 61 Chapter 5 Conclusions Here we have presented hidden Markov models of single particle trajectories along with a likelihood-based statistical analysis that can accurately decipher multi-state mobility. Heterogeneous movement is undetectable using standard MSD analysis for SPT experiments (see Fig. 2.5). In Chapter 3 we demonstrate our method’s ability to distinguish between transient reduced mobility and transient directed movement and estimate model parameters. Because we attribute state switching to ligand binding and unbinding, estimates of transition rates give a measure of on- and off-rates for the reaction. These estimates can provide important insight into the role of these associations under different cellular conditions [8, 11]. Importantly, the parameter estimates for this model can be used to identify the most probable state at each frame. This estimated state sequence becomes crucial in the extended model for detecting spatial domains presented in Chapter 4. Using a likelihood-based approach leads to natural estimates of the uncertainties in the estimated parameters arising from both experimental noise and finite sampling. These error estimates are valuable in experimental work, for example, in interpreting the differences in parameter values obtained in different experimental conditions. Also, unlike window-based methods for detecting transient behavior, we require few user-defined parameters and minimal analyst intervention which is advantageous given the quantity of data being analyzed. Furthermore, the setup of the HMM allows for straightforward adaptations such as including additional states or types of mobility. The adaptability of our model is demonstrated in Chapter 4. We modify the HMM developed in Chapter 3 in order to describe trajectories containing spatiallydependent state switches. Our statistical analysis confidently predicts where transitions in mobility occur and this information is then used to infer the size and location of certain spatial domains. The two types of spatial domains we can iden62 5.1. Future work tify here were previously undetectable through other approaches. Also, our method is unique in its ability to extract information about them through changes in the local or microscopic diffusion coefficient Dmicro . The first type of domain we can detect is one in which a particle experiences reduced diffusivity. This feature is common to lipid rafts. The second type of domain is a region enriched in ligand. These two domain types are ubiquitous on the cell surface and crucial in many cell functions such as signal transduction [31, 36, 44]. 5.1 Future work There are several ways in which the two-state hidden Markov model could be extended in order to analyze different types of particle trajectories. • Additional states could be added to the model. Modes of motion in particle trajectories have been classified into as many as four groups previously. For instance, E-Cadhedrin was observed to undergo free diffusion, restricted diffusion and directed motion on the surface of mouse keratinocyte cells [35] and classification of Fc RI trajectories included an immobile mode in addition to the former three groupings [2]. Therefore, adding another state to the model could be a worthwhile extension in some cases. Although this would be a simple extension to implement due to the adaptable nature of the HMM, care should be taken in interpreting the results. In this extension, significantly more parameters are added to the model - the number of elements in the transition probability matrix is the square of the number of states. Therefore, a large quantity of SPT data containing many state transitions would be needed to accurately estimate all of the parameters. Also, one would need to use a model selection tool such as the Akaike Information Criterion to see whether the improvement in fit can compensate for the additional parameters. However, given that the HMM is a likelihood-based method, methods like these can be applied in a statistically rigorous manner. • Additional modes of motion could be added to the model. As shown in Chapter 3 with the addition of directed motion, it is possible to discriminate between different types of motion using the HMM analysis. So far, we 63 5.1. Future work have not considered confined or restricted diffusion in our analysis. This particular mode of motion has been observed in many single particle tracks so it would be a useful extension to consider[2, 24, 63]. However, adding this mode of diffusivity would require some substantial modifications to the model. Here, we do a local analysis of mobility meaning that we consider each frame individually in the likelihood calculation. Confinement cannot be detected using only one frame because it takes more than one individual step for the particle to sense the presence of compartment boundaries. One potential way to overcome this difficulty would be to create another layer of the HMM for detecting confinement, much like the one created to detect ligand-enriched domains in Chapter 4. In this way, one could analyze trajectories over different timescales with the detection of TRM or TDM using local measurements and transient confinement using longer timeframes. This multi-scale analysis would also lead to an improvement over previous methods for detecting transient confinement [39, 66]. • There are other types of data, in addition to single particle tracks on cellular membranes, which show multi-state movement. Previously, hidden Markov models have been applied to analyze actomyosin and kinesin-microtubule movement data [68, 78], and DNA looping dynamics [4]. The HMM analysis could also be applied to larger scale trajectories in ecological studies. For example, bluefin tuna in the Atlantic Ocean alternate between small-scale random paths associated with the presence of resource patches and largescale directed movements between these patches [26]. Multi-phase mobility has been observed in tracking data for many other animals as well including coyotes, elk and albatrosses [6, 23, 40]. A better understanding of this type of animal movement can provide important insight into landscape features, the spatial patterns and dynamics of animal territories and foraging strategies. Throughout this thesis, we have stressed the fact that spatial domains play a large part in co-ordinating the lateral movement of molecules on the cell’s surface. By doing so, these domains are key to many cell processes that occur at the membrane. Despite their importance however, many of their properties are still 64 5.1. Future work poorly understood. We have also demonstrated here that single particle tracking experiments have the potential to unveil valuable missing information about membrane heterogeneity. In particular, information about membrane organization can be inferred from trajectories by studying the resulting switches in particle mobility. As demonstrated in this thesis, the hidden Markov model is a powerful and easily adaptable tool that can rigorously quantify these transitions. As such, this class of models should prove useful for analyzing many experimental trajectories, extracting valuable information regarding the spatiotemporal dynamics of receptor mobility. 65 Bibliography [1] C.M. Anderson, G.N. Georgiou, I.E. Morrison, G.V. Stevenson, and R.J. Cherry. Tracking of cell surface receptors by fluorescence digital imaging microscopy using a charge-coupled device camera. Low-density lipoprotein and influenza virus receptor mobility at 4 degrees Celcius. J. Cell Sci., 101(2):415–425, 1992. [2] N.L. Andrews, K.A. Lidke, J.R. Pfeiffer, A.R. Burns, B.S. Wilson, J.M. Oliver, and D.S. Lidke. Actin restricts Fc -RI diffusion and facilitates antigen-induced receptor immobilization. Nat. Cell Biol., 10(8):955–963, 2008. [3] D. Arcizet, B. Meier, E. Sackmann, J.O. Radler, and D. Heinrich. Temporal analysis of active and passive transport in living cells. Phys. Rev. Lett., 101(24):248103, 2008. [4] J.F. Beausang and P.C. Nelson. Diffusive hidden Markov model characterization of DNA looping dynamics in tethered particle experiments. Phys. Biol., 4:205–219, 2007. [5] C. Bouzigues and M. Dahan. Transient directed motions of GABAA receptors in growth cones detected by a speed correlation index. Biophys. J., 92(2):654–660, 2007. [6] N. Brothers, R. Gales, A. Hedd, and G. Robertson. Foraging movements of the Shy Albatross Diomedea cauta breeding in Australia; implications for interactions with longline fisheries. Ibis, 140(3):446–457, 1998. [7] D.A. Brown and E. London. Structure and origin of ordered lipid domains in biological membranes. J. Membr. Biol., 164(2):103–114, 1998. 66 Bibliography [8] C.W. Cairo, R. Das, A. Albohy, Q.J. Baca, D. Pradhan, J.S. Morrow, D. Coombs, and D.E. Golan. Dynamic regulation of CD45 lateral mobility by the spectrin-ankyrin cytoskeleton of T cells. J. Biol. Chem., 285(15):11392— 11401, 2010. [9] C.W. Cairo, R. Mirchev, and D.E. Golan. Cytoskeletal regulation couples LFA-1 conformational changes to receptor lateral mobility and clustering. Immunity, 25(2):297–308, 2006. [10] R.J. Cherry, K.M. Wilson, K. Triantafilou, P. O’Toole, I.E.G. Morrison, P.R. Smith, and N. Fernandez. Detection of dimers of dimers of human leukocyte antigen HLA-DR on the surface of living cells by single-particle fluorescence imaging. J. Cell Biol., 140(1):71–79, 1998. [11] R. Das, C.W. Cairo, and D. Coombs. A hidden Markov model for single particle tracks quantifies dynamic interactions between LFA-1 and the actin cytoskeleton. PLoS Comput. Biol., 5(11):e1000556, 2009. [12] F. Daumas, N. Destainville, C. Millot, A. Lopez, D. Dean, and L. Salom´e. Confined diffusion without fences of a G-protein-coupled receptor as revealed by single particle tracking. Biophys. J., 84(1):356–366, 2003. [13] C.A. Day and A.K. Kenworthy. Tracking microdomain dynamics in cell membranes. Biochim. Biophys. Acta. Biomembr., 1788(1):245–253, 2009. [14] N. Destainville. Cluster phases of membrane proteins. Phys. Rev. E, 77(1):11905, 2008. [15] N. Destainville, F. Dumas, and L. Salom´e. What do diffusion measurements tell us about membrane compartmentalisation. Emergence of the role of interprotein interactions. J. Chem. Biol., 1(1):37–48, 2008. [16] C. Dietrich, B. Yang, T. Fujiwara, A. Kusumi, and K. Jacobson. Relationship of lipid rafts to transient confinement zones detected by single particle tracking. Biophys. J., 82(1):274–284, 2002. [17] C. Dillon and Y. Goda. The actin cytoskeleton: integrating form and function at the synapse. Annu. Rev. Neurosci., 28:25–55, 2005. 67 Bibliography [18] A.D. Douglass and R.D. Vale. Single-molecule microscopy reveals plasma microdomains created by protein-protein networks that exclude or trap signaling molecules in T cells. Cell, 121:937–950, 2005. [19] M.L. Dustin and J.A. Cooper. The immunological synapse and the actin cytoskeleton: molecular hardware for T cell signaling. Nat. Immunol., 1(1):23– 29, 2000. [20] C. Eggeling, C. Ringemann, R. Medda, G. Schwarzmann, K. Sandhoff, S. Polyakova, V.N. Belov, B. Hein, C. Von Middendorff, A. Sch¨onle, and S.W. Hell. Direct observation of the nanoscale dynamics of membrane lipids in a living cell. Nature, 457(7233):1159–1162, 2008. [21] T.J. Feder, I. Brust-Mascher, J.P. Slattery, B. Baird, and W.W. Webb. Constrained diffusion or immobile fraction on cell surfaces: a new interpretation. Biophys. J., 70(6):2767–2773, 1996. [22] R. Fisher. Statistical methods and scientific induction. R. Stat. Soc. Ser. B Stat. Methodol., 17(1):69–78, 1955. [23] J.D. Forester, A.R. Ives, M.G. Turner, D.P. Anderson, D. Fortin, H.L. Beyer, D.W. Smith, and M.S. Boyce. State-space models link elk movement patterns to landscape characteristics in Yellowstone National Park. Ecol. Monograph, 77(2):285–299, 2008. [24] T. Fujiwara, K. Ritchie, H. Murakoshi, K. Jacobson, and A. Kusumi. Phospholipids undergo hop diffusion in compartmentalized cell membrane. J. Cell Biol., 157(6):1071–1081, 2002. [25] A. Gelman, G. Roberts, and W. Gilks. Efficient metropolis jumping rules. Bayesian Statistics, 5:599–608, 1996. [26] R. Gutenkunst, N. Newlands, M. Lutcavage, and L. Edelstein-Keshet. Inferring resource distributions from Atlantic bluefin tuna movements: an analysis based on net displacement and length of track. J. Theor. Biol., 245(2):243– 257, 2007. 68 Bibliography [27] O. Hegener, L. Prenner, F. Runkel, S.L. Baader, J. Kappler, and H. Haberlein. Dynamics of β2 -Adrenergic receptor-ligand complexes on living cells. Biochemistry, 43(20):6190–6199, 2004. [28] M.L. Hermiston, Z. Xu, and A. Weiss. CD45: a critical regulator of signaling thresholds in immune cells. Annu. Rev. Immunol., 21(1):107–137, 2003. [29] R. Iino, I. Koyama, and A. Kusumi. Single molecule imaging of green fluorescent proteins in living cells: E-cadherin forms oligomers on the free cell surface. Biophys. J., 80(6):2667–2677, 2001. [30] H. Ike, A. Kosugi, A. Kato, R. Iino, H. Hirano, T. Fujiwara, K. Ritchie, and A. Kusumi. Mechanism of Lck recruitment to the T-cell receptor cluster as studied by single-molecule-fluorescence video imaging. Chemphyschem, 4(6):620–626, 2003. [31] K. Jacobson, O.G. Mouritsen, and R.G.W. Anderson. Lipid rafts: at a crossroad between cell biology and physics. Nat. Cell Biol., 9(1):7–14, 2007. [32] D.F. Kucik, M.L. Dustin, J.M. Miller, and E.J. Brown. Adhesion-activating phorbol ester increases the mobility of leukocyte integrin LFA-1 in cultured lymphocytes. J. Clin. Invest., 97(9):2139–2144, 1996. [33] A. Kusumi, I. Koyama-Honda, and K. Suzuki. Molecular dynamics and interactions for creation of stimulation-induced stabilized rafts from small unstable steady-state rafts. Traffic, 5(4):213–230, 2004. [34] A. Kusumi, C. Nakada, K. Ritchie, K. Murase, K. Suzuki, H. Murakoshi, R.S. Kasai, J. Kondo, and T. Fujiwara. Paradigm shift of the plasma membrane concept from the two-dimensional continuum fluid to the partitioned fluid: high-speed single-molecule tracking of membrane molecules. Annu. Rev. Biophys. Biomol. Struct., 34:351—378, 2005. [35] A. Kusumi, Y. Sako, and M. Yamamoto. Confined lateral diffusion of membrane receptors as studied by single particle tracking (nanovid microscopy). effects of calcium-induced differentiation in cultured epithelial cells. Biophys. J., 65(5):2021–2040, 1993. 69 Bibliography [36] P. Lajoie, J.G. Goetz, J.W. Dennis, and I.R. Nabi. Lattices, rafts, and scaffolds: domain regulation of receptor signaling at the plasma membrane. J. Cell Biol., 185(3):381–385, 2009. [37] D.M. Leitner, F.L.H. Brown, and K.R. Wilson. Regulation of protein mobility in cell membranes: a dynamic corral model. Biophys. J., 78(1):125–135, 2000. [38] S. Matsuoka, T. Shibata, and M. Ueda. Statistical analysis of lateral diffusion and multistate kinetics in single-molecule imaging. Biophys. J., 97(4):1115– 1124, 2009. [39] N. Meilhac, L. Le Guyader, L. Salome, and N. Destainville. Detection of confinement and jumps in single-molecule membrane trajectories. Phys. Rev. E., 73(1):11915, 2006. [40] P. Moorcroft and M.A. Lewis. Mechanistic home range models capture spatial patterns and dynamics of coyote territories in yellowstone. Proc. R. Soc. B, 273:1651–1659, 2006. [41] K. Murase, T. Fujiwara, Y. Umemura, K. Suzuki, R. Iino, H. Yamashita, M. Saito, H. Murakoshi, K. Ritchie, and A. Kusumi. Ultrafine membrane compartments for molecular diffusion as revealed by single molecule techniques. Biophys. J., 86(6):4075–4093, 2004. [42] S. Nelson, R.D. Horvat, J. Malvey, D.A. Roess, B.G. Barisas, and C.M. Clay. Characterization of an intrinsically fluorescent gonadotropin-releasing hormone receptor and effects of ligand binding on receptor lateral diffusion. Endocrinology, 140(2):950–957, 1999. [43] D.V. Nicolau Jr, K. Burrage, R.G. Parton, and J.F. Hancock. Identifying optimal lipid raft characteristics required to promote nanoscale protein-protein interactions on the plasma membrane. Mol. Cell. Biol., 26(1):313–323, 2006. [44] D.M. Owen, D. Williamson, C. Rentero, and K. Gaus. Quantitative microscopy: protein dynamics and membrane organisation. Traffic, 10(8):962– 971, 2009. 70 Bibliography [45] J.G. Powles, M.J.D. Mallett, G. Rickayzen, and W.A.B. Evans. Exact analytic solutions for diffusion impeded by an infinite array of partially permeable barriers. Proc. Roy. Soc. London Ser. A, 436(1897):391–403, 1992. [46] A. Pralle, P. Keller, E.L. Florin, K. Simons, and J.K.H. H¨orber. Sphingolipid– cholesterol rafts diffuse as small entities in the plasma membrane of mammalian cells. J. Cell Biol., 148(5):997–1007, 2000. [47] W.H. Press. Numerical recipes: the art of scientific computing. Cambridge Univ Pr, 2007. [48] H. Qian, M.P. Sheetz, and E.L. Elson. Single particle tracking. analysis of diffusion and flow in two-dimensional systems. Biophys. J., 60(4):910–921, 1991. [49] L.R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, 1989. [50] L. Rajendran and K. Simons. Lipid rafts and membrane dynamics. J. Cell Sci., 118(6):1099–1103, 2005. [51] D.A. Roess, R.D. Horvat, H. Munnelly, and B.G. Barisas. Luteinizing hormone receptors are self-associated in the plasma membrane. Endocrinology, 141(12):4518–4523, 2000. [52] J. Rudnick and G. Gaspari. The shapes of random walks. Science, 237(4813):384–389, 1987. [53] P.G. Saffman and M. Delbr¨uck. Brownian motion in biological membranes. Proc. Nat. Acad. Sci. USA, 72(8):3111–3113, 1975. [54] Y. Sako and A. Kusumi. Compartmentalized structure of the plasma membrane for receptor movements as revealed by a nanometer-level motion analysis. J. Cell Biol., 125(6):1251–1264, 1994. [55] Y. Sako, A. Nagafuchi, S. Tsukita, M. Takeichi, and A. Kusumi. Cytoplasmic regulation of the movement of E-cadherin on the free cell surface as studied 71 Bibliography by optical tweezers and single particle tracking: corralling and tethering by the membrane skeleton. J. Cell Biol., 140(5):1227–1240, 1998. [56] M.J. Saxton. Lateral diffusion in an archipelago. Single-particle diffusion. Biophys. J., 64(6):1766–1780, 1993. [57] M.J. Saxton. Anomalous diffusion due to obstacles: a Monte Carlo study. Biophys. J., 66(2):394–401, 1994. [58] M.J. Saxton. Single-particle tracking: models of directed transport. Biophys. J., 67(5):2110–2119, 1994. [59] M.J. Saxton. Single-particle tracking: effects of corrals. Biophys. J., 69(2):389–398, 1995. [60] M.J. Saxton. Anomalous diffusion due to binding: a monte carlo study. Biophys. J., 70(3):1250–1262, 1996. [61] M.J. Saxton and K. Jacobson. Single-particle tracking: applications to membrane dynamics. Annu. Rev. Biophys. Bio., 26(1):373–399, 1997. [62] P. Sharma, R. Varma, R.C. Sarasij, Ira, K. Gousset, G. Krishnamoorthy, M. Rao, and Mayor S. Nanoscale organization of multiple GPI-anchored proteins in living cell membranes. Cell, 116(4):577–589, 2004. [63] E.D. Sheets, G.M. Lee, R. Simson, and K. Jacobson. Transient confinement of a glycosylphosphatidylinositol-anchored protein in the plasma membrane. Biochemistry, 36(41):12449–12458, 1997. [64] K. Simons and E. Ikonen. Functional rafts in cell membranes. Nature, 387(6633):569–572, 1997. [65] K. Simons and W.L.C. Vaz. Model systems, lipid rafts, and cell membranes. Annu. Rev. Biophys. Biomol. Struct., 33:269—295, 2004. [66] R. Simson, E.D. Sheets, and K. Jacobson. Detection of temporary lateral confinement of membrane proteins using single-particle tracking analysis. Biophys. J., 69(3):989–993, 1995. 72 Bibliography [67] S.J. Singer and G.L. Nicolson. The fluid mosaic model of the structure of cell membranes. Science, 175(23):720–731, 1972. [68] D.A. Smith, W. Steffen, R.M. Simmons, and J. Sleep. Hidden-Markov methods for the analysis of single-molecule actomyosin displacement data: the variance-hidden-Markov method. Biophys. J., 81(5):2795–2816, 2001. [69] J.E. Smith-Garvin, G.A. Koretzky, and M.S. Jordan. T cell activation. Annu. Rev. Immunol., 27:591–619, 2009. [70] H.W. Sohn, P. Tolar, T. Jin, and S.K. Pierce. Fluorescence resonance energy transfer in living cells reveals dynamic membrane changes in the initiation of B cell signaling. Proc. Nat. Acad. Sci., 103(21):8143–8148, 2006. [71] K.G.N. Suzuki, T.K. Fujiwara, F. Sanematsu, R. Iino, M. Edidin, and A. Kusumi. GPI-anchored receptor clusters transiently recruit Lyn and Gα for temporary cluster immobilization and Lyn activation: single-molecule tracking study 1. J. Cell Biol., 177(4):717–730, 2007. [72] R. Tavano, R.L. Contento, S.J. Baranda, M. Soligo, L. Tuosto, S. Manes, and A. Viola. CD28 interaction with filamin-A controls lipid raft accumulation at the T-cell immunological synapse. Nat. Cell Biol., 8(11):1270–1276, 2006. [73] M. Tomishige, Y. Sako, and A. Kusumi. Regulation mechanism of the lateral diffusion of Band 3 in erythrocyte membranes by the membrane skeleton. J. Cell Biol., 142(4):989–1000, 1998. [74] B. Treanor, D. Depoil, A. Gonzalez-Granja, P. Barral, M. Weber, O. Dushek, A. Bruckbauer, and F.D. Batista. The membrane skeleton controls diffusion dynamics and signaling through the B cell receptor. Immunity, 32(2):187– 199, 2010. [75] A. Tsuji, K. Kawasaki, S. Ohnishi, H. Merkle, and A. Kusumi. Regulation of Band 3 mobilities in erythrocyte ghost membranes by protein association and cytoskeletal meshwork. Biochemistry, 27(19):7447–7452, 1988. 73 [76] Y.M. Umemura, M. Vrljic, S.Y. Nishimura, T.K. Fujiwara, K.G.N. Suzuki, and A. Kusumi. Both MHC class II and its GPI-anchored form undergo hop diffusion as observed by single-molecule tracking. Biophys. J., 95(1):435– 450, 2008. [77] A. Viola and N. Gupta. Tether and trap: regulation of membrane-raft dynamics by actin-binding proteins. Nat. Rev. Immunol., 7(11):889–896, 2007. [78] D.B. Walton. Analysis of single-molecule kinesin assay data by hidden Markov model filtering. PhD thesis, The University of Arizona, 2002. [79] K.M. Wilson, I.E. Morrison, P.R. Smith, N. Fernandez, and R.J. Cherry. Single particle tracking of cell-surface HLA-DR molecules using Rphycoerythrin labeled monoclonal antibodies and fluorescence digital imaging. J. Cell Sci., 109(8):2101–2109, 1996. 74 Appendix A Tables Containing Results from Analysis of Simulated Data 75 Simulation values D1 (µm2 s−1 ) RE CV RE CV RE CV RE CV A p12 p12 p12 p12 = 0.0032 = 0.0100 = 0.0320 = 0.1000 1.4% 2.1% 3.4% 4.5% 1.9% 2.3% 3.2% 5.5% 3.9% 2.3% 1.2% 0.8% 3.7% 2.2% 1.9% 1.7% 26.8% 28.2% 18.4% 15.2% 31.4% 21.3% 17.1% 16.6% 45.6% 18.3% 12.6% 13.8% 31.9% 21.4% 17.6% 17.6% B p12 p12 p12 p12 = 0.0032 = 0.0100 = 0.0320 = 0.1000 0.9% 1.8% 2.1% 2.9% 1.6% 1.7% 1.8% 2.4% 8.1% 6.5% 2.6% 2.4% 11.6% 5.4% 3.3% 2.3% 31.0% 15.6% 8.8% 7.2% 29.9% 16.0% 9.9% 7.1% 39.6% 11.7% 9.3% 6.5% C D2 D2 D2 D2 = 1 × 10−4 = 5 × 10−4 = 1 × 10−3 = 1 × 10−2 40% 4.3% 4.0% 4.1% 35% 4.7% 4.5% 4.0% 17% 2.3% 1.8% 2.0% 12% 1.8% 1.8% 1.8% 778% 18.7% 9.7% 17.8% 38% 19.0% 19.4% 17.3% 1443% 25.8% 12.9% 18.1% D2 (µm2 s−1 ) Maximum likelihood estimates p12 p21 Correct state id. v1 (µms−1 ) v2 (µms−1 ) Mode Mode (%) 0 0 0 0 0 0 0 0 99.8 99.5 99.4 98.2 ±0.1 ±0.1 ± 0.2 ±0.1 28.4% 15.4% 9.7% 7.1% 0 0 0 0 0 0 0 0 99.7 99.3 98.6 97.3 ±0.08 ±0.1 ±0.3 ±0.3 46.9% 20.5% 20.2% 17.6% 0 0 0 0 0 0 0 0 75 95.7 97.7 99.3 ±9 ±0.5 ±0.2 ±0.1 [ Accuracy of parameter estimates and track segmentation for simulated tracks with TRM]For each parameter combination listed, 10 independent tracks, 4000 steps each, with a sampling interval of 1 ms were simulated and maximum likelihood parameters for a 2-state HMM were calculated with the MCMC optimization scheme (Program B.5). RE = Relative error of the maximum likelihood estimate with respect to the parameter value used for the simulation. CV = Coefficient of variation, defined as the ratio of the standard deviation to the mean for the parameter distribution obtained from the MCMC trajectory. For the RE and CV, the mean value from each individual analysis of the 10 runs is reported. Most likely particle states for each track were determined using the track segmentation algorithm and compared with the actual state sequence of the corresponding Markov chain. The percent accuracy of this prediction is reported as mean ± standard deviation. In all sections, D1 = 10−4 µm2 /s and v1 = v2 = 0µm/s. In section A, p21 = 0.01, D2 = 10−2 µm2 /s and p12 varies. In section B, p21 = 0.1, D2 = 10−2 µm2 /s and p12 76 varies. In section C, p12 = 0.05, p21 = 0.01 and D2 varies. Appendix A. Tables Containing Results from Analysis of Simulated Data Table A.1: Accuracy of parameter estimates and track segmentation for simulated tracks with TRM Simulation values D1 (µm2 s−1 ) D2 (µm2 s−1 ) Maximum likelihood estimates p12 p21 v1 (µms−1 ) RE CV RE CV RE CV RE CV RE CV v2 (µms−1 ) Correct state id. Mode (%) A p12 p12 p12 p12 = 0.0032 = 0.0100 = 0.0320 = 0.1000 1.2% 2.1% 3.2% 4.5% 2.1% 2.3% 3.4% 5.4% 2.4% 1.0% 1.6% 2.2% 1.6% 2.1% 1.8% 1.7% 28.4% 22.2% 15.4% 16.9% 31.4% 21.4% 18.3% 16.1% 35.2% 21.1% 13.7% 21.7% 31.6% 21.4% 19.1% 17.2% .67% 0.83% 1.5% 2.9% 0.82% 1.0% 1.5% 2.4% 0 0 0 0 99.2 99.6 99.3 98.9 ±1.9 ±0.1 ±0.2 ±0.3 B p12 p12 p12 p12 = 0.0032 = 0.0100 = 0.0320 = 0.1000 1.2% 1.1% 2.4% 4.6% 2.6% 1.7% 1.9% 2.4% 17.1% 7.4% 5.1% 2.0% 10.5% 5.6% 3.3% 2.3% 24.3% 14.8% 11.3% 4.9% 29.3% 16.3% 10.1% 6.9% 38.2% 23.0% 9.3% 7.2% 27.2% 15.7% 9.6% 7.2% 0.68% 0.83% 0.70% 0.84% 0.73% 0.77% 0.81% 1.0% 0 0 0 0 99.6 98.8 98.1 96.2 ±0.2 ±0.8 ±0.4 ±0.4 C D2 D2 D2 D2 = 1 × 10− 4 = 5 × 10− 4 = 1 × 10− 3 = 1 × 10− 2 4.0% 4.6% 2.7% 3.3% 4.2% 5.2% 4.4% 4.1% 2.6% 2.8% 2.7% 1.5% 1.8% 1.7% 1.7% 1.8% 19.1% 26.7% 17.6% 11.0% 18.8% 23.9% 18.7% 17.2% 19.5% 42.5% 16.9% 17.5% 19.2% 25.0% 20.1% 18.0% 2.7% 2.7% 1.7% 1.7% 1.9% 2.2% 1.9% 1.7% 0 0 0 0 97.7 92.5 97.3 99.2 ±9.1 ±0.5 ±0.2 ±0.1 [ Accuracy of parameter estimates and track segmentation for simulated tracks with TDM]For each parameter combination listed, 10 independent tracks, 4000 steps each, with a sampling interval of 1 ms were simulated and maximum likelihood parameters for a 2-state HMM were calculated with the MCMC optimization scheme (Program B.5). RE = Relative error of the maximum likelihood estimate with respect to the parameter value used for the simulation. CV = Coefficient of variation, defined as the ratio of the standard deviation to the mean for the parameter distribution obtained from the MCMC trajectory. For the mode, RE and CV, the mean value from each individual analysis of the 10 independent runs is reported. Most likely particle states for each track were determined using the track segmentation algorithm and compared with the actual state sequence of the corresponding Markov chain. The percent accuracy of this prediction is reported as mean ± standard deviation. In all sections, D1 = 10−4 µm2 /s, v1 = 1µm/s and v2 = 0. In section A, p21 = 0.01, D2 = 10−2 µm2 /s and p12 varies. In section B, p21 = 0.1, D2 = 10−2 µm2 /s 77 and p12 varies. In section C, p12 = 0.05, p21 = 0.01 and D2 varies. Appendix A. Tables Containing Results from Analysis of Simulated Data Table A.2: Accuracy of parameter estimates and track segmentation for simulated tracks with TDM Simulation Maximum likelihood estimates D1 values Mode (×10−2 ) (µm2 /s) D2 RE CV Mode (×10−2 ) (µm2 /s) RE CV Correct v1 (µm/s) v2 (µm/s) Mode Mode (%) state id. D1 = 1 × 10−4 0.0098 2.07% 2.3% 0.77 23.1% 2.3% 0 0 99.5 ±0.1 10−3 0.0922 7.83% 2.4% 0.77 23.3% 2.2% 0 0 98.3 ±0.4 D1 = 5 × 10−3 0.4114 17.7% 3.2% 0.77 22.9% 3.1% 0 0 87.1 ±2.2 D1 = 1 × [ Accuracy of parameter estimates and track segmentation for simulated tracks within thin channels]For each parameter combination listed, 10 independent tracks, 4000 steps each, with a sampling interval of 1 ms were simulated and maximum likelihood parameters for a 2-state HMM were calculated with the MCMC optimization scheme (Program B.5). RE = Relative error of the maximum likelihood estimate with respect to the parameter value used for the simulation. CV = Coefficient of variation, defined as the ratio of the standard deviation to the mean for the parameter distribution obtained from the MCMC trajectory. For the mode, RE and CV, the mean value from each individual analysis of the 10 independent runs is reported. Most likely particle states for each track were determined using the track segmentation algorithm and compared with the actual state sequence of the corresponding Markov chain. The percent error of the state sequence is reported in the last column as the mean ± standard deviation. In all sections, D2 = 10−2 µm2 /s, p12 = 0.05, p21 = 0.01 and v1 = v2 = 0µm/s. Appendix A. Tables Containing Results from Analysis of Simulated Data Table A.3: Accuracy of parameter estimates and track segmentation for simulated tracks within thin channels 78 Simulation values Maximum likelihood estimates D1 (µm2 s−1 ) D2 (µm2 s−1 ) Error in domain Error in state area estimate id. estimate RE CV RE CV (%) (%) D2 /D1 = 10 2.5% 1.2% 6.3% 2.1% 25.4 ±10.7 0.2 ±0.1 D2 /D1 = 5 10.5% 2.3% 3.6% 1.2% 16.2 ±8.9 0.7 ±0.3 D2 /D1 = 2 14.5% 3.5% 2.6% 1.2% 12.0 ±10.0 2.5 ±1.2 [ HMM analysis of simulated tracks within raft domains]For each parameter combination listed, 20 independent tracks, 10,000 steps each, with a sampling interval of 1 ms were simulated and maximum likelihood parameters for a 2-state HMM were calculated with the MCMC optimization scheme (Program B.5). RE = Relative error of the estimate with respect to the parameter value used for the simulation. CV = Coefficient of variation, defined as the ratio of the standard deviation to the mean for the parameter distribution obtained from the MCMC trajectory. For the RE and CV, the mean value from each individual analysis of the 20 independent runs is reported. The errors for domain size and state sequence is reported in the last two columns as the mean ± standard deviation. For all trajectories, D1 = 10−2 µm2 s−1 , Pjump = 0.05 and the area of each raft domain was 0.04 µm2 . Appendix A. Tables Containing Results from Analysis of Simulated Data Table A.4: HMM analysis of simulated tracks within raft domains 79 Simulation values A B C Maximum likelihood estimates D1 (µm2 s−1 ) D2 (µm2 s−1 ) RE CV RE CV Error in domain area estimate (%) Area = 18.3nm2 Area = 49.8nm2 Area = 135.3nm2 Area = 367.9nm2 Area = 1.00µm2 Area = 2.71µm2 Area = 7.38µm2 0.9% 0.9% 0.8% 1.1% 1.1% 0.8% 1.1% 1.0% 1.0% 0.9% 0.9% 0.9% 0.9% 1.0% 0.5% 0.4% 0.4% 0.4% 0.5% 0.4% 0.5% 0.6% 0.5% 0.5% 0.5% 0.5% 0.5% 0.5% 23.4 39.8 26.6 50.6 45.1 58.6 75.1 ±24.3 ±16.4 ±21.5 ±27.9 ±28.0 ±18.3 ±20.0 0.5 0.5 0.5 0.4 0.5 0.4 0.4 ±0.04 ±0.04 ±0.05 ±0.05 ±0.04 ±0.05 ±0.09 13.0 10.3 12.6 13.6 12.8 13.1 16.4 ±2.3 ±2.3 ±3.5 ±3.4 ±4.4 ±4.6 ±17.5 = 0.01 = 0.05 = 0.10 = 0.50 = 1.00 0.9% 1.1% 1.0% 1.1% 1.3% 0.9% 0.9% 1.0% 1.0% 1.1% 0.3% 0.4% 0.4% 0.4% 0.6% 0.5% 0.5% 0.5% 0.5% 0.5% 25.6 43.4 39.2 41.6 58.4 ±22.5 ±14.8 ±19.6 ±17.1 ±10.1 0.4 0.5 0.5 0.5 0.7 ±0.06 ±0.04 ±0.05 ±0.05 ±0.05 3.5 6.0 5.7 8.1 9.3 ±3.1 ±2.2 ±2.6 ±3.3 ±2.8 [LR ]/[LP ] = 2.72 [LR ]/[LP ] = 4.05 [LR ]/[LP ] = 6.05 [LR ]/[LP ] = 9.03 [LR ]/[LP ] = 13.46 [LR ]/[LP ] = 20.09 1.0% 1.3% 1.2% 1.2% 0.9% 1.1% 1.3% 1.2% 1.0% 1.0% 1.0% 0.9% 0.4% 0.4% 0.6% 0.5% 0.5% 0.5% 0.5% 0.5% 0.5% 0.5% 0.5% 0.5% 47.2 42.1 47.2 37.6 37.8 43.7 ±24.6 ±21.8 ±36.6 ±17.7 ±16.7 ±17.8 0.5 0.5 0.5 0.5 0.5 0.4 ±0.04 ±0.05 ±0.03 ±0.04 ±0.06 ±0.05 15.3 12.8 9.4 8.6 7.2 6.7 ±6.6 ±5.6 ±3.8 ±2.7 ±3.6 ±2.8 Pjump Pjump Pjump Pjump Pjump Error in state id. estimate Layer 1 (mobility) Layer 2 (spatial) (%) (%) [ HMM analysis of simulated tracks within lipid-enriched domains]20 independent tracks, 50,000 steps each, with a sampling interval of 1 ms were simulated with D1 = 2−1 µm2 s−1 and D2 = 1−2 µm2 s−1 . Unless indicated differently, [Lr ]/[Lp ] = 20, Pjump = 0.05 and domains had area of 1.0µm2 . RE = Relative error of the 80 maximum likelihood estimate and CV = Coefficient of variation, defined as the ratio of the standard deviation to the mean for the parameter distribution obtained from the MCMC trajectory. For the RE and CV, the mean value the 20 independent runs is reported. The percent error of the state sequences and domain area estimates are reported in the last three columns as the mean ± standard deviation. Appendix A. Tables Containing Results from Analysis of Simulated Data Table A.5: HMM analysis of simulated tracks within lipid-enriched domains Appendix B Algorithms Used in the Generation of Simulated Data and Statistical Analysis Program B.1 Algorithm for generating a two-state Markov chain for given diffusion coefficients D1 and D2 , velocities v1 and v2 , and transition rates P12 and P21 . 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: Calculate stationary probabilities π1 = P21 /(P12 + P21 ), π2 = 1 − π1 Generate a uniformly distributed random number u ∈ U (0, 1) if u ≤ π1 then s1 = 1 else s1 = 2 end if for i = 2 to i = N do Generate a uniformly distributed random number u ∈ U (0, 1) if si−1 = 1 AND u ≤ P12 then si = 2 else if si−1 = 2 AND u ≤ P21 then si = 1 else si = si−1 endif end for Generate a uniformly distributed random number (velocity angle) φ ∈ U (0, 2π) Generate a normally distributed random number (step along x-axis) x1 ∈ N (vs1 cos φ, 2Ds1 τ ) Generate a normally distributed random number (step along y-axis) y1 ∈ N (vs1 sin φ, 2Ds1 τ ) for i = 2 to i = N do if si = si−1 Generate new velocity angle φ ∈ U (0, 2π) end if Generate step along x-axis xi ∈ N (vsi cos φ, 2Dsi τ ) Generate step along y-axis yi ∈ N (vsi cos φ, 2Dsi τ ) end for 81 Appendix B. Algorithms Used in the Generation of Simulated Data and Statistical Analysis Program B.2 Forward algorithm for calculating log likelihood of parameter values θ of a two-state HMM for a given trajectory observation O=o1 , ...oN . In layer 1, oj = (xj , yj ) and in layer 2, oj = (sj−1 , sj ). 1: 2: 3: 4: 5: 6: 7: logsum(a, b) = log ea + eb Calculate i (oj ) for i = 1, 2 and j = 1, ..., N using Eq. 3.9 (layer 1) or 4.5 (layer 2) α1 (i) = log(πi ) + i (o1 ); i = 1, 2 for j = 2 to j = N do αj (i) = logsum[αj−1 (1) + log(P1i ), αj−1 (2) + log(P2i )] + i (oj ); i = 1, 2 end for L(θ|O) = logsum[α1 (N ), α2 (N )] ˆ = Program B.3 Backward algorithm for identifying the most likely states Q sˆ1 , ...ˆ sN of the particle for a given trajectory observation O=o1 , ...oN and estimated parameters θˆ including the transition probabilities Pˆ12 and Pˆ21 . This algorithm can be used in either layer of the HMM. 1: 2: 3: 5: 6: 7: Calculate the forward variable αj (i) for j = 1, ..N and i = 1, 2 with θ = θˆ using Algorithm 2 βN (i) = 0; i = 1, 2 for j = N − 1 to j = 1 do βj (i) = logsum[ 1 (oj+1 ) + βj+1 (1) + log(Pi1 ), 2 (oj+1 ) + βj+1 (2) + log(Pi2 )]; i = 1, 2 with θ = θˆ end for sˆj = argmax[αj (i) · βj (i)] for j = 1, 2, ...N i=1,2 Program B.4 Algorithm for classifying states in the HMM as diffusive or directed given an observed trajectory O, estimated parameters θˆ and an estimated state seˆ quence Q. 1: 2: 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: √ “ ” |a−µ| l Y (a, µ, σ, l) = P x ≥ |x ∈ N (µ, σ 2 ) σ “ ” Z(a, l) = P x ≥ a|x ∈ χ22l ˆ for O and θˆ using Algorithm 3 Calculate the most likely state sequence Q n1 , n2 ←− Number of times the particle enters each state (1) (n ) mi = {mi , ..., mi i } ←− Number of frames the particle is in each state for (j) (j) Calculate x i and y i for j = 1, ..ni and i = 1, 2 while omitting the first and last frames for i = 1 to i = 2 do for j = 1 to j = ni do “ ” √ √ (j) (j) (2j) (j) (2j−1) (j) = Z( y i , 0, 2Di τ , mi − 2) Calculate pi = Y x i , 0, 2Di τ , mi − 2 , pi end for“ ” Pj=2n ζi = Z −2 j=1 i ln(pj ), 4ni if ζi < 0.01 Calculate vˆi using Eq. 3.11 Calculate φˆij for j = 1, ..., ni using Eq. 3.12 else vˆi = 0 end if end for 82 Appendix B. Algorithms Used in the Generation of Simulated Data and Statistical Analysis Program B.5 MCMC likelihood-based parameter estimation algorithm for estimating the posterior distribution P (θ|O) for the model parameters θ and observation from a single particle trajectory O. This method can be used in either layer of the HMM. 1: 2: 3: 4: 5: 6: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 30: 31: n ←− Number of MCMC steps m ←− Number of model parameters (θ = {θ1 , ...θm }) σ = {σ1 , ..., σm } ←− scale for displacement along each parameter axis nadapt ←− Number of MCMC steps to complete before adapting σ A = {A1 , ..., Am } = {0, ..., 0} ←− acceptance rate; set to zero initially Optional : nmodel ←− Number of MCMC steps to complete before testing for directed motion (0) (0) (0) (0) Choose a random initial position in parameters space θ(0) = {θ1 , θ2 , θ3 , θ4 } (0) Calculate the log likelihood of guessed parameter set, L(θ |O) using algorithm 2 for i = 1 to i = n/m do for k = 1 to k = m do l = 4(i − 1) + k − 1 Propose a displacement δθk along the kth parameter axis drawn from a normal distribution with mean 0 and standard deviation σk : θ(proposed) = θ(l) + δθk Calculate L(θ(proposed) |O) if L(θ(proposed) |O) ≥ L(θ(l) |O) then θ(l+1) = θ(proposed) ; Ak = Ak + 1 else Generate a uniformly distributed random number u ∈ U (0, 1) if log(u) ≤ L(θ(proposed) |O) − L(θ(l) |O) then θ(l+1) = θ(proposed) ; Ak = Ak + 1 else θ(l+1) = θ(l) end if end if if mod(i, nadapt ) = 0 AND i ≤ n/(2m) then Calculate the acceptance rate for the last nadapt MCMC steps: Ak = A/nadapt if Ak ≥ 0.49 OR Ak ≤ 0.39 then Calculate the new scale for displacement: σk = σk Ak /0.44 end if Ak = 0 end if Optional: if mod(i, nmodel )=0 then Test for directed motion using Algorithm 4 and θ(l+1) end if end for end for 83
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Deciphering multi-state mobility within single particle...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Deciphering multi-state mobility within single particle trajectories of proteins on the plasma membrane Morrison, Jennifer S. 2010
pdf
Page Metadata
Item Metadata
Title | Deciphering multi-state mobility within single particle trajectories of proteins on the plasma membrane |
Creator |
Morrison, Jennifer S. |
Publisher | University of British Columbia |
Date Issued | 2010 |
Description | Single particle tracking is a powerful technique often used in the study of dynamic mechanisms on the cell surface such as binding, confinement and trafficking. Experimental trajectories can be used to detect changes in the lateral mobility of individual molecules over time and space. Therefore, a potential problem in the analysis of single particle trajectories is to account for transitions between modes of mobility. Here we present two coupled statistical methods which characterize particle mobility that is temporally and spatially heterogeneous. The first method detects periods of drift diffusion or reduced mobility within single trajectories due to transient associations with other biomolecules. The second locates spatial domains which have higher or lower concentrations of these associating molecules. The trajectory is modeled as the outcome of a two-state Hidden Markov model parameterized by the diffusion coefficients and drift velocities of each state and the rates of transitions between them (which may change in space). Transitions between states arise from association and disassociation with a binding partner, either membrane-associated or cytosolic. These associations lead to either reduced Brownian diffusion or drift diffusion. An adapted Markov chain Monte Carlo algorithm was used to optimize parameters and simultaneously select the most favorable model of lateral mobility (transient reduced mobility or transient drift diffusion) and to locate spatial domains. Analysis of simulated particle tracks with a wide range of parameters successfully distinguished between the two models, gave accurate estimates for parameters and accurately located spatial domains. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2010-08-24 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0071178 |
URI | http://hdl.handle.net/2429/27723 |
Degree |
Master of Science - MSc |
Program |
Mathematics |
Affiliation |
Science, Faculty of Mathematics, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2010-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |
Download
- Media
- 24-ubc_2010_fall_morrison_jennifer.pdf [ 6.02MB ]
- Metadata
- JSON: 24-1.0071178.json
- JSON-LD: 24-1.0071178-ld.json
- RDF/XML (Pretty): 24-1.0071178-rdf.xml
- RDF/JSON: 24-1.0071178-rdf.json
- Turtle: 24-1.0071178-turtle.txt
- N-Triples: 24-1.0071178-rdf-ntriples.txt
- Original Record: 24-1.0071178-source.json
- Full Text
- 24-1.0071178-fulltext.txt
- Citation
- 24-1.0071178.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.24.1-0071178/manifest