HOW WELL DOES A LINEAR FUNCTIONAL MODEL PREDICT THE RESPONSES OF PRIMARY VISUAL CORTEX NEURONS TO A NATURAL SCENE STIMULUS? by Nadia Pietravalle B.Sc. Physiology & Physics, McGill University, 2004 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in The Faculty of Graduate Studies (Neuroscience) THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver) December 2010 © Nadia Pietravalle, 2010 ii Abstract The goal was to test how well a linear model of the responses of neurons in area 18 of cat visual cortex, derived from recordings made in anaesthetized adult cats, predicts responses to natural scene stimuli. Methods: Estimates of the spatio-temporal receptive field profile of the neurons were obtained by reverse correlation to an m-sequence stimulus (Reid et al., 1997). The receptive field estimate, together with a non-linear response function, was then used to give the expected probability, or rate, of spike firing (Chichilnisky, 2001; Ringach & Malone, 2007) during a time- varying natural scene stimulus. The ability of the model to describe the responses was assessed by computing the correlation coefficient between the rates predicted by the model and those observed during stimulation with a natural scene (Willmore & Smyth, David & Gallant, 2005). For each LN functional model identified for all real A18 neurons using m-sequence responses, a Poisson spike generator was added (Heeger, 2000) to simulate ‘LNP’ responses to m-sequence and natural scene stimuli, and was used to assess the statistical significance of the results. Results: The LN model, with parameters derived from responses to m-sequence stimuli, was able to predict responses to m-sequence stimuli with fairly high reliability (correlation coefficients in the range 0.84 – 0.96). However the model was only able to weakly predict responses to natural scene stimuli. This result was confirmed by comparing the correlation coefficients between predicted and observed firing rates obtained for actual and for simulated responses to the natural scene stimulus; values ranged from 0.14 to 0.59, in marked contrast to the simulated ones ranging from 0.47 to 0.88. Reasons for the inability of the LNP model to predict responses to natural scene stimuli are discussed. iii Table of Contents Abstract ........................................................................................................................................ ii Table of Contents ..................................................................................................................... iii List of Tables ...............................................................................................................................v List of Figures ........................................................................................................................... vi List of Equations .................................................................................................................... viii Acknowledgements.................................................................................................. ix Dedication.................................................................................................................................... x 1 Introduction............................................................................................................................ 1 1.1 Overview of Primary Visual Cortex............................................................................ 2 1.2 Describing the Function of Primary Visual Cortex Neurons.................................... 3 1.3 Linear Properties of Primary Visual Cortex Receptive Fields ................................. 6 1.4 Standard Model of Primary Visual Cortex Sensory Processing ............................ 10 2 Methods ................................................................................................................................ 16 2.1 Silicon Polytrode Electrophysiology in Cat Primary Visual Cortex ...................... 16 2.1.1 Animal Surgery and Preparation........................................................................... 16 2.1.2 Visual Neural Response Data Acquisition............................................................ 16 2.1.3 Stimulus Presentation (m-sequence and natural scene movies) ........................... 18 2.1.4 Spike Sorting and Identifying Individual Neuron................................................. 19 2.2 LN Functional Model Identification.......................................................................... 20 2.2.1 Estimating the Spatio-Temporal Receptive Field, RF .......................................... 22 2.2.2 Identifying the Static Output Nonlinearity, N(L) .................................................. 24 2.3 Measuring Accuracy of LN Prediction of Natural Scene........................................ 31 2.4 Comparing LN Model Prediction Accuracy for Different Stimuli ........................ 33 2.5 Simulated LN Model Prediction Accuracy............................................................... 34 iv 2.5.1 ‘LNP’ Simulated Spike Trains.............................................................................. 35 2.5.2 Estimation of RF and Static Output Nonlinearity, N(L) ....................................... 36 2.6 Measuring the Strength of the RF Signal ................................................................. 38 2.7 Gabor Fit to RF Profile .............................................................................................. 38 2.8 Software ....................................................................................................................... 40 3 Results................................................................................................................................... 41 3.1 Estimating the RF Profiles and N(L) Parameters for Each Cell ............................ 41 3.1.1 Estimating the Linear RF Profile (the First-Order Wiener Kernel)...................... 41 3.1.2 Plotting the Linear Response Prediction vs. the Observed Response................... 43 3.1.3 Optimizing N(L) parameters ................................................................................. 43 3.2 RF Signal Strength...................................................................................................... 45 3.3 Effect of the Procedure for Identifying the Functional Model ............................... 47 3.4 Model Accuracy .......................................................................................................... 52 4 Discussion and Conclusion ................................................................................................. 56 4.1 The Standard Model is Partially Accurate............................................................... 56 4.2 Potential Sources of Experimental Error ................................................................. 58 4.2.1 Movement of the Eyes during Recording ............................................................. 58 4.2.2 Computational Error ............................................................................................. 60 4.3 Deviations of Simple Cell Responses ......................................................................... 60 4.3.1 Response Suppression........................................................................................... 62 4.3.2 Properties of Natural Stimuli ................................................................................ 64 4.3.3 Extra-Classical Effects.......................................................................................... 65 4.4 Conclusion ................................................................................................................... 66 Bibliography ................................................................................................................................. 67 Appendix A ................................................................................................................................... 71 Appendix B ................................................................................................................................... 78 v List of Tables Table 4. 1 Model Prediction Accuracy Values ............................................................................ 57 Table 4. 2 Effect of Stimulus Displacements on Prediction Accuracy......................................... 59 Table A. 1 Receptive Field Signal Strength ................................................................................. 73 Table A. 2 Optimal Parameter Values and Goodness-of-Fit of Static Nonlinearity ................... 75 Table B. 1 Parameter Values of Best-Fit Spatial Gabor Functions ............................................ 81 vi List of Figures Figure 1. 1 The Mammalian Primary Visual Cortex..................................................................... 2 Figure 1. 2 General Set-Up For Recording Primary Visual Cortex Neural Activity .................... 3 Figure 1. 3 Classical Technique For Mapping Primary Visual Cortex Stimulus Sensitivity ........ 4 Figure 1. 4 Proposed Model of Primary Visual Cortex RF........................................................... 5 Figure 1. 5 Spatial Frequency Tuning of Cat Primary Visual Cortex Neurons ............................ 7 Figure 1. 6 Agreement Between Spatial Domain and Frequency Domain Stimulus Sensitivity.... 8 Figure 1. 7 2D Linear RF Profile in Spatial and in Frequency Domains ................................... 11 Figure 1. 8 Standard ‘LN’ Model of Primary Visual Cortex Cell Response............................... 13 Figure 1. 9 Three Dimensional Spike Triggered Average........................................................... 14 Figure 2. 1 Schematic of Electrophysiology Experiment............................................................. 17 Figure 2. 2 A Few Frames of the Natural Scene Movie .............................................................. 19 Figure 2. 3 An Example of a Polytrode Recording...................................................................... 20 Figure 2. 4 Flow Chart for Measuring LN Model Prediction Accuracy ..................................... 21 Figure 2. 5 Computing the linear response prediction, L............................................................ 26 Figure 2. 6 Raw Plot of Linear Prediction vs. Observed Response ............................................ 28 Figure 2. 7 Free Parameters of the Static Nonlinearity N(L)...................................................... 29 Figure 2. 8 Measuring LN Model Prediction Accuracy for Natural Scene................................. 32 Figure 2. 9 Measuring LN Model Prediction Accuracy for M-sequence .................................... 34 Figure 2. 10 Measuring Virtual LN Model Prediction Accuracy for M-sequence...................... 37 Figure 2.11 Measuring the RF Signal ......................................................................................... 38 Figure 2.12 Strong RF Signal Profile with ‘X-Y’ Gabor Fit ....................................................... 39 Figure 3. 1 ‘X-Y’ Plots of RF Profiles Along the Polytrode........................................................ 42 Figure 3. 2 Plots of Smoothed Linear Prediction vs. Observed Response .................................. 44 Figure 3. 3 Plots of the Best-Fitting Static Nonlinearity N(L) .................................................... 45 Figure 3. 4 Measure of Real & Simulated RF Signal Strength ................................................... 46 vii Figure 3. 5 Real and Virtual RF Signal Maps ............................................................................. 48 Figure 3. 6 Effect of LNP Simulation on Static Nonlinearity ...................................................... 50 Figure 3. 7 Parameter Values of the Best-Fitting Static Nonlinearity N(L)................................ 51 Figure 3. 8 Plots of LN-Model Predicted vs. Observed Rate for Natural Scene......................... 53 Figure 3. 9 Correlation Coefficients between Predicted and Recorded Responses .................... 54 Figure 3. 10 Plots of Simulated LN-Model Predicted vs. Observed Rate for Natural Scene...... 55 Figure 4.1 Plot of Cell Rank vs. LNP Model Prediction Accuracy Values ................................. 56 Figure 4. 2 Firing Rate across Repeats of Natural Scene ........................................................... 61 Figure 4. 3 Simulated Firing Rate across Repeats of Natural Scene .......................................... 62 Figure 4.4 Location of Linear Spatial RF Profiles on a Frame of the Natural Scene ................ 65 Figure A. 1 Example of Measuring Strong Receptive Field Signal............................................. 71 Figure A. 2 Example of Measuring Weak Receptive Field Signal............................................... 72 Figure A. 3 Correlation Coefficient between Real & Simulated RFs.......................................... 74 Figure A. 4 Plots of Smoothed Model-Predicted vs. Observed Response to M-Sequence .......... 76 Figure A. 5 Plots of Smoothed Virtual Model-Predicted vs. Observed Response to M-Sequence ....................................................................................................................................................... 77 viii List of Equations Equation 2. 1................................................................................................................................. 23 Equation 2. 2................................................................................................................................. 25 Equation 2. 3................................................................................................................................. 29 Equation 2. 4................................................................................................................................. 30 Equation 2. 5................................................................................................................................. 33 Equation 2. 6................................................................................................................................. 35 Equation 2. 7................................................................................................................................. 35 Equation B. 1 ................................................................................................................................ 78 Equation B. 2 ................................................................................................................................ 78 Equation B. 3 ................................................................................................................................ 78 Equation B. 4 ................................................................................................................................ 78 Equation B. 5 ................................................................................................................................ 79 Equation B. 6 ................................................................................................................................ 79 Equation B. 7 ................................................................................................................................ 79 Equation B. 8 ................................................................................................................................ 79 Equation B. 9 ................................................................................................................................ 79 Equation B. 10 .............................................................................................................................. 79 Equation B. 11 .............................................................................................................................. 80 Equation B. 12 .............................................................................................................................. 80 ix Acknowledgements To the faculty, staff and my fellow students at UBC, I offer my gratitude for having inspired me to continue my work in this field. I owe particular and special thanks to Dr. Nicholas Swindale, whose continued support and understanding has helped me throughout this endeavour. Also, a great thank you to Martin Spacek, without whose dedicated work on both running neurophysiology experiments and spike sorting, this project would not have been possible. Thanks also to Dr. Timothy Blanche, who is responsible for the neurophysiology experiment and spike sorting used in this project. x Dedication To My Parents 1 1 Introduction There are cells in mammalian cerebral cortex, in the primary sensory areas, whose responses are consistently controllable by external stimuli, because they receive strong input almost directly from peripheral sensory receptors. In the visual system, these cells are found in primary visual cortex. The term ‘primary visual cortex’ encompasses the area known as V1 in primates as well as areas 17 and 18 in cats and other non-primate mammals, which receive direct input from the lateral geniculate nucleus (LGN). The early neurophysiology in this cortical area began to reveal how sensory processing and perception might occur. The progress in brain research since the first physiological studies of primary visual cortex neurons has provided important insight into their role in giving rise to conscious experience of the world, and how organisms actually move through and interact with the world. In the current study we re-examine the responses of primary visual cortex cells, in light of all the subsequent knowledge obtained from studying both real and artificial visual systems, with numerical efficiency now possible with modern technology. Here we assess the strength of the classic 'linear-nonlinear' model in predicting responses of neurons in area 18 of the cat, which were recorded extracellularly with a silicon polytrode in anaesthetized cat primary visual cortex. Well-established techniques were used to identify a functional model for each neuron and to test the accuracy with which it can predict the response to a set of repeated presentations of a natural scene stimulus. By comparing this result to that obtained by repeating the same analysis for responses that were simulated according to the theoretical model, but with Poisson spike time variability, it was possible to obtain a statistically rigorous measure of the success of the classic theory in capturing actual primary visual cortex neural behaviour. Only when we can successfully predict responses to arbitrary visual stimuli will we be able to claim that we fully understand the behaviours of primary visual cortex neurons. 2 A) B) 1.1 Overview of Primary Visual Cortex The afferent innervation from the retinas of mammals projects topographically to the primary visual cortex via the lateral geniculate nuclei (Payne & Peters, 2001; Wurtz & Kandel, 2000). In primates primary visual cortex is the only retinotopically organized region of cortex receiving all of the visual afferent innervation; the other retinotopic areas are innervated by it, as illustrated in Figure 1.1, part A. Analogously, cats have areas 17 and 18 as illustrated in Figure 1.1, part B. Describing the organization and neural function of primary visual cortex is a central component in the theory of sensory perception, and cats are ideal subjects for this work. Figure 1. 1 The Mammalian Primary Visual Cortex These drawings illustrate how visual information used for visual perception is transmitted to the visual cortex in a similar manner in all mammals. A), The pathway that projects from the retina through the lateral geniculate nucleus (LGN) to V1 in monkeys. B), Projection from retina, to dorsal LGN (dLGN) to areas 17 & 18 in cats; the blackened region of cortex indicates areas 17 & 18; scale bar = 10mm. Part A is from Wurtz & Kandel, 2000, Figure 28-2. part B is from Payne & Peters, 2001, modified from Figure 1-10. In all mammals, there is a topographic representation of the whole visual field in primary visual cortex. In higher mammals, like cats and many primates, there is also an orderly organization of ocular dominance and orientation columns, which was first observed by Hubel and Wiesel using single unit neurophysiology; a technique that is still used to study responses of visual cortex neurons, one at a time, as illustrated in Figure 1.2 (Hubel & Wiesel, 1959, 1962, 1977; Wurtz & Kandel, 2000). 3 Figure 1. 2 General Set-Up For Recording Primary Visual Cortex Neural Activity The general primary visual cortex single unit electrode set-up is illustrated here for a human brain. A tilted slit of light, when moved across the screen, evokes electrical impulses from a primary visual cortex neuron, which is recorded extracellularly with a tungsten microelectrode, and stored for subsequent analysis. From Wurtz & Kandel, 2000, upper portion of Figure 27-11. 1.2 Describing the Function of Primary Visual Cortex Neurons In studies of all types of sensory neurons, the aim has been to give a complete description of stimulus sensitivity such that it can be used to accurately predict responses to different test stimuli, which is proof of having successfully described the neural function. The stimulus sensitivity of a given sensory neuron is referred to as the receptive field (RF). The receptive field describes the pattern of presynaptic input, and thus, how the sensory neuron encodes the relevant type of physical energy currently in the environment and fires action potentials in response to it. In the first studies of receptive fields of visual cortex neurons, Hubel and Wiesel employed a technique to study primary visual cortex responses in cats (Hubel & Wiesel, 1959, 1962) that had already been used to reveal centre-surround RF structures in the LGN and in the retina (Kuffler, 1953; Hubel, 1960). This technique involved recording responses of individual cells in cats while circular spots of light, of varying location and size, were flashed on a screen of constant background illumination. Using spots that were small relative to the overall size of the receptive field, many locations were independently tested across the RF area to produce a map describing the spatial stimulus sensitivity profile; the resulting map for one cell is shown in Figure 1.3. 4 A) B) C) Figure 1. 3 Classical Technique For Mapping Primary Visual Cortex Stimulus Sensitivity RF-map for cat area 17 cell shown on the right, obtained from extracellularly recorded responses (centre column) to circular spot stimuli (left column). A), 1º spot in the centre region; B), same size spot as in A) but displaced 3º to the right; C), 8º spot covering entire RF area. Scale = 6º. Duration of stimulus = 1s. From Hubel & Wiesel, 1959, fig. 2. Responses to spot ONsets were marked by crosses, and responses to OFFsets were marked with triangles, at the locations corresponding to where the spots were presented (an example ONset response is shown in part A; an example OFFset is shown in part B). Proceeding in this way, Hubel and Wiesel constructed spatial maps of the cells' stimulus sensitivity, resulting in RF maps like the one shown on the right in Figure 1.3, obtained for one cat primary visual cortex cell. They found that roughly half of the cells they studied had distinct non-overlapping, parallel, elongated ON and OFF response regions, just as in the above example (Hubel & Wiesel, 1959; 1962). Hubel and Wiesel found that many of the cells with this kind of RF configuration also had the following response characteristics: there was summation within separate ON and OFF parts, there was antagonism between the ON and OFF regions, and it was possible to predict responses to stationary or to moving spots of various shapes and sizes from these RF maps. The cell shown in Figure 1.3 demonstrates all of these characteristics, and this type of behaviour is seen by its response to a large spot shown in part C) of Figure 1.3. Light falling within the region within the dashed line is excitatory, and is otherwise inhibitory. Thus it is thought that the cell's failure to respond to the onset of the large spot is due to the balance of opposite individual effects of the spot overlapping with each ON and OFF region. 5 A) B) By using the resulting RF maps to predict the orientation tuning of the responses of these cells to bar stimuli (like the one illustrated in Figure 1.2), Hubel and Wiesel proved they had indeed described the receptive fields of these primary visual cortex cells (Hubel & Wiesel, 1959). Based on previous recordings of centre-surround type receptive fields in the LGN (Hubel, 1960), Hubel & Wiesel hypothesized that aligned convergence of LGN afferents onto primary visual cortex cells resulted in the elongated parallel ON and OFF patterns of primary visual cortex RFs that were observed. More recently, Reid & Alonso mapped RFs of simultaneously recorded LGN afferents and primary visual cortex cells in cats (1995), to show that with a few exceptions, the RF locations of LGN cells making synaptic contact with simple cells were aligned with the ON and OFF sub-regions of the RF. More recent evidence has largely continued to provide support for Hubel & Wiesel’s original proposed wiring diagram (Reid et al., 2000), as shown in Figure 1.4 part A). Figure 1. 4 Proposed Model of Primary Visual Cortex RF A), The thalamocortical wiring proposed by Hubel and Wiesel to explain observed primary visual cortex RF profiles. A large number of geniculate cells, of which four are illustrated in the upper right in the Figure, have receptive fields with ON centres arranged along a straight line on the retina. All of these geniculate afferents project upon a single cortical cell with excitatory synapses. The receptive field of the cortical cell will then have an elongated ON centre indicated by the interrupted lines in the receptive-field diagram to the left of the Figure. From Hubel & Wiesel, 1962, Figure 19. B), A simplified wiring schematic illustrates the concept of how excitatory inputs from spatially aligned, circularly symmetric LGN RFs (shown on the left) summate (plus signs) to give rise to the primary visual cortex RF that is shown on the right. The ON-centre LGN RFs (white centres) are aligned along the top, and OFF-centre LGN RFs (black centres) are aligned along the bottom, to give the horizontally aligned, parallel excitatory and inhibitory regions of this example of a primary visual cortex RF. 6 Hubel and Wiesel,, and many following them, found that numerous cells behaved like the one shown in the present example, with distinct, non-overlapping ON and OFF regions, and these were termed simple cells. There were also a lot of cells that had overlapping ON and OFF regions, such that no receptive field maps could be defined for them. Out of these so-called complex cells, many seemed to have behaviour otherwise similar to the first category, in that they were orientation selective, and could be explained as operating as though summing a set of simple-cell inputs, as Hubel and Wiesel proposed (Hubel & Wiesel, 1962). The remainder of cells had more complex behaviour, and were termed hyper-complex, displaying end stopping or length selectivity (Hubel & Wiesel 1959; 1962). Taking all primary visual cortex responses including even the most exceptional cases, some form of orientation selectivity, localized to a limited region of retinal space, is the rule. Visual system researchers continue to consider mammalian primary visual cortex as transforming visual information, from the mosaic-like representations of the retina and in the LGN, into an image segmentation that compactly represents the lines and edges of the world. At the level of the individual primary visual cortex cells, simple cells are considered as those whose function is well- described by the RF map, which is largely determined by the LGN afferent input, as illustrated in Figure 1.4 B (Wurtz & Kandel, 2000, Olshausen & Field, 2006). For these cells, a simple primary visual cortex functional model may be defined, with the RF profile as the central component of the model. 1.3 Linear Properties of Primary Visual Cortex Receptive Fields The criteria Hubel and Wiesel used for identifying the class of simple cells are more parsimoniously stated as: those cells with distinct excitatory and inhibitory RF sub-regions that operate as linear filters (Mancini et al., 1990, Reid et al., 2000). Applying linear systems theory, the RF profiles that Hubel and Wiesel obtained with the spot-mapping technique are really estimates of the cells' impulse response functions. And then, according to the definition of a linear filter, the response of such a cell is supposed to be given by the weighted sum of the current visual contrast pattern weighted by the spatial RF profile, (i.e., convolution), which also fulfills Hubel & Wiesel’s requirements for simple cell classification. Thus it soon appeared to visual system researchers that the function of this class of cells might be truly equated to that of linear filters, as Hubel and Wiesel's classification criteria seemed to assume. 7 Before truly quantitative analysis of primary visual cortex response properties, other techniques, involving the spatial frequency domain, began to provide important evidence constraining the nature of the function of these cells, still supporting the general view of them operating as linear filters. Psychophysical measurement of the contrast-sensitivity function of the human visual system suggests that the visual system is composed of independent, linearly operating channels, each narrowly tuned for spatial frequency (Campbell & Robson, 1968). Cooper and Robson (1968) first studied responses of primary visual cortex cells in cats to sine wave gratings with varying orientation and spatial frequency. The spatial frequency response functions of three typical cortical cells shown below reveal them to have narrow spatial-frequency selectivity, with a varying centre, or peak, preferred spatial frequency. Figure 1. 5 Spatial Frequency Tuning of Cat Primary Visual Cortex Neurons Normalized spatial frequency response functions are shown for three cortical cells that were all measured with gratings drifting at 4Hz. The patterns were moving in the optimum direction for each cell. From Cooper & Robson, 1968, Figure 5. Similar frequency-response functions of cat primary visual cortex cells were also obtained in a subsequent study (Campbell et al., 1969). Here it was shown that the high frequency end of the spatial frequency tuning of primary visual cortex cells was well fitted by exponential functions of the form 22 fAe α− , with A, the response gain, α, the space constant, and f, the spatial frequency. This results is evidence of primary visual cortex cells operating as independent channels, each narrowly tuned for the widths and orientations of line segments, also not distinguishing between functional classes of feature-detectors. These results suggest that the operation of all primary visual cortex units is to perform a linear 2D spatial-frequency analysis of visual information; a different interpretation from what was shown in experiments like those done by Hubel & Wiesel. 8 Movshon, Thompson and Tolhurst (1978a) used bar and sine wave-grating stimuli to conduct the first study looking at primary visual cortex responses in cats in both spatial and in frequency domains independently. First they identified simple cells as Hubel and Wiesel (1962) did, by mapping ON and OFF responses to spots of light to find those satisfying the classification criteria. Then they used stationary thin bar stimuli of optimal orientation, flashed on at different locations along the axis perpendicular to the optimal orientation, in order to measure the line-weighting functions of the simple cells. Finally, they compared these line-weighting functions with those predicted by taking the inverse Fourier transforms of the spatial frequency response functions of those same simple cells, obtained in the way as described by Campbell and Robson (1969). An example of one such result for a cat simple cell is shown below. The line-weighting function appears as is expected: as a slice taken across a typical two-dimensional spatial RF profile, like the one shown in Figure 1.3 from Hubel and Wiesel (1959), in the direction perpendicular to the preferred orientation. Figure 1. 6 Agreement Between Spatial Domain and Frequency Domain Stimulus Sensitivity Agreement between line-weighting function of a simple cell (bars) and that predicted by inverse Fourier transform (curve) of spatial frequency tuning (inset). Positive values in the line-weighting histogram represent incremental responses to the introduction of a bright bar; negative values represent incremental responses to the introduction of a dark bar. The width and position of the histogram bars indicate the width and position of the bars used in testing, measured in degrees. The continuous curves represent the inverse Fourier transform of the cell's spatial frequency tuning curve; even symmetry was selected and the resulting curve was translated sideways arbitrarily to provide the best fit to the data. The abscissa of the inset is spatial frequency in cycles per degree; the ordinate is contrast sensitivity, the inverse of the threshold contrast value for each spatial frequency. From Movshon et al., 1978a, Figure 9B. By applying linear systems theory, the resulting agreement of primary visual cortex RF profiles in spatial and in spatial frequency domains provided the first quantitative support for primary visual 9 cortex cells summating linearly across space, as would the basis functions of a linear frequency analysis system. However the approach was not strictly rigorous, because it involved an arbitrary assignment of the unconstrained spatial phase to get the inverse Fourier transform of the spatial frequency response to match the spatial line-weighting function. Also, the theory could only be tested with simple cells, which are the subset of primary visual cortex cells having well-defined stimulus sensitivity profiles in the spatial domain when using simple spot (and bar) stimuli (Movshon et al., 1978a) DeValois and colleagues (1979) used checkerboard stimuli made from perpendicular, superimposed square gratings to provide more decisive evidence of primary visual cortex cells operating as linear filters, without having to classify them first. The orientation of the fundamental sinusoid of the checkerboard was different from the orientation of the edges of the checkerboard, such that the measured orientation tuning to the checkerboards would distinguish whether primary visual cortex cells were doing either frequency analysis or detection of edge features. The orientation tuning of cat primary visual cortex cells to the checkerboards that was measured matched that predicted from the orientation tuning to pure sinusoids and assuming linear operation. However, the results did not rule out the possibility of primary visual cortex cells operating as quasi-linear edge detectors that are selective for the diagonal line in the checkerboard. With the advent of laboratory microcomputers, automated stimulus presentation and data storage made it possible to exhaustively measure stimulus sensitivity profiles of primary visual cortex cells, thanks to random sampling in both the spatio-temporal and space-time-frequency domains. The same approach of Movshon and colleagues was repeated with better numerical rigour, providing more support for the still-accepted theory that simple cells, which are characteristic of a significant portion of mammalian primary visual cortex, summate approximately linearly over space and time (DeAngelis et al., 1993b). In the present cat primary visual cortex neurophysiology experiments, we expect to record from cells that operate approximately linearly, and to find that their response to any type of moving light pattern is in close agreement with the response of their receptive field. The receptive field profiles that will be measured, using modern numerical software, provide optimal resolution for determining the sensitivity of simple cells to patterns of light having all possible shapes and movements. 10 1.4 Standard Model of Primary Visual Cortex Sensory Processing The idea of primary visual cortex performing a two-dimensional spatial frequency analysis on information destined for areas of higher sensory processing is attractive because it fits in the general approach for sensory perception used by organisms who move through natural environments. And also, the spatial frequency response functions of all primary visual cortex cells reveal them to be band-pass frequency channels, operating approximately independently in parallel to each other, as is required in this scenario for the role of primary visual cortex. An important refinement to this theory emerged when Kulikowski and Bishop (1981) repeated the comparison of line-weighting histograms with inverse Fourier transforms of spatial frequency response functions for cat simple cells, just like the one shown above in Figure 1.6 from Movshon and colleagues, (1978a). They were able to compare the simple cell data with that expected from linear filters that satisfy the formal definition of Gabor functions, and found excellent agreement between the data and this type of filter function (Daugman, 1985, 1988; Jones & Palmer, 1987). And with the refinements granted by laboratory microcomputers, the agreement of primary visual cortex stimulus sensitivity profiles with Gabor filter functions was confirmed with numerical rigor, and in the realm of joint space and time of all possible moving shapes (DeAngelis et al., 1993). An example of a primary visual cortex RF profile described in two spatial dimensions is shown below, both in spatial and in frequency domains; this is often referred to as the ‘X-Y’ plot of the RF profile. The structure described in Figure 1.7 is that of the RF profile of a cell recorded in the primary visual cortex of a cat, however the structure appears remarkably like that of a two- dimensional spatial Gabor filter. 11 Figure 1. 7 2D Linear RF Profile in Spatial and in Frequency Domains RF profile in the frequency domain may be described in terms of ‘fx0’ & ‘fy0’, which represent the peak of the 2D frequency response in Cartesian coordinates, or equivalently in terms of ‘fopt’ & ‘φ’, in polar coordinates. The dashed axis shown in the frequency domain represents the conventional rotated frame of reference, here labelled ‘u’ & ‘v’, with ‘u’ being the direction perpendicular to the cell’s preferred orientation, ‘φ’. From Gardner et al., 1999, Figure 1. The spatial impulse response of a Gabor function is a Gaussian envelope modulating a sinusoid, commonly referred to as a wavelet. The Fourier transform of the spatial impulse response of a Gabor function is a Gaussian in the spatial frequency domain. The peak of the Gabor function in the spatial frequency domain corresponds to the spatial frequency of the modulated sinusoid, or wavelet, as seen in the space domain. In the conventional axis labelling that is shown, the periodic component of the 2D Gabor wavelet lies along the direction perpendicular to preferred orientation, and is labelled as ‘u’ in the rotated reference frame of Figure 1.7 part b). The one-dimensional spatial line-weighting function and the one-dimensional frequency response, as shown in Figure 1.6, from Movshon and colleagues, (1978a), together represent a slice through the RF profile along this ‘u’-axis. This same convention is used for plotting the ‘X-T’ plot of the simple cell RF profile, for studying its temporal component over the course of the cell’s memory (DeAngelis et al., 1993a, 1999; Gardner et al., 1999). This formal definition of a Gabor function emerged from signal processing theory, as the optimal solution to the problem of transforming a signal by decomposing it using basis functions that individually minimize uncertainty jointly in time and in frequency, with no overall loss of information (Daugman, 1985, 1988). The Gabor theory integrates all of the preceding empirical studies of primary visual cortex neural function much better than the theory of frequency analysis. The two-dimensional spatial RF maps of simple cells measured by Hubel and Wiesel (1959, 1962, 1977), like the one shown in Figure 1.3, look like spatial impulse responses of Gabor filters. The spatial frequency response functions of all primary visual cortex cells since they were first measured by Cooper & Robson (1968) also 12 match those of Gabor filters, being well fit by Gaussian functions (Campbell et al., 1969; Movshon et al., 1978a; Kulikowski & Bishop 1981). Thus the early studies of primary visual cortex stimulus sensitivity profiles are consistent with the Gabor theory, whether the profiles were measured in either the space domain, or the spatial frequency domain, or in both. In contrast, only studies of the frequency response functions are consistent with the frequency analysis theory; this theory cannot explain the spatio-temporal stimulus sensitivity profiles that were observed independently by many (Daugman, 1985, 1988; Jones & Palmer, 1987). The role implied by the Gabor theory for the primary visual cortex within mammalian vision is of performing image information decomposition, as in general for natural sensory perception. The Gabor theory may be seen as a refinement of a more basic spatial frequency analysis theory. Whereas the individual basis functions of a frequency analysis system localize information exactly in frequency space, they do so at the expense of localizing visual information in ‘actual’ space. In the Gabor wavelet decomposition paradigm, the units are instead defined by their ability to optimally minimize uncertainty jointly in ‘actual’ space and in frequency space. Here, the term ‘actual’ space is meant to signify either time or space, because the meaning is the same in any dimension being considered. Gabor filters thus describe the function of primary visual cortex cells as being selective, not only for the orientation of bars or edges, but also partially selective for both their width and location. Whereas the Gabor function was originally defined in the temporal dimension, following from signal processing theory, for primary visual cortex cells it was natural to look at them instead in terms of localizing visual signal jointly in space and in spatial frequency (Movshon et al., 1978a; Kulikowski & Bishop, 1981), at first. Then with modern techniques of stimulus sensitivity measurement it turns out that primary visual cortex cells also do the same, jointly, for the dimension of time as well (DeAngelis et al., 1993b, 1999). The results of these studies that have been mentioned provide the underlying motivation for the currently generally accepted model of primary visual cortex function, in which individual units jointly encode the location and the size of local light pattern variations, operating approximately independently, in parallel. This role for primary visual cortex fits well in the greater context of sensory processing done by the entire mammalian visual system, by pre-processing visual information from the mosaic-like representation of the retinas and LGN, into what information 13 theory confirms to be an optimal form for identifying the shapes and the movements of objects. The standard primary visual cortex functional model is illustrated below, in Figure 1.8. Figure 1. 8 Standard ‘LN’ Model of Primary Visual Cortex Cell Response The neuron computes a weighted sum ('linear response', L(t) ) of the image, S over space and time, (x,y,t) and this result is passed through a 'pointwise nonlinearity', N(L) to give the expected time-varying spike-firing rate, R(t) in response to the time-varying stimulus, S(x,y,t). This completes the LN model prediction of a cell’s true response to a given stimulus. This LN-predicted response is also used to create a simulated spike train, {ti}, by simply assuming that spike generation is a Poisson process. The accuracy of the LN prediction of neural responses is measured for actual primary visual cortex responses, as well as for those simulated with the LNP model for comparison. Modified from Olshausen & Field, 2006, Figure 10-1. The standard model illustrated in Figure 1.8 begins by stating that spatio-temporal receptive field maps describe the function of individual primary visual cortex units. This would be a complete description of function of the units only if primary visual cortex operated as an analog transmitting device. The action potential-firing nature of primary visual cortex neurons also requires defining a spike-generation mechanism that transforms the output of the receptive field in to a sequence of action potentials, transmitted to post-synaptic neurons, and being recorded. The standard approach to identifying a cell’s functional model is to measure a cell’s spatio-temporal receptive field, RF(x,y,t), and then to find the optimal values of the free parameters of an appropriately-chosen pointwise static nonlinearity, N(L). The first two stages of the model, each indicated by a red box outline in the above illustration, together give the ‘LN model prediction’ in response to any spatio- temporal light pattern, S(x,y,t), (Chichilnisky, 2001; Dayan & Abbott, 2000; Olshausen & Field, 2006). Finally, a Poisson process describes the statistics of the resulting spike train, and is used to determine sequences of ‘LNP Simulation’ action potentials, for a given time varying expected firing rate, or the ‘LN Model-Predicted Response’, N(L(t)). 14 A) B) The state-of-the-art way to measure primary visual cortex receptive fields is to record their responses to dense noise patterns, which are stimulus frame sequences that are generated using either Gaussian-distributed or m-sequence noise (Sakai, 1992; Reid et al., 1997). Figure 1. 9 Three Dimensional Spike Triggered Average A): Illustrating the reverse correlation procedure to m-sequence noise (black and white pattern squares), for computing the spike-triggered average (bottom row, in colour). The spike-triggered average at one time slice, 60ms preceding each spike, is illustrated as the sum of all m-sequence frames preceding each spike in the sequence (t1, t2, ...) by 60ms. The colour bar indicates the scaling of the light sensitivity, with the blue extreme representing the minimum STA value, and red, the maximum. B): An area 17 RF represented as a 3D prism instead, by stacking all the 2D X-Y frames along the 3rd dimension, of time preceding a spike. Stimulus sensitivity is shown in greyscale, with white representing responses to contrast increments and black representing responses to contrast decrements. On the base face, the plot of the projection of the RF is shown in X-T coordinates, showing an iso-amplitude contour map of response sensitivity, with solid contours representing responses to contrast increments, and dashed contours denoting responses to contrast decrements. Panel B is from DeAngelis et al., 1993a, Figure 3. The spatio-temporal receptive field profile, obtained by taking the spike-triggered average stimulus (Dayan & Abbott, 2001; Chichilnisky, 2001) as illustrated in Figure 1.9, above, describes the appearance and the evolution through time of the optimal light pattern leading up to a spike, (Reid et al., 1997; DeAngelis et al., 1993a, 1999). The resulting spatio-temporal receptive field, can be plotted as either a sequence of ‘X-Y’ frames, as at the bottom of part A), or as an ‘X-Y-T’ prism, as in part B), with intensity indicating the level of stimulus sensitivity in both cases. 15 It should be noted that an m-sequence stimulus is similar to Gaussian-distributed white noise in that it is a 'dense' approach to sampling the relevant dimensions of visual information (Reid et al., 1997). This approach is contrary to classical ‘sparse’ noise techniques, in which bars or gratings of varying size, location or phase angle, drift speed, and delay, were presented in random order. Dense noise paradigms treat all points in space as instances in time equally and uniformly, by using some algorithm for generating a pseudo-random sequence of frames, each made of equal numbers of randomly-placed small black-and-white dots. Thus the dense noise stimulus approach allows for exhaustive sampling of spatial and the temporal dimensions, essentially without any prior assumptions on the structure of either the RF profiles of the visual neurons of interest, or of the possible patterns that the visual system could encounter. An m-sequence stimulus is different from Gaussian-distributed white noise because it is constructed to completely and uniformly cover all of the spatio-temporal dimensions for a finite stimulus sequence, whereas true white noise is infinite by definition. The same procedures of system identification derived using Gaussian white noise as input may be used with m-sequence stimuli (Reid et al., 1997), making it a mathematically powerful approach for capturing primary visual cortex neural function. In light of the studies that were mentioned, the generally accepted framework of the function of primary visual cortex neurons can now be tested in the cases where the cells are stimulated, as they would be when encountering natural environments. The results will provide a measure of how accurately the present model describes the true function of area 17 or 18 cells recorded in cats, and thus, how closely a simple model of behaviour captures primary visual cortex function. 16 2 Methods 2.1 Silicon Polytrode Electrophysiology in Cat Primary Visual Cortex Data were obtained from experiments done by T. Blanche, M. Spacek, and N. Swindale. Multiple cat area 18 cells were simultaneously recorded in anaesthetized, paralysed cats using a silicon multielectrode array (polytrode). Full details of the surgical procedures used to do the experiments are given in Blanche et al., 2005. The following is a brief description. 2.1.1 Animal Surgery and Preparation Adult cats were initially anaesthetized with an intravenous bolus of sodium thiopental, and intubation or a tracheotomy was performed. The animal was then placed in a stereotaxic frame and connected to temperature, blood pressure, electrocardiograph (ECG), electroencephalograph (EEG), pO2, and end-tidal CO2 monitors. Surgical anaesthesia was maintained by artificial ventilation with a mixture of 70% N2O and 0.25–1.5% isoflurane in oxygen. Paralysis was induced with injection of pancuronium bromide and maintained by continuous infusion throughout the experiment. Pupils were dilated with topical atropine (5%), and nictitating membranes were retracted with phenylephrine eye drops (10%). Reverse ophthalmoscopy was used to choose rigid gas-permeable contact lenses of appropriate radius of curvature and power to focus both eyes on a monitor positioned 50 cm in front of the cat. A small craniotomy was made over the primary visual area of one hemisphere, the dura was removed and a 54-channel Michigan polytrode (54 map1b type) was inserted into the brain. Core body temperature was maintained near 38°C with a thermostatically controlled heating pad, and end-tidal SpCO2 and pO2 were stabilized at 40 mmHg and 99–100%, respectively, by varying the respiration rate. 2.1.2 Visual Neural Response Data Acquisition The schematic below illustrates how recordings were done, in which visual stimulus and extracellular voltage signals were simultaneously stored for subsequent analysis. 17 Figure 2. 1 Schematic of Electrophysiology Experiment The continuous, 54-site voltage signal detected with the polytrode was amplified and converted into digital form using two 32-channel A/D converters for storage in the acquisition computer. A separate computer was used to display the visual stimuli on a Sony 200sf monitor, and a copy of the stimulus information was also stored in the acquisition computer. The extracellular electrical activity was referenced to a platinum wire loop positioned around the craniotomy, and was buffered by two headstage preamplifiers (HS-27s, Neuralynx) prior to being amplified and band-pass filtered with a 64-channel amplifier, (Multichannel Systems FA-I-64, ALA Scientific Instruments, Westbury, NY). The signals were digitized with 12-bit resolution by two 32-channel A/D acquisition cards (DT3010s, Data Translation, Marlboro, MA) at 25 kHz/channel. The resulting signal had noise (including thermal and biological noise due to distant neural activity) of 3-4 µVrms (20-30 µVpp) and the spike amplitude was typically in the range 100- 200 µVpp, so the signal to noise ratio was typically 4:1, which was more than adequate for spike sorting. During recording, voltage waveform and stimulus display data were continuously streamed to the acquisition computer hard disk using in-house software written by Timothy Blanche. All data presented here were recorded from cortical neurons in area 18, in experiments done by Timothy Blanche and Martin Spacek. Recordings were typically made for 3–8 h in each penetration. 18 2.1.3 Stimulus Presentation (m-sequence and natural scene movies) Two different experiments from these polytrode recording sessions were used for the current project. For both experiments, the stimuli were presented at a rate of 50Hz, and thus with 20ms temporal resolution, on the display monitor (Sony 200sf) with a 200-Hz refresh rate and software- linearized gamma correction (mean luminance: 55 cd/m2). The software packages visionegg (written in Python by Andrew Straw: Straw, 2008) and dimstim (Spacek et al., 2009) were used to run the stimuli. An m-sequence stimulus recording was done for estimating neural receptive fields, designed to completely describe primary visual cortex neural function, under the constraints imposed by the experimental set up. Here we used a 32-by-32 pixel binary m-sequence, which means each pixel was either black or white, and hence only one or the other of the extreme values of the display monitor's luminance range. The algorithm for constructing an m-sequence with these specifications generates a sequence of 65535 32-by-32 pixel patterns (Sutter, 1987), which takes approximately twenty-two minutes to display at a 50Hz rate. This m-sequence was presented in a 12.7 by 12.7-degree square, thus capturing RF profiles with a spatial resolution of 0.4 degrees of arc, both horizontally and vertically, and 20ms resolution temporally. The stimulus that was used to test the primary visual cortex linear-nonlinear filter prediction consisted of a 250-frame-long sequence of natural scene patterns, taking 5 seconds to display, and repeated 75 times. This stimulus was a segment of a movie originally recorded using a camera that was mounted atop the head of a cat as it explored a natural environment, and it was obtained courtesy of Peter König. The movie frame images were displayed using fifteen contrast (or luminance) levels, in 64 by 64 pixels, also covering a 12.7 by 12.7-degree square. 19 Figure 2. 2 A Few Frames of the Natural Scene Movie These particular frames are not in true sequence, but occur every tenth frame, to allow better visualization of movement in the natural scene footage than what can be seen in true sequence, in which the movement is slow compared to the 50Hz display rate. This segment took 1.4 seconds to display. Additional recordings, not reported here, were made from the cells, including recordings of responses to moving black and white bars and to drifting and flashed sine-wave gratings. 2.1.4 Spike Sorting and Identifying Individual Neuron Spike events were extracted out of the raw extracellular voltage signal by a multistage procedure, involving spike detection whenever bi-phasic threshold crossing occurred at any of the recording sites, then clustering of the voltage waveforms resulting from this initial round of spike extraction. The initial clustering was used to describe the candidate spike templates (which often occupied multiple sites, typically 2 to 5) that were used to identify the units that were recorded. The spike templates were then used to locate wherever matches existed in the original raw voltage signal, in order to extract all of the remaining spike times, to the nearest twenty microseconds, and to associate them to the neuron which fired it. One-millisecond epochs of the voltage signals, which were sampled with forty-microsecond resolution, were displayed when spike events were determined to have occurred, using in-house software written by Timothy Blanche. Using these displays, as shown below, the efficacy of spike detection was improved by adjusting detection thresholds for capturing as many spike events as possible while minimizing false positives. Spike amplitudes were large enough to be able to detect them reliably. The accuracy of neural isolation was also verified using the same waveform displays, and was improved by either merging clusters together or breaking them apart wherever appropriate, on the basis of similarity between resulting spike templates. The final result of processing the raw data is a spike train, expressed as a sequence of spike times, with microsecond resolution, for each of the units that was isolated. Full details of the procedure are given in Blanche, 2005 (PhD Thesis). 20 Figure 2. 3 An Example of a Polytrode Recording Signals from the twelve sites at the bottom of the polytrode, shown in the schematic in part A, were used to isolate 10 cat primary visual cortex neurons. 1ms epochs of voltage waveform templates that were used to isolate neurons are shown in B, plotted at the relevant sites, in correspondence with the polytrode sites. Colour-coding in A shows the relevant sites used to construct each template; in B it shows the voltage waveforms of the resulting templates; and in C, the spike clusters, plotted in three of the waveform component dimensions for best visualizing the clusters. See the original text for full details. This Figure demonstrates how taking the signals over all twelve sites allows isolation of 3 more neurons from one of the clusters, (shown at the arrow located in the top of part C) beyond the 7 clusters that were isolated using only the four sites highlighted in yellow on the polytrode schematic in A and delineated by the dashed line in B. From Blanche et al., 2005, Figure 3. The analyses reported here were done on data from a transcolumnar penetration in area 18 that captured the activity of 26 neurons with the older algorithm. 2.2 LN Functional Model Identification The question being asked in this thesis is: can the responses of simple cells to arbitrary visual stimuli be predicted on the basis of a linear model of the cell’s receptive field? To that end, the receptive fields of simple cells were estimated by reverse correlation to an m-sequence stimulus using Sutter's spatiotemporal m-sequence method (Reid et al., 1997; Sutter, 1987). Next, an appropriate form of static output nonlinearity was chosen and its parameters were fitted in order to complete the prediction of the linear-nonlinear primary visual cortex functional model. The hypothesis is that the cells with strong linear impulse responses will have an LN functional model 21 that accurately captures the response to an arbitrary stimulus. This hypothesis can be tested by measuring the accuracy of the LN functional model prediction of the response to a natural scene stimulus, and comparing it to that obtained for simulated ‘LNP’ responses to natural scenes. The entire procedure is summarized in the following flow chart. Figure 2. 4 Flow Chart for Measuring LN Model Prediction Accuracy The flow chart shows how the accuracy of the LN model was measured, by taking the correlation coefficient between the LN-predicted and the recorded responses to a natural scene stimulus, denoted above by the red encircled ‘X’. Encircled ‘X’ symbols on the right denote taking the correlation coefficient of a pair of signals. Each row of the flow (Figure 2.4, cont’d) chart shows how either an m-sequence, ‘MSEQ’ or natural scene, ‘NS’ stimulus was used either for obtaining spike trains from actual cells or for computing a response prediction with an LN functional model. Rows coloured in blue represent the same analysis that was done for responses that are simulated as a Poisson spike process; these serve as a control for measuring the LN model prediction accuracy in the case where the system being identified was completely determined except for random noise. In each instance, computing the LN prediction involves two steps of obtaining the linear prediction of response, ‘L’ and then applying a point-wise nonlinearity to it to obtain the response rate prediction, ‘N’. Spike trains must be converted from sequences of spike times into evenly-binned spike- firing rate, ‘R’, in order to take the correlation coefficient with the response rate prediction, ‘N’. MSEQ Stimulus NS Recorded Spike Train L N R RF MSEQ Recorded Spike Train R NS Stimulus L NRF MSEQ Stimulus Cell NS Stimulus Cell MSEQ Stimulus L NSimulated RF NS Stimulus L NSimulated RF MSEQ Simulated Spike Train NS Simulated Spike Train R R 22 2.2.1 Estimating the Spatio-Temporal Receptive Field, RF The stimulus sensitivity profile of each cell was estimated by taking its first order kernel, or spike- triggered average, for a type of white noise stimulus: the m-sequence. When describing the m- sequence stimulus above, I began to allude to it as a powerful means of RF estimation, for resolving primary visual cortex neuron function. As mentioned, the key to the effectiveness of using an m-sequence is that it can be used in the same way as white noise for applying system identification theory (Reid et al., 1997). Therefore, the proof that shows that the spike-triggered average using white noise is the impulse response, or stimulus sensitivity of a linear system (Sakai, 1992; Chichilnisky, 2001), is also true for the m-sequence method used here (Reid et al., 1997). Estimating the RF with this method gives the first order Wiener kernel, which is the best estimate of function for a linear system, in the least squares sense, and this is true regardless of the existence of significant higher order Wiener kernels necessary to completely capture the function of the filter, or neuron, in this case. This statement is true due to the orthogonality between kernels obtained through system identification (Willmore & Smyth, 2003). Thus the result of the current analysis, which only involves estimating the first order Wiener kernel, even if it does not complete the identification of the function of nonlinear neurons, still produces a mathematically well-defined result. The stimulus sensitivity profile of each cell was defined in two dimensions of space and one of time, for a complete description of how the neuron would respond to any monochromatic moving pattern shown on the screen, displayed monocularly (DeAngelis et al., 1993a). Sutter's m- sequence method lends itself particularly well to this task of characterizing the neural stimulus sensitivity in these three dimensions with fine resolution (Sutter 1987; Reid et al., 1997). This powerful approach, along with the choice of the linear algebra software package numpy, written in python, and the appropriate choice of m-sequence stimulus parameters, have set the stage for efficiently computing the spatio-temporal sensitivity profiles for large groups of neurons in just a few minutes. These three-dimensional spatial sensitivity profiles, or RFs, may be visualized along a sequence of delta functions in time during the cell's memory, for clearly seeing the spatial (X-Y) profile, at each point in time (DeAngelis et al., 1993a), just as is shown in the example in panel A) of Figure 1.9. The m-sequence reverse correlation method for estimating the primary visual cortex RF is illustrated in Figure 1.9 and it involves averaging all stimulus patterns that produced spikes over the course of the recording. This procedure is described formally in equation 2.1, below. 23 ( ) ( )∑= −⋅= spk n Nn t knji spk kji tyxsNyxrf ,1 }{ ,, ,1, ττ Equation 2. 1 Here s(xi,yj,t) is the stimulus data, a sequence of arrays of numbers that correspond to the patterns of black and white pixels shown on the screen during the experiment. rf(xi,yj,τk) is the spatio- temporal receptive field profile. The discrete variables xi and yj denote the spatial coordinates corresponding to column and row indices, respectively, of the 32-by-32 pixel grids of the stimulus (and RF) frames. The discrete variable τk is the temporal displacement through the stimulus (and RF) frames, relative to a spike time. {tn} is the set of all of the spike times, or the spike train, for a given neuron during the m-sequence stimulus. Thus equation 2.1 defines the receptive field, RF as the result of indexing into the appropriate m- sequence frames that triggered spikes, taking the sum of these x-y-t segments of stimulus data and dividing it by the total number of spike-triggering segment, Ntot, to give the neuron's spike- triggering average stimulus (Dayan & Abbott, 2001; Willmore & Smyth, 2003). The algorithm used in the RF estimation was the one that is included as a method in the Experiment class of the 'neuropy' neurophysiology data analysis package, written by Martin Spacek. It should be noted that the RF estimate is equivalent to taking the cross-correlation of the response with the stimulus. The cross-correlation between response and stimulus must be divided by the stimulus autocorrelation, only when it is necessary to correct for non-white stimulus spectra. This step is not included here because for an m-sequence, the autocorrelation is a delta function by construction (Reid et al., 1997; Willmore & Smyth, 2003). 24 2.2.2 Identifying the Static Output Nonlinearity, N(L) Once the estimate of the RF was obtained, it can be used to filter any pattern of light displays to give the linear prediction of a cell’s response, L. The standard procedure for modeling a cell’s response is to consider a static nonlinearity, which describes the cell’s threshold for action potential firing, as transforming the linear measurement of cell activity into a rate of action potential firing, R (Dayan & Abott, 2001; Ringach & Malone, 2007). This nonlinearity is captured by the ensemble of pairs of linear prediction and firing rate values obtained during the entire course of the m-sequence recording, and can be summarized by a static function of one variable (the linear response prediction), typically with two or three free parameters fit to the data for each cell. This procedure of identifying the static output nonlinearity involves the following four steps, outlined here and explained in detail below: 1. Convert the spike train {tn} into a time-varying firing rate signal, R, as described on pp 23- 24. 2. Compute the linear prediction of response, or 'Generator' signal, L, as described in equation 2.2, on page 24. 3. Do non-parametric smoothing of L vs. R data, choosing bins with appropriate sizes along the range of L values to capture the static output nonlinearity; as described on pp 26-27. 4. Fit the free parameters of the expression for the static nonlinearity, N(L), in equation 2.3 on page 28. 1. Convert spike train into time-varying rate, R For the present analysis, we must take a sequence of spike times {tn} that are specified with microsecond precision and represent it instead as the spike count observed during each stimulus frame. The fixed 20ms spacing was selected for the stimulus displays because it is fine enough so that when typical primary visual cortex responses are expressed as a time-varying rate, the value changes slowly with respect to this temporal resolution, thus ensuring there is no information lost when digitizing the neural responses in this way. The time-varying firing rate, R was obtained by simply binning the spike train, with 20ms temporal resolution, to give the spike count that was observed during each time bin along the entire course of the stimulus. The responses to the m- sequence stimuli were therefore 65536-element-long vectors. The responses to the natural scene 25 stimuli were 250-element-long vectors. This conversion of the observed response from a list of spike times into a sequence of spike counts, R was necessary to be able to identify the static nonlinearity in the m-sequence, and then for measuring the LN model's accuracy by taking the correlation coefficient between recorded and predicted rates in response to a natural scene. It is plausible to consider the neural output as a firing rate in the manner described because it represents what we consider as the relevant signal from the perspective of a downstream neuron, as might be seen by summing across several similar neurons. So even though there is research ongoing elsewhere regarding the nature of the neural signal, for the current purpose it was sufficient to take the neural response to simply be the rate of action potential firing. 2. Compute linear prediction, L The linear response prediction, L, is the output stage of the linear stage of the LN-functional model that was illustrated in Figure 1.9, giving the relative level of the cell’s activation in response to any stimulus that is presented. L was calculated by taking the convolution of cell’s RF with the stimulus, S, in which the moving light patterns of the stimulus are weighted by the spatio-temporal receptive field, as described in the following equation (Chichilnisky, 2001). ( ) ( ) ( ) RFStyxsyxrftL PIX PIX MAXN i N j k kjikji ⋅=−⋅= ∑ ∑∑ = = = ',,,, 1 1 1 τ ττ Equation 2. 2 The time points during the stimulus are represented by t, which ranges from 1 to the total number of frames, NFRAMES. L has the same dimension as R (described previously), which was the same as the total number of time points during the stimulus: 65536 elements-long for the m-sequence, and 250-elements-long for the natural scene. rf(xi,yj,τk) represent the spatio-temporal receptive field profile. s(xi,yj,t) represents the time-varying stimulus pattern intensities. The xi’s and yj’s sweep through the spatial pixel locations. NPIX is the number of pixels on each side of the square arrays representing the stimulus frames, and was equal to 32 for the m-sequence, and 64 for the natural scene. τk is the temporal displacement preceding each time point, t, and so τk was taken to range from 0 to τMAX = 8 = 160ms, to sweep through the neuron’s memory in time. Computing the convolution can be imagined as the xi,yj,τk – elementwise weighted sum of the RF and s(xi,yj,t) at each time point, t during the stimulus, illustrated in Figure 2.5. 26 Figure 2. 5 Computing the linear response prediction, L The computation of convolution of the RF (top row) with the m-sequence stimulus (the black and white pattern) as the sum of the element-wise product of the Frith the current stimulus segment is shown for the 8th, 9th, 10th, 11th and 12th time points during the m-sequence stimulus. The resulting convolution values are shown in the graph in the lower left corner of the diagram, for a brief time window during the m-sequence stimulus. The computation of the convolution value corresponding to the 8th time point, is illustrated. The ‘X’ represents the 8-by-32-by-32 element-wise product of the RF with the m-sequence stimulus data, and is plotted in the first row under the stimulus. The ‘Σ’ represents taking the sum over this product, to give the resulting value of the convolution, which is the one encircled in the graph. The repeating of this procedure is shown for the next four time points. The arrows represent the RF ‘sliding’ along the stimulus frames, giving each new element-wise product shown with dashed borders in the subsequent rows, and the corresponding convolution values in the graph. The colour bar indicates the scaling of the STA intensities, with blue representing the minimum, and red, the maximum values. The dot product in the second equality in Equation 2.2 is actually an equivalent way of computing the convolution to the step-wise computation that was illustrated above. L was actually computed as this single array dot product, of the modified stimulus, S’ and the RF, using the linear algebra software package, numpy in Python (numpy.org). In order to replicate the ‘sliding’ of the RF 27 through the stimulus, as in the time-point-wise computation that was illustrated, the stimulus data had to be converted from an NFRAMES -by-NPIX-by- NPIX -sized array into an array of NFRAMES –many rows, each containing a flattened copy of τMAX -by- NPIX-by- NPIX-sized stimulus segments. Each row of S’ thus contains a flattened segment of the stimulus movie. The first τMAX values of the convolution are assumed to be zero, to take account of the points where there is no data available (because the RF would be convolved with time points before the beginning of the experiment). Two technical points had to be considered before computing L in response to the natural scene. First, the natural scene had twice the spatial resolution as the m-sequence stimulus data, and the same temporal resolution. The RF had same resolution as the m-sequence experiment from which it was computed, so it had to be recopied from 32-by-32 spatial pixel representation into a 64-by- 64 spatial pixel array. The result was that each RF element at the lower resolution occupied instead four elements at the higher resolution. The other issue was that the m-sequence and natural scene stimuli were displayed at slightly different locations on the stimulus display monitor during the recording (the natural scene was displayed 12 screen pixels to the left and 5 screen pixels below the m-sequence). Thus RF and S were re-aligned before computing L. 3. The M-Sequence L vs. R Distribution The above two steps results in a set of L(t), R(t) data pairs, giving the linear prediction of the neuron's response, and the corresponding observed firing rate, for each time point during the course of the stimulus, for each neuron. The resulting L(t) vs. R(t) distribution for each cell describes the mapping of the generator signal into the observed action-potential firing rate. The m- sequence procedure for reverse correlation required only a single presentation of the stimulus sequence. Therefore R(t), which represents the time-varying spike count during each 20ms-long frame of the stimulus can only have integer values. The discrete range of R(t) values made it difficult to determine the underlying relationship from the raw L(t) vs. R(t) data plotted as an ordinary scatter plot, due to the data being compressed along a small number of possible R values. The data can instead be visualized by giving the R values a random scatter, to better display the density of the data, which was done in Figure 2.6. 28 Figure 2. 6 Raw Plot of Linear Prediction vs. Observed Response Each unit of the range of observed response values in spikes per time bin was sub-divided into forty divisions, and was chosen at random for plotting each L vs. R data point, in order to better visualize the data distribution. The vast majority of data points correspond to zero spike counts. Dashed vertical lines illustrate how bins of varying width are chosen along the range of L values in order to do the ‘non-parametric smoothing’ of the L vs. R data. One example of a resulting smoothed data point is shown inside the left-most pair of dashed vertical lines, indicating the mean value and the variance of the subset of data points that lie inside of this bin. From this raw plot, a formal description of a static nonlinearity was extracted, which defines the observed mapping from linear response prediction values, into output spike firing. Before an appropriately chosen formal expression of the nonlinear function could be fit to the observed data, the L vs. R data had to first be smoothed, as illustrated above. By choosing appropriate ranges of L values, the observed data could be condensed into bins and expressed as the mean firing rate and the measurement error corresponding to these bins. The choice of bin edges was made so that a reasonable number of data points (between 10 and 10,000), fell within each bin, in order to give an appropriate measure of the mean R value and its associated error at each point in the resulting smoothed distribution. This so-called non-parametric smoothing is illustrated above, on the left of Figure 2.6, and was described previously (Ringach & Malone, 2007). Thus the range of L values was divided up unevenly, with larger bins corresponding to the extremes of L values (either positive or negative), to collect the sparse data points occurring in these areas, whereas the narrowest bins corresponded to L values close to zero, where the majority of data points lie. 29 4. Static Nonlinearity The shape of the static nonlinearity was summarized with a few parameters, and resulted from assuming a perfect rectifier with threshold, θ, and gain, A, along with an external noise source σn, at the input to the rectifier (Miller & Troyer, 2002; Ringach & Malone, 2007), as in equation 2.3 below; also see Figure 2.7. Figure 2. 7 Free Parameters of the Static Nonlinearity N(L) The free parameters of the formal expression for the 'Static Nonlinearity' N(L), in equation 2.3, are fit to the smoothed L vs. R data points. The diagram shows how the N(L) expression results from adding an input noise parameter, σn to the classic expression of a half-wave rectifier, given by N(L) = A*(L - θ), with threshold, θ and gain, A. From Ringach & Malone, 2007, Figure 3d. ( ) ( ) ( ) ( )    ⋅ −−⋅⋅ ⋅+        ⋅ −+⋅−⋅= n n n LALerfLALN σ θ π σ σ θθ 2 exp 22 12 2 Equation 2. 3 The error function, erf, is one of the special functions included in scipy, so it was convenient to use this within the routines written for computing and for fitting N(L). L represents the values of the linear response prediction of the LN model, which was given by the time-varying value of the current stimulus weighted by the spatio-temporal receptive field, as was described previously. The value of σn, determines the smoothness of the transition zone; the higher the value, the smoother the transition. A and θ are the gain and threshold, respectively, of the half-wave rectifier, as illustrated in Figure 2.7. 30 The optimal combination of free parameters of the static output nonlinearity were obtained by adjusting the parameter values to minimizing the chi-square, χ2 value between the LN-predicted rate, N(L) and the true values of the rate R. The LN-predicted values were obtained by taking the L values of the smoothed L vs. R data in response to the m-sequence and substituting them into the expression of equation 2.3, along with appropriate choices of the values of A, θ and σn. The χ2 value was calculated as follows, using the true values that were observed, R and LN-model- predicted values, N(L). ( )[ ]∑ = −= Bins N i i ii LNR 1 2 2 2 ˆˆ σχ Equation 2. 4 From Barlow, 1989, modified from Equation 8.2. Standard error of the mean R values, which were obtained from non-parametric smoothing of the L vs. R m-sequence data, were used for the σi's in the above expression. The index i sweeps from 1 up to N, the total number of smoothed data points, which is different for each cell. The hats in the above expression denote values obtained after smoothing. A function was written in python that takes the smoothed Ri values, with associated errors, σi and any combination of the N(L) parameters, and gives the χ2 value, as calculated in the above expression, as the output. The scipy.optimize.fmin method, which uses a downhill Nelder-Mead simplex algorithm, was used to find the optimal combination of the three parameters of the N(L) expression for each cell that minimized the χ2 value. This fitting problem is equivalent to minimizing the errors between model-predicted and observed responses, and gives the best fitting solution in the least squares sense, to describe the smoothed L vs. R m-sequence data. The χ2 test is the most important test of goodness of fit (Barlow, 1989). The χ2 test of the goodness of fit of the LN prediction was done by calculating the probability of the LN model describing the observed neural response data, P(χ2; DOF). The number of degrees of freedom, or DOF, equals the number of smoothed L vs. R data points, which was different for each neuron, minus the number of model parameters, which was always 3. The χ2 value and the number of degrees of freedom were given as input to the scipy.stats.chi2.cdf method, which conveniently computes the χ2 probability, P(χ2; DOF) without needing to refer to a table. A meaningful control is necessary 31 to establish the minimal χ2 probability level that is acceptable for indicating a good match of the model to the observed data. This control was obtained by generating virtual neural spike responses using a Poisson spike generator attached to the LN functional model, as illustrated in Figure 1.8. 2.3 Measuring Accuracy of LN Prediction of Natural Scene A natural scene experiment was used to test how well the LN functional model actually captures primary visual cortex neural activity, by measuring the correlation coefficients between the LN- predicted, N(LNS(t)) and the observed, RNS(t) rates in response to the natural scene stimulus. First, the convolution of the spatio-temporal receptive field with the natural scene was computed to give the linear prediction of the response to the natural scene, LNS(t). Then the values of the linear response prediction were substituted into the expression of equation 2.3, using the optimal values of A, θ and σn that were obtained with the fitting procedure on m-sequence data that was described previously. The resulting calculation gave the LN prediction of the response to the natural scene, N(LNS(t)). Because the natural scene was presented multiple times, the neural response was taken to be the time-varying spike count that was recorded in response to each presentation, averaged over all of the repeats, which is essentially the peri-stimulus time histogram. Thus the response to the natural scene was 'pre-smoothed', and so the correlation coefficient of the N(LNS(t)) vs. RNS(t) data was computed directly. 32 Figure 2. 8 Measuring LN Model Prediction Accuracy for Natural Scene The data analysis described in the preceding text is represented here, graphically. The top portion illustrates how the correlation coefficient was computed between LN-predicted and observed responses to the natural scene, and the resulting value is denoted with the red encircled ‘X’. The lower portion shows the same procedure, except repeated for simulated, ‘LNP’ responses to the natural scene; virtual counterparts are highlighted in blue. 33 Response predictions were obtained by using the LN cascade derived for each of the neurons, as others have already described previously (Chichilnisky, 2001, Ringach & Malone. 2007). The outcome of this analysis is the ability to measure the accuracy of the LN functional model in predicting neural responses to test stimuli, by computing the correlation coefficient between predicted and observed firing rates (David et al., 2004; Lesica & Stanley, 2004), and comparing the results obtained for real neurons with those obtained in the same way for virtual 'LNP' neurons. ( ) ( ) ( )NSNS LNR NSNSNSNS LNRLNRCorrCoef σσ ⋅ ⋅−⋅= Equation 2. 5 From Barlow, 1989, Equation 5.25a. The correlation coefficient was calculated using the numpy.corrcoef method, which calculates the correlation coefficient as in the above formula, where bars above indicate taking the average, and σ is the standard deviation. In this case the actual, RNS and model-predicted, N(LNS) firing rates that were obtained in response to natural scenes were provided as input to the corrcoef method. 2.4 Comparing LN Model Prediction Accuracy for Different Stimuli The LN model prediction accuracy was measured by taking the correlation coefficient between the LN model-predicted and the recorded response to a natural scene stimulus, as illustrated in the top of Figure 2.8. The correlation coefficient between the LN model-predicted and the recorded response was also done for the m-sequence scene stimulus. The LN model prediction accuracy using the same stimulus that was used for identifying the LN functional model obtained in this way provided an upper bound of the range of correlation coefficient values that could be obtained. Given the accepted theory of primary visual cortex function that is based on the previous studies of the mammalian visual system, the model prediction of the test stimulus is expected to be nearly as good as for the stimulus that was used to estimate the parameters of the model. When plotting model-predicted vs. observed responses to the m-sequence, to see if the data falls along the line of equality, there is the problem of the observed responses taking integer values of the spike count, such that it fails to capture the underlying relationship, just as was described previously for the raw L vs. R data, in Figure 2.6. Thus the same non-parametric smoothing 34 procedure was applied to the raw N(L) vs. R data, before assessing the model accuracy. The schematic below describes how the LN model prediction accuracy was measured for the m- sequence experiment. Figure 2. 9 Measuring LN Model Prediction Accuracy for M-sequence The procedures for computing the ‘Spike-Triggered Average’ and for identifying the optimal parameters, θ, A and σn, of the expression for the static nonlinearity N(L) that were described previously in the text are illustrated here, graphically. This schematic also illustrates how the correlation coefficient was measured between model-predicted, N(L) and observed, R responses to the m-sequence, whose final value is denoted by the black encircled ‘X’. This value provides the upper bound expected from the value that was obtained at the red encircled ‘X’ that was shown at the top of Figure 2.8. 2.5 Simulated LN Model Prediction Accuracy Simulated responses were generated, using the ‘linear-nonlinear-Poisson’, or ‘LNP’ cascade, as illustrated in Figure 1.9, to create virtual cells whose function is completely determined except for random noise, in order to test the analysis done for real primary visual cortex cells. As highlighted in blue in Figures 2.4, 2.8 and 2.10, the simulated responses provided a virtual counterpart for every step of the procedure of measuring the LN model prediction accuracy that was done for the 35 real primary visual cortex cells, which was an invaluable addition that was used for subsequent error analyses. The LN functional model that was identified for each of the real primary visual cortex cells was used for the ‘LN’ portion of the ‘LNP’ cascade, to generate the virtual counterpart. Thus the expected spike firing probability, in response to either the m-sequence or natural scene stimulus, was obtained, and it was used to simulate spike trains as instances of a Poisson spike process. The procedure of generating Poisson spike trains to a natural scene was illustrated graphically in the lower portion of Figure 2.8 and at the top of Figure 2.10 for the m-sequence; the simulated spike generation step was also shown schematically for both m-sequence and natural scene stimuli in Figure 2.4. 2.5.1 ‘LNP’ Simulated Spike Trains Simulated spike trains were generated according to the algorithm given in Heeger, 2000, as instances of an inhomogeneous Poisson process, given the time-varying expected spike firing probability. In this case, the spike count distribution, or the probability of observing exactly n spikes in a particular time interval (t1, t2), is given by: ( ){ } ( ) ! , 21 n ne ttduring spikesnP nn ⋅= Equation 2. 6 Taken from Heeger, 2000, Page 6. Here is the mean, or expected spike count during a time interval, (t1, t2). This definition of the inhomogeneous Poisson process explains the effect of a varying mean, or expected spike count, on the resulting spike count distribution that could be observed within a given interval during a time-varying stimulus. By choosing a fine temporal resolution δt of 2ms for generating the spike train, the definition of the Poisson spike process can be rephrased in terms of the probability of observing one spike inside the very small interval δt: { } ( ) ttRtduring spikes1P δδ ⋅≈ Equation 2. 7 From Heeger, 2000, Equation 2. 36 R(t) is the time varying spike count, such as that observed, or predicted, in response to the m- sequence or the natural scene stimulus, and it describes the mean, or expected spike count at each point during the stimulus, except now it describes the spike count per 2ms time bin over the course of the stimulus. Then the procedure for generating a Poisson spike train follows naturally from this definition: first subdivide time into the short intervals, each of duration δt, then take the time-varying spike count R(t) at this fine temporal resolution, and generate a sequence of random numbers x[i], uniformly distributed between 0 and 1. Here i is simply the index, going from one up to the total number of time bins over the course of the stimulus: 655,360 for the m-sequence and 2,500 for the natural scene. For each very small interval, if x[i] ≤ R[i] δt, generate a spike. Otherwise, no spike is generated. For spike time sequences generated like this, the spikes tend to occur more often when the expected spike firing rate R[i] is higher, and with the characteristic variability of a Poisson process, as required. It did not matter that virtual spike times were only specified with 2ms precision, even though real spike trains were specified with sub-microsecond precision to start off. They would both end up being expressed instead as time-varying spike count rates with temporal precision of 20ms, before doing any subsequent analysis. 2.5.2 Estimation of RF and Static Output Nonlinearity, N(L) Using virtual responses generated in this way for the m-sequence, the same procedures for taking the spike-triggered average and for obtaining the optimal parameters of the static nonlinearity as were done for the real cells, were repeated here, to identify the virtual LN models. The schematic in Figure 2.10, below illustrates how these procedures were done, using virtual responses to the m- sequence The virtual responses to the m-sequence and natural scenes were generated by applying the LN- model prediction procedure to the stimulus, substituting in the spatio-temporal receptive field profile and the optimal parameters of the static nonlinearity in the appropriate places. These substitutions were illustrated graphically in the lowermost portion of Figure 2.8, for the virtual response to a natural scene, and in the uppermost portion of Figure 2.10 for the m-sequence. The 37 resulting simulated responses acted as virtual counterparts for the real cells, for repeating the procedures of taking the spike-triggered average, for finding the optimal parameters of the static nonlinearity, and for measuring the accuracy of the resulting LN model prediction. Any deviation of the virtual RF profile and optimal parameters values away from those used to generate the responses, and of the model prediction being less than perfect, is due to the noise inherent in the procedure of estimating these quantities, for a Poisson process. Thus repeating the entire procedure for simulated responses provides us with an understanding of the magnitude of the actual estimation error. Figure 2. 10 Measuring Virtual LN Model Prediction Accuracy for M-sequence The procedures for computing the ‘Spike-Triggered Average’ and for identifying the optimal parameters, θ, A and σn, of the expression for the static nonlinearity N(L) that were described previously for real cells are illustrated here, graphically, for virtual cells. The virtual observed responses to the m-sequence were generated by applying an LNP cascade, as illustrated in the uppermost portion of the Figure. The correlation coefficient was measured between simulated model-predicted, N(L) and observed, R responses to the m-sequence, whose final value is denoted by the blue encircled ‘X’, acting as the virtual counterpart to the black encircled ‘X’ that was shown in Figure 2.9. 38 2.6 Measuring the Strength of the RF Signal The 32-by-32 pixel frames of the RF profile were separated into noise and signal areas. It was assumed that the RF signal measured in the noise area, which was taken to be located at the outer edges, was purely due to estimation noise, from applying the m-sequence method for taking the spike-triggered average. The variance of the RF averaged across this noise area was thus taken as the RF noise level. The RF signal measured within the 25-by-25-pixel area, at roughly the centre of the frames, was assumed to contain both signal and noise. The amount of signal in the RF was estimated by considering only where the RF variance exceeded twice the RF noise level. An example of plotting the RF variance and RF signal is shown in Figure 2.11, below. Figure 2.11 Measuring the RF Signal The RF variance taken across RF time points is shown in part A). The colour bar on the right shows the scale of RF variance values, ranging from zero up to the maximal. The lines indicate where the separation was made between ‘noise’ and ‘signal’ areas. The average of the variance taken across all the pixels in the outer edge area was taken as the RF noise level. The resulting plot of the RF signal is shown in part B), which was obtained by plotting RF variance values wherever the value exceeded twice the noise level, or else, zero. The sum was taken across the RF signal, which is plotted on the right side in the example shown in Figure 2.11. Only cells with a non-zero total signal value were selected for testing the LN model theory of neural function. 2.7 Gabor Fit to RF Profile Gabor fits of the RF profiles were done as was described by De Angelis et al. (1999). The scipy.optimize.fmin method was used to identify the combination of parameters of the equation B) A) 39 B.12 in the appendix that minimized the error between the resulting Gabor filter function and the RF, using a downhill Nelder-Mead simplex algorithm. A space-time separable spatio-temporal Gabor function was sufficient to describe all of the RFs that were measured that had sufficient RF signal. The space-time separable spatio-temporal Gabor function is simply a product of the functions in each of the two spatial dimensions and in the dimension of time. Thus it was straightforward to simplify the expression to fit either the time- slice expression to ‘X-T’ plots, or the two-dimensional spatial expression to ‘X-Y’ RF plots, which is shown in the example below, in Figure 2.12. Figure 2.12 Strong RF Signal Profile with ‘X-Y’ Gabor Fit An example of an ‘X-Y-T’ plot of the spike-triggered average stimulus is shown above, for the same cell that was used in the example of Figure 2.11, which contains a strong RF signal. The colour bar indicates the scaling of the STA intensities, with blue representing the minimum, and red, the maximum. Row A) is the raw plot of the spatio-temporal RF. Row B) is the result of plotting RF values only at the RF signal locations, or else zero, as was illustrated in Figure 2.11, to separate the RF signal from the RF noise. The first frame of this plot would be completely empty, and so in its place, the ‘X-Y’ plot of the best-fit Gabor function is shown. This spatial Gabor function corresponds to the third frame of this RF profile, where ON (red) and OFF (blue) regions are clearly distinguished. Gabor fitting was only done for those RFs having non-zero total RF signal, like the cell in the above example, which were identified as described in the preceding subsection. Successful fitting of spatial Gabor functions was only possible for RFs having at least one ON and one OFF region in the ‘X-Y’ plot at the peak time delay of the spatio-temporal receptive field profile, which is also illustrated in the example shown above. The Gabor fitting procedure would not converge to a solution if the RF profile had only one ON or OFF response region. Thus successfully fitting spatial Gabor functions to ‘X-Y’ RFs completed the identification of the putative simple cells that could be found in the group, which was confirmed visually for all of the cells, as will be shown in B) A) 40 the next section. Recalling the revised classification of simple cells from Hubel and Wiesel’s original four criteria in the light of linear systems theory (those cells that operate linearly and that have distinct ON and OFF regions), what we have done by identifying those cells having sufficient linear RF signals, and with a good fit by a Gabor, is to identify those cells that fulfill these criteria. With the present implementation of the functional model, the hypothesis may only be meaningfully tested for these simple cells, and not for those that have been rejected from subsequent analysis. 2.8 Software Programs for analyzing neural responses were written and run using Python 2.5 (www.python.org), with the corresponding versions of scipy, numpy, matplotlib, visionegg, dimstim and neuropy packages added for numerical and graphical functions, and for handling stimulus and neuron data from electrophysiology experiments. All data analysis presented here was done on a 1.80GHz 4CPU Intel® Pentium® machine using Microsoft Windows XP. 41 3 Results 3.1 Estimating the RF Profiles and N(L) Parameters for Each Cell 3.1.1 Estimating the Linear RF Profile (the First-Order Wiener Kernel) Linear spatio-temporal receptive fields (RFs) were estimated, by computing the spike-triggered average stimulus (STA), using the 32-by-32 pixel m-sequence, for all 26 recorded cells, which included simple cells as well as all the others. Nineteen out of the 26 cells were found to be acceptable for subsequent analysis on the basis of having sufficient RF signal strength and acceptable fits of spatial Gabor functions to their ‘X-Y’ plots. 'X-Y' plots of the RF profiles are displayed in Figure 3.1, for these nineteen cells, which include four cells deemed to be unacceptable. The 'X-Y' plots shown in the Figure represent the temporal slice of the STA corresponding to the temporal peak of the RF (which peaked at approximately 40ms leading up to a spike). These 'X-Y' plots are consistent with previous cat primary visual cortex neurophysiology in that all of the profiles (for all 19 that are shown) can well be modeled as filters with Gabor wavelet-type impulse responses. Those RF profiles that were not well fit by Gabor functions simply did not have sufficient RF signal above noise (for 7 of the 26 cells, leaving the 19 that are shown below). This group of RF profiles demonstrates a smoothly varying orientation preference, which is expected according to the known organization of primary visual cortex in cats, and the cross-columnar orientation of the polytrode as it penetrated the cortical layers, for this track. The best-fit Gabor functions revealed the preferred spatial frequency of the cells to range from 0.03 to 0.3 cycles per degree, consistent with spatial frequency preferences measured in response to gratings (data not shown) and being found in area 18, not 17 (Movshon et al., 1978b) Four more cells out of the nineteen shown below were excluded from subsequent analysis because the RF profile at those points where there was sufficient signal above noise were not well-fit by a Gabor. Although these plots of the RF signal are not shown here, the excluded cells, identified by asterisks in Fig. 3.1, can be seen here to be those where only a single ON or OFF region is strong. For these four cells, there is not enough signal in the flanking regions to survive the cut-off, and they were deemed not sufficiently reliable for testing the LN model hypothesis. The full spatio- 42 temporal profile for one of these cells that was excluded, neuron 27, is shown in Figure A.2, along with the plot of the RF signal. Figure 3. 1 ‘X-Y’ Plots of RF Profiles Along the Polytrode The lines indicate the recording site that corresponds to the location of the peaks of the multi-site spike waveform that identified each cell, thus displaying the relative positions of the cells that were recorded along this track. RF-profiles were obtained by spike-triggered averaging using the 32- by-32 pixel m-sequence experiment, and only the 'X-Y' plot at the peak time delay is shown for each cell in the left-hand column. Best-fitting spatial Gabor functions to the corresponding ‘X-Y’ plots are shown in the right-hand column. Corresponding parameter values of these fits are listed in Table B.1 in the appendix. The colour bar indicates the scaling of the STA intensities, with blue representing the minimum, and red, the maximum values. Asterisks indicate cells excluded from subsequent analysis. RFXY GXY 43 3.1.2 Plotting the Linear Response Prediction vs. the Observed Response The RF profiles were convolved with the same m-sequence stimulus that was used to compute them, to give the linear response prediction to the m-sequence. The observed response, R(t) was taken as the spike count occurring during each 20ms-long frame of the single presentation of the 65536-frame long m-sequence stimulus, and so could only take on an integer value. The resulting linear response values, L(t) were plotted with the corresponding observed spike rate, R(t), to give raw L vs. R plots, like the one shown in Figure 2.6. Condensing these diffuse scatter plots by non- parametric smoothing (Ringach & Malone, 2007), the problem of describing the mapping of linear responses into observed spike rates was reduced to a one-to-one mapping, as seen in the examples below. The smoothing procedure found that consistently captured the L to R mapping was to divide up the entire domain of the raw scatter plot into roughly thirty evenly-spaced divisions, and to then either add more divisions in between or merge them together, until an appropriate number of data points fell within each. For bins near where L approaches zero, widths decreased, typically by a factor of ten, until no more than 5000 points fell within the bin, due to the very high density of points there. At the very negative and positive extremes of the range of L values, where data points were sparse, bins had to instead be merged until at least ten points fell within a bin. With these appropriately chosen divisions of the domain, the raw data was re-expressed as the average spike rate, with variance and with standard error across the subset of points that fell within each resulting bin. The precise total number of resulting smoothed data points was unique to each distribution, and thus was different for each cell, but was always roughly close to 200 points, for real and virtual cells. This procedure was repeated for each of the cells, and proved to be successful in capturing the mapping of linear response, L to observed spike rate, R, for each cell, as seen with the examples shown below, in Figure 3.2. 3.1.3 Optimizing N(L) parameters After smoothing the L vs. R m-sequence data, the fitting procedure always converged successfully, to give the optimal combination of parameter values of the formal expression of the static nonlinearity from equation 2.3 that provided the best fit to the smoothed data. Examples of the resulting best-fit static nonlinearity are plotted below in Figure 3.3, for the same two cells for which smoothed L vs. R data was plotted in Figure 3.2. The form of the static nonlinearity that was chosen appeared to well-describe the L to R mapping of all of the neurons much better than a 44 simple half-rectifier would. The addition the input noise parameter, σn, was clearly necessary, to properly capture the LN functional model for these cat primary visual cortex cells, identified using the m-sequence method. Figure 3. 2 Plots of Smoothed Linear Prediction vs. Observed Response The L vs. R data distributions were non-parametrically smoothed as was described in Figure 2.6, and the results are shown here for neuron 18 in part A), and for neuron 26 in part B). Blue error bars show the variance of each data point, and SEMs are shown with green error bars. The SEM was calculated by taking the square root of the variance of each smoothed data point divided by the number of raw data points that gave each smoothed data point. B) A) MSEQ, Actual N26 MSEQ, Actual N18 R L R L 45 Figure 3. 3 Plots of the Best-Fitting Static Nonlinearity N(L) The optimal combination of the parameters of equation 2.3 were found by adjusting them, using a downhill Nelder- Mead simplex algorithm, to obtain the best fitting solution in the least squares sense, to describe the smoothed L vs. R m-sequence data. The resulting best-fit static nonlinearity, N(L) is plotted here in magenta, superimposed onto smoothed L vs. R data that is plotted with SEM bars in green, for cell 18 in part A), and for cell 26 in part B). Best- fitting parameter values and associated χ2probability of goodness of fit are included in insets. 3.2 RF Signal Strength The total signal in the RF profile was measured for all of the cells that were recorded. The resulting values obtained for nineteen out of twenty-six cells, are shown below, in Figure 3.4 B) MSEQ, Actual N26 MSEQ, Actual N18 θ = 18 A = 0.08 σn = 13 Χ2 p. = 0.078 θ = 27 A = 0.05 σn = 15 Χ2 p. = 0.0298 A) R L R L 46 Figure 3. 4 Measure of Real & Simulated RF Signal Strength The total RF signal was measured as described in Figure 2.11 in the Methods. The resulting values are plotted here as a function of the rank number of the cells in decreasing order of their RF signal strength for the 19 out of 26 cells initially found to have sufficient RF signal for subsequent analysis. The resulting values for real cells are shown in part A), and for the virtual counterparts in part B). Real and virtual groups of cells followed the same ranking, which was done based on the real RF profile signal strength. Dashed lines are exponential fits to data points. Points encircled in black correspond to cells excluded from subsequent analysis; plots of the RF profiles and RF signals are shown in Figure A.2 in the Appendix for the cell corresponding to the point encircled by the thick black line. The point encircled in red is the cell with the strongest RF signal, as seen in Figure 3.5. Nineteen cells initially deemed suitable for testing the LN model hypothesis were ranked according to the strength of the signal measured in the RF. First the cells were sorted in decreasing order according to three independent measures of their RF signals: the total RF signal obtained as 47 was described in section 2.6, the amplitude of the best-fit spatial Gabor function, and the peak spatio-temporal RF profile value. Taking the average of the three rank numbers gave the final ordering, which was used to give the horizontal axes of the plots in Figure 3.4 above. The point encircled in red corresponds to the cell with the strongest RF signal. The full plots of the spatio- temporal RF profiles for this cell and for its virtual counterpart are shown in Figure A.1 in the appendix, along with the plots of their RF signals. The spatio-temporal RF profile of this cell and the plot of its RF signal resemble those that were shown in the example of a strong RF signal that were seen in Figures 2.11 and 2.12. The values encircled in black in the above plots of rank number vs. total RF signal correspond to those cells excluded from testing the LN model hypothesis, leaving 15 cells for measuring the accuracy of the functional model. The full plot of the spatio-temporal RF profiles for the cell corresponding to the point encircled in bold black above, and for its virtual counterpart, is shown in Figure A.2 in the appendix, along with the plots of their RF signals. These plots were included to show that it was reasonable to exclude this cell from subsequent analysis given the weakness of the RF signal as seen visually, especially for the virtual counterpart, even though judging by its position according to the ranking it would be expected to be reliable for testing the LN model hypothesis. Although the ranking of cells in this way was meant to reflect the RF signal, clearly it does not objectify the problem of deciding which cells to include in subsequent analysis. The ranking has nonetheless provided a convenient way to display a breakdown of the putative simple cells found along the track and be able to identify them by some measure for relating them to the LN model theory. 3.3 Effect of the Procedure for Identifying the Functional Model Before going on to show the measurements of the model accuracy that were made for responses to the natural scene, the effect of the current procedure for identifying the functional model must be explained. Virtual responses to the m-sequence were generated according to a Poisson process, as was illustrated in Figure 2.10, which were then used for (re) estimating the LN model parameters. The resulting RF signal for one of the virtual cells is shown below alongside its real counterpart. 48 Figure 3. 5 Real and Virtual RF Signal Maps The RF signal map is shown for real cell 18 in part A) and for its virtual counterpart in part B). The colour bar on the right shows the scale of RF variance values, ranging from zero up to the maximal. For the real cell, 82 pixels gave a total signal of 495, whereas 52 pixels (51% of real) gave a total signal of 251 (63% of real) for the virtual cell. In the case of the virtual responses, the true LN model parameters are known – they are the ones that were estimated using the actual responses to the m-sequence. It is consistently seen across the group of real and virtual m-sequence response pairs that the spike-triggered average stimulus computed for the virtual responses had a consistently smaller signal to noise ratio, as assessed by measuring the total RF signal, as seen in the example shown above in Figure 3.5. Even though the virtual responses to the m-sequence were indeed defined as a pure rate code given by the real cell’s LN-model in response to the m-sequence, a random process generated the spike times. The average time-varying spike firing rate taken across repeats of Poisson spike trains (or equivalently, across identical cells, simultaneously) has a variance that is at every point in time equal to the current value of the firing rate (Heeger, 2000). Thus, taking a single instance of a Poisson spike train, as was done for the response to the m-sequence that was used for computing the spike- triggered average stimulus, and for identifying the static nonlinearity, gives a noisy estimate of the expected spike-firing rate. For all of the pairs of real and virtual cells in the group, the virtual RF profiles had an overall RF noise level that was 2 to 3 times greater than that seen in the real RF profiles, from which the simulated spike trains were generated. This effect is seen in the graphs of Figure 3.4 above, which can now be seen as the total signal of the true RF profiles that were used to define the virtual LNP models plotted in part A), and for the estimated RF profiles in part B). The total estimated RF B), Virtual A), Real 49 signal had a similarly decreasing relationship across the group as was seen for the true RF profiles, but the values of the total estimated RF signal were increasingly scaled down with increasing rank number, relative to the values of the true total RF signal. The total estimated RF signal is scaled down by half for the first two cells, and down to a tenth for the last three cells. The decreasing amount of total RF signal remaining after generating a Poisson spike train and using it to (re-) estimate the RF profile that occurs with increasing rank number is understood by imagining an iceberg effect. The greater the rank #, the less ‘ice’ there was ‘above the water’ in the true RF profile to start off with, so a similar level of noise added would leave a much tinier amount of total ice left above water in the estimated RF profile, compared to a cell that had a very large RF iceberg to start with. The estimated virtual RF profiles still resemble the true virtual RF profiles; this visual observation, as seen in the representative example in Figure 3.5, and in Figure A.1 in the appendix, was confirmed by computing correlation coefficients between true and estimated RF profiles, and between true and estimated RF signal maps, which are plotted in Figure A.3 in the appendix. Assuming cells function as LNP models, the result of analyzing the RF estimation procedure shows that the m-sequence reverse correlation method still gives an accurate estimate of the true RF profile, with loss of precision that worsens with decreasing RF signal strength. This diminished RF signal to noise ratio in the virtual estimated RF that is caused by Poisson spike firing variability also affects the form of the static nonlinearity that is identified for the virtual cells, in the way that is shown for one example below, in Figure 3.6, that is representative of the group. 50 Figure 3. 6 Effect of LNP Simulation on Static Nonlinearity Best-fitting static nonlinearity functions are plotted in magenta for the actual spike rate recorded in response to the m- sequence, and in red for the virtual one. Smoothed L vs. R data to which the function was fit is also plotted, with standard error bars in green for the actual spike rate recorded in response to the m-sequence, and in black for the virtual one. The quality of fit of the same static nonlinear function of equation 2.3 to virtual smoothed L vs. R data is as good as is seen for the real cells, as in the representative example seen above. The resulting fits to smoothed real and virtual data are shown together, which, as before for the RF signal, can now be thought of as the true and the estimated data, respectively, for a virtual cell. The estimated mapping of the linear response to the observed spike-firing rate in this example has a noticeably shallower slope relative to the true static nonlinearity that was used to generate it. This effect is reflected in the parameter values of the best-fitting functions, which are listed in Table A.2 in the appendix, and for which the distributions are plotted below, in Figure 3.7, for the real and virtual cells. Across the group, the only difference is that the estimated gain, A is found to be consistently smaller, while the threshold, θ and input noise parameters, σn do not change MSEQ, Actual & Virtual, N18 Actual: θ = 18 A = 0.08 σn = 13 Χ2 p. = 0.078 Virtual: θ = 21 A = 0.06 σn = 18 Χ2 p. = 0.0019 51 significantly, all of which reflects the change of shape due to (re-) estimation of the static nonlinearity as is seen in the above example. Figure 3. 7 Parameter Values of the Best-Fitting Static Nonlinearity N(L) Distributions of the values for each of the three parameters of the best-fitting static nonlinearity to the smoothed L vs. R m-sequence data are plotted. Threshold values are shown in part A) for real cells and in part D) for virtual ones. Gain values are shown in part B) for real cells and in part E) for virtual ones. Input noise values are shown in part C) for real cells and in part F) for virtual ones. θ, Actual A, Actual σn, Actual θ, Virtual A, Virtual σn, Virtual D) A) E) B) C) F) 52 The values of the virtual estimated parameters of the static nonlinearity, relative to the true virtual parameter values show that assuming LNP operation, model-predicted spike firing rates would be simply scaled down by a constant factor relative to observed spike-firing rates, causing a slight mismatch. However, this effect would not reduce values of the correlation coefficient between model-predicted and observed firing rates, thus reflecting an LN model that is still able to accurately predict responses to any stimulus. In contrast, a reduction of the correlation coefficient is expected due to imprecise RF estimation from reverse correlation of Poisson spike variability with the m-sequence noise stimulus. The resulting diminished accuracy of the LN model will be worse the weaker the linear receptive field profile, and it will be worse overall for the test stimulus than for the m-sequence. 3.4 Model Accuracy The resulting scatter plots of the model-predicted and recorded spike-firing rates in response to the natural scene are shown below, in Figure 3.8, for the same two example cells for which the previous results were shown. The weak agreement between model-predicted rate, N(LNS) and observed rate RNS shown for the two examples are representative of those seen across the entire group, revealing the limited accuracy with which the LN model captures actual area 18 responses to the natural scene. Correlation coefficients between model-predicted and observed rates in response to the natural scene range from 0.14 to 0.59. In contrast, the virtual LN models result in correlation coefficients between model-predicted and observed rates in response to the natural scene that range from 0.47 to 0.88; that is, there is little overlap with real results. The agreement between model-predicted rate, N(LNS) and observed rate RNS for ‘LNP’ responses is shown in the plots in Figure 3.10, below, for the two virtual cells that correspond to the actual responses that are shown in Figure 3.8. The distributions of the resulting correlation coefficients for the natural scene are shown in Figure 3.9, below, alongside the distributions of correlation coefficients between model-predicted and observed rates in response to the m-sequence. Full results are listed in Table 4.1, in the following section. When it comes to predicting the response to the same stimulus that was used to identify the functional model, actual and virtual functional models have indistinguishable ranges of prediction accuracy: from 0.84 to 0.92 and from 0.83 to 0.96, respectively. This result is also seen in the 53 plots of N(L) vs. R plots in response to the m-sequence, shown for the same two cells and for their virtual counterparts, in Figures A.4 and A.5, respectively, in which the N(L) to R mapping is linear. Figure 3. 8 Plots of LN-Model Predicted vs. Observed Rate for Natural Scene LN model prediction accuracy of the cells that were recorded in cat area 18 was assessed using a 250-frame long natural scene stimulus that was repeated seventy-five times. The scatter plots of the LN model – predicted, N(LNS) vs. the observed, RNS spike firing rates in response to the natural scene are shown here for same two representative cells; for neuron 18 in part A), and for neuron 26 in part B). Correlation coefficients are included in the insets. Dashed lines are the line of equality along which the correlation coefficient would be equal to 1. B) A) Natural Scene, Actual N18 Natural Scene, Actual N26 54 Distributions of the resulting correlation coefficients between model-predicted and observed firing rates, in response to the natural scene and to the m-sequence, are shown below, in Figure 3.9. Figure 3. 9 Correlation Coefficients between Predicted and Recorded Responses Distributions of correlation coefficients between model-predicted and recorded spike-firing rates are plotted, for the group of real cells obtained with the older spike sorting method in response to the real natural scene data in part A) and in response to the m-sequence in part B); for the group of virtual counterparts in response to the natural scene part C), and in response to the m-sequence in part D). Representative resulting plots of virtual model-predicted, N(LNS(t)) vs. observed, RNS(t) responses to the natural scene are shown in Figure 3.10 below. As was expected after having analyzed the effect of model parameter estimation given Poisson spike variability, the predicted firing rates under-estimate the observed firing rates, by a constant scaling factor, as was described in Figure 3.6. Thus the simulated N(LNS(t)) vs. RNS(t) shown in Figure 3.10 lie along the line of equality, B) A) NS, Actual C) D) NS, Virtual MSEQ, Virtual MSEQ, Actual 55 confirming that the standard primary visual cortex functional model accurately captures simulated responses to the test stimulus. Figure 3. 10 Plots of Simulated LN-Model Predicted vs. Observed Rate for Natural Scene Virtual LN model prediction accuracy was assessed with the same natural scene stimulus used during recording of actual area 18 cells. The scatter plots of the LN model – predicted, N(LNS) vs. the observed, RNS spike firing rates in response to the natural scene are shown here for virtual counterparts of the same two representative cells; for neuron 18 in part A), and for neuron 26 in part B). Correlation coefficients are included in the insets. Dashed lines are the line of equality along which the correlation coefficient would be equal to 1. B) A) Natural Scene, Virtual N18 Natural Scene, Virtual N26 56 4 Discussion and Conclusion 4.1 The Standard Model is Partially Accurate In this thesis I tested the currently generally accepted model of primary visual cortex simple cells – the LNP model – by using model receptive field parameters derived from responses to an m- sequence stimulus to predict responses to a natural scene stimulus. Overall, there was agreement between resulting LNP model predictions and observed responses to the natural scene, but it was weak, as compared to those obtained for simulated LNP responses. For virtual responses, any deviation away from perfect prediction of the test stimulus was due to error in estimation of model parameters because of the random nature of the process that was used for generating the simulated spike trains. Figure 4.1 Plot of Cell Rank vs. LNP Model Prediction Accuracy Values Resulting correlation coefficient values between LNP-model-predicted and observed responses are plotted as a function of the cell rank number, for fifteen simple cells. Values obtained in response to the natural scene and to the m-sequence are shown together, for actual and virtual cells. Correlation coefficient values between LNP-model-predicted and observed responses are plotted as a function of the cell rank number, for fifteen putative simple cells, in Figure 4.1, above, revealing the poor accuracy with which the LNP model captured the actual area 18 responses to 57 the test stimulus. There was practically no overlap between the full ranges of correlation coefficient values between model-predicted and observed responses to the natural scene obtained for actual area 18 neurons (red circles) as compared to simulated LNP responses to the natural scene (blue circles). The values that are shown graphically above are also listed in Table 4.1, below. Table 4. 1 Model Prediction Accuracy Values Correlation coefficient values between LNP-model-predicted and observed responses are listed, for fifteen putative simple cells, in order of their RF signal strength. The first column lists the cell identification numbers, which correspond to those listed in Figure 3.1. Correlation coefficient values for natural scene response predictions are listed in the next two columns, and for m-sequence responses in the rightmost columns. Extreme values of each group are listed at the bottom. The underlined value in the group of ‘virtual’ LNP response predictions of the natural scene is the only one that is less than some of the values in the group of ‘actual’ area 18 response predictions; the rest are greater. In Table 4.1, the correlation coefficient for the virtual LNP response to the natural scene for neuron 17 is underlined to distinguish it from the other values in this group because it is the only one that is less than any of the values obtained for real neurons in response to the natural scene. The LNP model always gave significantly more accurate predictions to simulated LNP responses than to actual area 18 responses. Thus the conclusion is that the LNP model does not fully capture the function of any of the putative simple cells that were recorded. CorrCoef (N(LNS)*RNS) CorrCoef (N(LMSEQ)*RMSEQ) Cell # Actual Virtual Actual Virtual 26 0.52 0.88 0.85 0.86 18 0.35 0.87 0.91 0.90 25 0.45 0.61 0.87 0.89 22 0.58 0.79 0.89 0.89 10 0.49 0.70 0.84 0.90 5 0.51 0.87 0.88 0.91 0 0.14 0.75 0.91 0.94 2 0.59 0.88 0.87 0.94 24 0.48 0.69 0.89 0.90 23 0.16 0.74 0.92 0.96 16 0.42 0.70 0.89 0.93 9 0.57 0.74 0.90 0.89 11 0.51 0.67 0.90 0.87 17 0.34 0.47 0.87 0.85 6 0.47 0.66 0.86 0.83 Min 0.14 0.47 0.84 0.83 Max 0.59 0.88 0.92 0.96 58 Measuring the correlation coefficients between model-predicted and observed responses to the m- sequence was done to control for any problem in the procedure used for estimating model parameters and for calculating the resulting correlation coefficients. As expected, the resulting correlation coefficient values for m-sequence responses confirm that the model accurately (re) predicts the response to the same stimulus that was used to compute the model parameters, just as accurately for the actual simple cells (black triangles, in Figure 4.1) as for the virtual responses that were generated using the model (green triangles). The LNP model did not capture the virtual responses to the test stimulus (blue circles) as accurately as for the control stimulus (green triangles), but knowing its construction, this effect can be attributed to the imprecise estimate of the linear receptive field due to the Poisson spike variability. For both the m-sequence and the natural scene, measured response values were equal to response variances, for all fifteen real and virtual cell pairs, consistent with Poisson variability as others have also observed in cortical cells (Kara et al., 2000). Thus the variability of the actual simple cells was not found to be in excess of that of the simulated LNP responses; other reasons must therefore be sought that might explain why the model did not work as well as it could have. 4.2 Potential Sources of Experimental Error The data employed for carrying out the preceding analysis was obtained using a state-of-the-art approach for recording primary visual cortex responses, leaving little to question with regard to whether true area 18 responses were well measured. In fact, the real spike trains were specified with 1000 times greater precision than the simulated LNP responses. On the other hand, the cells were recorded in vivo, over many hours (5.5 hrs lapsed between the natural scene and the m- sequence recordings being used here). Potential sources of experimental error have to be considered as reasons for why the LNP model did not fully describe the behaviour of the cells, unlike in the case of ideal virtual responses. 4.2.1 Movement of the Eyes during Recording The deficiency of the model in predicting responses of area 18 neurons to the natural scene could have been because the location of the receptive fields could have shifted during recording. If there was a change in location of either the eyes or of the stimulus display after the end of the natural scene and before the presenting the m-sequence, it would result in accurate prediction by the 59 model of the response to the latter only, for all of the cells, just as was observed. In order to rule out either of these possibilities, the accuracy of the model was measured for a variety of displacements of the natural scene relative to the m-sequence, to reveal whether there was a single location of the stimulus display that maximized the accuracy of the model prediction. If everything had been done properly, then the displacement found to produce the best prediction of the observed response to the natural scene would agree with the values of the two stimulus display locations that had been noted at the times of the recordings. n25 CorrCoef (N(LNS)*RNS) n26 CorrCoef (N(LNS)*RNS) RF shift Real RF shift Real (1.0, 2.7) 0.39 (1.0, 2.7) 0.28 (0, 0) 0.52 (0, 0) 0.36 (0.0, 5.4) 0.06 (0.0, 5.4) 0.04 (2.0, 0.0) 0.02 (2.0, 0.0) -0.01 (2.0, 5.4) -0.04 (2.0, 5.4) -0.09 (-1.7, -1.7) -0.05 (-1.7, -1.7) 0.003 (-1.7, 3.7) -0.04 (-1.7, 3.7) 0.07 (3.7, -1.7) -0.07 (3.7, -1.7) -0.14 (3.7, 3.7) -0.07 (3.7, 3.7) -0.10 n18 CorrCoef (N(LNS)*RNS) n22 CorrCoef (N(LNS)*RNS) RF shift Real RF shift Real (1.0, 2.7) 0.06 (1.0, 2.7) -0.10 (0, 0) 0.45 (0, 0) 0.58 (0.0, 5.4) -0.07 (0.0, 5.4) 0.06 (2.0, 0.0) -0.07 (2.0, 0.0) 0.55 (2.0, 5.4) 0.07 (2.0, 5.4) -0.11 (-1.7, -1.7) 0.02 (-1.7, -1.7) 0.42 (-1.7, 3.7) 0.11 (-1.7, 3.7) -0.05 (3.7, -1.7) -0.01 (3.7, -1.7) -0.13 (3.7, 3.7) -0.02 (3.7, 3.7) -0.08 Table 4. 2 Effect of Stimulus Displacements on Prediction Accuracy Correlation coefficients between model-predicted and recorded responses to the natural scene are listed above, for nine displacements of the RF relative to the test stimulus. The results are repeated for the four cells with the strongest RF signals, each set listed in a different quadrant. The displacements are listed in the left column of each quadrant, giving the x and y displacement of the natural scene relative to the receptive field (m-sequence) in units of degrees. The maximum correlation coefficient in each quadrant is typed in boldface, identifying a single location where all four LN models captured recorded responses with maximal accuracy. The resulting correlation coefficient values measured between model-predicted and observed responses to the natural scene are shown above, in Table 4.2, for the four neurons with the strongest RF signal, at each of nine different displacements of the m-sequence relative to the 60 natural scene displays. The only location corresponding to the peak prediction accuracy for all four of the cells included in the test corresponded exactly with the location of the display noted during the actual recording. This result supports the assumption that nothing moved between the natural scene and the m-sequence recordings that were made, which were later used for testing and for identifying functional models, respectively. Secondly, obtaining positive correlation coefficients for all four cells only when the stimuli are properly aligned relative to each other confirms that the correlation coefficient values were significantly above chance. 4.2.2 Computational Error The excellent accuracy of prediction of all area 18 responses to the m-sequence already ruled out the possibilities of major errors that there may have been in the procedure of estimating LN model parameter values. Also, the alignment of the model-predicted firing rate with the actual spike firing rate in response to the natural scene, as seen when plotted together as in the examples in Figure 4.2, showed that correlation coefficient values had indeed been computed correctly, just as for the virtual responses, like the one in Figure 4.3. 4.3 Deviations of Simple Cell Responses The discrepancy between predicted and observed responses to the natural stimulus as seen for virtual LNP cells, as in Figure 4.3, is largely explained by a constant down-scaling of the predicted responses; otherwise it is a minor effect of unpredictability from noisy RF estimation. In contrast, there are certain times during the natural scene stimulus when real cells are silent while the LNP model predicts them to fire action potentials; at other times they fire reliably while the model is silent. When they do respond at the correct time, it is more often by firing either more or less than predicted. Deviations from expected behaviour of visual cortex cells that were thought to behave as simple cells are not uncommon. Underlying physiological reasons for the present response discrepancies that are not explained by the simple cell model will now be considered. 61 Figure 4. 2 Firing Rate across Repeats of Natural Scene Examples of a segment of the firing rate during a natural scene are shown for neurons 18 and 26. The observed firing rate, RNS(t) is plotted as a solid blue line with variance error bars, which was obtained by taking the average over the 75 responses to the repeated stimulus. The LNP model-predicted firing rate, N(LNS(t)) is the solid black line. The entire stimulus took 5 seconds to display; the horizontal time scale is in seconds, the vertical rate is in spikes per stimulus frame. B) A) Natural Scene, Actual N18 Natural Scene, Actual N26 62 4.3.1 Response Suppression When multi-point structures found in the natural scene coincide with a cell’s receptive field, they stimulate the cell much more strongly than the m-sequence. The LNP model as implemented here to simulate firing rates observed in response to the natural scene assumes that responses do not saturate with increasing stimulus intensity. At the specific times when patterns in the natural scene activate cells very strongly, virtual cells respond just as vigorously, with no upper limit imposed onto their responses. Instead, when looking at the shape of the static nonlinearities for several of the putative area 18 simple cells for the m-sequence, for several of them, there is evidence of saturation at the highest linear response prediction values. This effect was seen most clearly in the plot of the linear response vs. observed rate to the m-sequence alongside the corresponding data that was obtained using this LNP implementation that did not have any response saturation, in Figure 3.6. Similar minor saturation effects as this were seen for seven other area 18 responses to the m-sequence. Figure 4. 3 Simulated Firing Rate across Repeats of Natural Scene Examples of a segment of the simulated firing rate during a natural scene are shown for neuron 26. The observed firing rate, RNS(t) is plotted as a solid blue line with variance error bars, which was obtained by taking the average over the 75 responses to the repeated stimulus. The LNP model-predicted firing rate, N(LNS(t)) is the solid black line. The entire stimulus took 5 seconds to display; the horizontal time scale is in seconds, the vertical rate is in spikes per stimulus frame. Natural Scene, Virtual N26 63 If observed firing rates are already being found to be lower than predicted in response to the m- sequence, the same effect would be seen even more drastically in response to the natural scene, in which the stimulus statistics activate area 18 cells more strongly. The response of the cell shown in part A) of Figure 4.2 is an example of one that is far less than computed by the LNP model for most of the predicted response peaks during presentation of the natural scene stimulus. There are points during the response of the cell in part B) of Figure 4.2 where it too is very much less than predicted, but it does not occur as often as with the first cell that was shown. The scatter plots of predicted, N(LNS) vs. observed, RNS responses to the natural scene were shown in Figure 3.8 for these same two cells. As can expected after seeing the average firing rates plotted as functions of time, the scatter plot for the first cell reflects how it responds more faintly overall, compared to the second cell. Roughly half (7 out of 15) of the cells had N(LNS) vs. RNS scatter plots demonstrating similar behaviour to the first cell, while the rest were more like the second one. However, even for the cells where the scatter plots are closer to linear, like the response in part B), the tendency is always apparent for responses at high predicted firing values to be less. Thus it appears that there is some mechanism at work to inhibit the responses of all of the cells that were studied here, but it is not a static mechanism that could be described by simple saturation of responses at high response levels. For some time, divisive normalization of visual cortex responses has been modeled to describe certain aspects of their function, such as contrast gain control, that are not captured by versions of simple cell models like the one being used here (Heeger, 1992). This divisive normalization is implemented as mutual inhibition from neighbouring cortex cells, and explains non-specific suppression and saturation observed in primary visual cortex responses (Heeger, 1992). This mechanism of normalizing responses would probably explain a significant portion of the present results, but not nearly all of the observed deviation from the LNP model prediction. Divisive normalization would indiscriminately inhibit the responses of the cells simply according to the strength of their activation. However, in the present case, there is no adaptation occurring to overall light level; and cells seem to be responding in distinct ways when the LNP model predicts similar levels of response. Including non-specific suppression by neighbouring cells to the present functional model would therefore improve the prediction accuracy only by a very limited amount. 64 4.3.2 Properties of Natural Stimuli What the present model must certainly fail to do is to capture aspects of the cells’ responses that are specific to natural scene stimulation. Natural scenes are remarkably different from white noise, with strong multipoint spatial and temporal correlations, due to the presence of forms in the visual patterns. In the last decade, visual neurophysiologists have refined techniques for accurately describing receptive field properties of mammalian primary visual cortex neurons from their responses to natural scenes (Smyth et al., 2003; David et al., 2004; Sharpee et al., 2006). Linear receptive field profiles estimated with sequences of natural images in ferrets revealed spatial structures not consistent with the linear Gabor model of simple cells (Smyth et al., 2003), including structure far outside central receptive fields and sometimes having orientation orthogonal to the preferred. The structure of patterns in natural scenes may therefore be evoking response from parts of simple cell receptive fields that were not activated by the m-sequence, such that the present functional model may have been failing to predict when these hidden parts of the receptive field were causing the cell to be active. Studying receptive field properties measured in response to dynamic natural scenes, either made by saccades over still images or as movies, reveal further aspects of mammalian visual cortex neural function that remain hidden when studying receptive field properties using other types of stimuli such as white noise and gratings. These response properties include improved sensitivity to under-represented frequencies seen in cats (Sharpee et al., 2006), and an inhibitory temporal component seen in monkeys (David et al., 2004). Even if linear receptive fields remain approximately the same under different stimulation paradigms, knowing the effects of natural scenes in activating primary visual cortex cells differently from other stimuli means that if these aspects had been captured by the present model, it would predict responses to the natural scene significantly better. 65 Figure 4.4 Location of Linear Spatial RF Profiles on a Frame of the Natural Scene All of the cells that were included in this study were recorded along a single polytrode track, such that their spatial receptive field profiles had roughly the same retinotopic coordinates; the pink ellipse on the natural scene frame gives the approximate RF location and size of all of the area 18 cells during the recordings. The ellipse representing the average RF location is centred at (1.0, 1.2) degrees relative to the centre of the natural scene display; the half-lengths of the major and minor axes measure 5.4 and 5.6 degrees respectively. These coordinates refer to the display on the stimulus monitor during recording, in which the natural scene and m-sequence frames were both displayed in 12.7 by 12.7 degree square windows. 4.3.3 Extra-Classical Effects An example of a frame of the natural scene movie that was played is shown in Figure 4.4. The location of the classical receptive field, which was roughly the same for all the cells included in this study, is also shown on the movie frame image. The RF location and size was found by taking the average across the centres and widths of the Gaussian portions of the spatial Gabor functions that were fit to the m-sequence RF profiles of these cells; best-fitting Gabor function parameter values for all the cells are listed in Table B.1. As this example shows, natural scenes have structures that extend far beyond the central RF field. Taking into consideration studies of extra-classical receptive field properties, it is likely that the natural scene movie activated extra-classical influences of the simple cell responses, through both centre and surround mechanisms that would have at times activated and other times suppressed responses of the putative area 18 simple cells (Cavanaugh et al., 2002). 66 Instances where cells lie along co-circular structures with each other could activate them more, and they might be a bit more suppressed when they are not co-circular, compared to the LNP model prediction of their response (Samonds et al., 2006). 4.4 Conclusion The accuracy of the standard model that is currently accepted as describing the function of primary visual cortex neurons was tested here by comparing its prediction to responses of putative area 18 simple cells recorded in a cat in response to a natural scene movie. The parameters of the model were fit from the responses of fifteen cells to an m-sequence stimulus, estimating the receptive field profile by taking the spike-triggered average stimulus, and fitting three free parameters of an expansive form of a static nonlinearity, to use in computing the expected firing rate in response to any stimulus. The significance of the observed accuracy of the functional model was assessed by computing correlation coefficients between model-predicted and observed responses to a repeated natural scene movie and comparing resulting values, for fifteen area 18 neurons and fifteen virtual responses that were generated using the LNP model. The present model always predicted virtual responses significantly more accurately than real responses recorded from the primary visual cortex. A brief overview was done of known physiological aspects of primary visual cortex function to explain why the generally accepted model could not fully capture the responses of any of the cells in response to a natural scene stimulus. Basically, the present approach used for estimating the model, using a noise stimulus, was likely not able to capture aspects of the cells responses, linear or otherwise, that were active during the natural scene response but not during the m-sequence that was used for identifying the model parameters. Future work might test some of the predictions described here, to better describe primary visual cortex responses in light of improved understanding of their role in natural sensory perception, providing evidence that can help distinguish important refinements made on to current neural computation theory. 67 Bibliography Barlow RJ, 1989. Ch. 3, “Theoretical Distributions,” Ch. 5, “Estimation,” & Ch. 8, “Taking Decisions,” Statistics: A Guide to the Use of Statistical Methods in the Physical Sciences. Mandl F, Ellison RJ & Sandiford DJ, Ed. Wiley & Sons, Toronto, 1989. Blanche TJ, Spacek MA, Hetke JF, Swindale NV, 2005a, Polytrodes: High Density Silicon Electrode Arrays for Large Scale Multiunit Recording, J Neurophysiol, 93:2987-3000. Blanche TJ, 2005b, Large Scale Neuronal Recording, PhD thesis, University of British Columbia. Campbell FW & Robson JG, 1968, Application of Fourier Analysis to the Visibility of Gratings, J Physiol,197: 551-566. Campbell FW, Cooper GF & Enroth-Cugell C, 1969, The Spatial Selectivity of the Visual Cells of the Cat, J Physiol, 203: 223-235. Cavanaugh JR, Bair W & Movshon JA, 2002, Nature and Interaction of Signals From the Receptive Field Center and Surround in Macaque V1 Neurons, J Neurophysiol, 88: 2530-2546. Chichilnisky EJ, 2001, A Simple White Noise Analysis of Neuronal Light Responses, Network: Comput. Neural. Syst., 12: 199-213. Cooper GF & Robson JG, 1968, Successive Transformations of Spatial Information in the Visual System, IEE NPL Conference Proceedings, V32: 134-143. Daugman JG, 1985, Uncertainty Relation for Resolution in Space, Spatial Frequency, and Orientation Optimized by Two-Dimensional Visual Cortical Filters, J Opt Soc Am A, 2(7): 1160-1169. Daugman JG, 1988, Complete Discrete 2-D Gabor Transforms by Neural Networks for Image Analysis and Compression, IEEE Transactions on Acoustics, Speech and Signal Processing, 36(7): 1169-1179. David SV, Vinje WE & Gallant JL, 2004, Natural Stimulus Statistics Alter the Receptive Field Structure of V1 Neurons, J Neurosci, 34(31): 6991-7006. David SV & Gallant JL, 2005, Predicting Neuronal Responses During Natural Vision, Network: Comput. Neural Syst., 16(2): 239-260. Dayan P & Abbott LF, 2001, Ch. 1, “Neural Encoding I: Firing Rates and Spike Statistics”, Ch. 2, “Neural Encoding II: Reverse Correlation and Visual Receptive Fields”. Theoretical Neuroscience: Computational and Mathematical Modelling of Neural Systems. MIT Press, Cambridge, MA, 2001. 68 DeAngelis GC, Ohzawa I & Freeman RD, 1993a, Spatiotemporal Organization of Simple-Cell Receptive Fields in the Cat's Striate Cortex. I. General Characteristics, J Neurophysiol, 69(4): 1091-1117. DeAngelis GC, Ohzawa I & Freeman RD, 1993b, Spatiotemporal Organization of Simple-Cell Receptive Fields in the Cat's Striate Cortex. II. Linearity of Temporal and Spatial Summation, J Neurophysiol, 69(4): 1118-1135. DeAngelis GC, Ghose GM, Ohzawa I & Freeman RD, 1999, Functional Micro-Organization of Primary Visual Cortex: Receptive Field Analysis of Nearby Neurons, J Neurosci, 19(9): 4046- 4064. DeValois KK, DeValois RL & Yund EW, 1979, Responses of Striate Cortex Cells to Grating and Checkerboard Patterns, J Physiol, 291: 483-505. Gardner JL, Anzai A, Ohzawa I & Freeman RD, 1999, Linear and Nonlinear Contributions to Orientation Tuning of Simple Cells in the Cat's Striate Cortex, J Neurosci, 16: 1115-1121. Heeger DJ, 1992, Normalization of Cell Responses in Cat Striate Cortex, Vis Neurosci, 9: 181-197. Heeger, DJ, 2000, Poisson Model of Spike Generation. http://www.cns.nyu.edu/~david/handouts/poisson.pdf, September 5, 2000. Hopfield JJ & Brody CD, 2001, What is a Moment? Transient Synchrony as a Collective Mechanism for Spatiotemporal Integration, PNAS, 98(3): 1282-1287. Hubel DH & Wiesel TN, 1959, Receptive Fields of Single Neurons in the Cat's Striate Cortex, J Physiol, 148: 574-591. Hubel DH, 1960, Single Unit Activity in Lateral Geniculate Body and Optic Tract of Unrestrained Cats, J Physiol, 150: 91-104. Hubel DH & Wiesel TN, 1962, Receptive Fields, Binocular Interaction and Functional Architecture in the Cat's Visual Cortex, J Physiol, 160: 106-154. Hubel DH & Wiesel TN, 1977, Ferrier Lecture: Functional Architecture of Macaque Monkey Visual Cortex, Proc. R. Soc. Lond., 198: 1-59. Jones JP & Palmer LA, 1987, An Evaluation of the Two-Dimensional Gabor Filter Model of Simple Receptive Fields in Cat Striate Cortex, J Neurophysiol, 58(6): 1233-1258. Kara P, Reinagel P & Reid RC, 2000, Low Response Variability in Simultaneously Recorded Retinal, Thalamic, and Cortical Neurons, Neuron, 27: 635-646. Kuffler SW, 1953, Discharge Patterns and Functional Organization of Mammalian Retina, J Neurophysiol, 16: 37-68. 69 Kulikowski JJ & Bishop PO, 1981, Fourier Analysis and Spatial Representation in the Visual Cortex, Exp Brain Res, 44: 286-400. Mancini M, Madden BC & Emerson RC, 1990, White Noise Analysis of Temporal Properties in Simple Receptive Fields of Cat Cortex, Biol Cybern, 63: 209-219. Miller KD & Troyer TW, 2002, Neural Noise Can Explain Expansive, Power-Law Nonlinearities in Neural Response Functions, J Neurosci, 87: 639-659. Movshon JA, Thompson ID & Tolhurst DJ, 1978a, Spatial Summation in the Receptive Fields of the Cat's Striate Cortex, J Physiol, 283: 53-77. Movshon JA, Thompson ID & Tolhurst DJ, 1978b, Receptive Field Organization of Complex Cells in the Cat's Striate Cortex, J Physiol, 283: 79-99. Olshausen BA & Field DJ, 2006, Ch. 10, “What is the Other 85 Percent of V1 Doing?”, 23 Problems in Systems Neuroscience. Van Hemmen JL & Sejnowski TJ, Ed. Oxford University Press, New York, NY, 2006. Payne BR & Peters A, 2002, “The Concept of Cat Primary Visual Cortex”, Ch. 1, The Cat Primary Visual Cortex. Payne BR & Peters A, Ed. Academic Press, 2002. Reid CR & Alonso JM, 1995, Specificity of Monosynaptic Connections from Thalamus to Visual Cortex, Nature, 378: 281-284. Reid CR, Victor JD & Shapley RM, 1997, The Use of M-Sequences in the Analysis of Visual Neurons: Linear Receptive Field Properties, Vis Neurosci, 14: 1015-1027. Reid CR, Alonso JM & Usrey WM, 2002, Ch. 8, “Integration of Thalamic Inputs to Cat Primary Visual Cortex”, The Cat Primary Visual Cortex. Payne BR & Peters A, Ed. Academic Press, 2002. Ringach DL & Malone BJ, 2007, The Operating Point of the Cortex: Neurons as Large Deviation Detectors, J Neurosci, 27(29): 7673-7683. Sakai HM, 1992, White-Noise Analysis in Neurophysiology, Physiol Rev, 72(2): 491-505. Samonds JM, Zhou Z, Bernard MR & Bonds AB, 2006, Synchronous Activity in Cat Visual Cortex Encodes Collinear and Cocircular Contours, J Neurophysiol, 95: 2602-2616. Sharpee TO, Sugihara H, Kurgansky AV, Rebrik SP, Stryker MP & Miller KD, 2006, Adaptive Filtering Enhances Information Transmission in Visual Cortex, Nature, 439: 936-942. Smyth D, Willmore B, Baker GE, Thompson ID & Tolhurst DJ, 2003, The Receptive-Field Organization of Simple Cells in Primary Visual Cortex of Ferrets under Natural Scene Stimulation, J Neurosci, 23(11): 4746-4759. 70 Spacek MA, Blanche TJ, Seamans JK & Swindale NV, 2007, Accounting for network states in cortex: are (local) pairwise correlations sufficient?, Soc Neurosci Abstr, 33:790.1. Spacek MA, Blanche TJ & Swindale NV, 2009, Python for Large-Scale Electrophysiology, Front. Neuroinform, 2: 9. Straw AD, 2008, Vision Egg: An Open-Source Library for Realtime Visual Stimulus Generation, Front Neuroinform, 2: 4. Sutter EE, 1987, “A Practical Nonstochastic Approach to Nonlinear Time-Domain Analysis”, Advanced Methods of Physiological System Modeling, Vol. 1, Marmarelis VZ, Ed. Plenum Press, New York, 1989. Willmore B & Smyth D, 2003, Methods for First-Order Kernel Estimation: Simple-Cell Receptive Fields From Responses to Natural Scenes, Network: Comput. Neural Syst., 14(3): 553-577. urtz RH & Kandel ER, 2000, Ch. 27, “Central Visual Pathways”, & Ch. 28, “Perception of Motion, Depth, and Form”, Principles of Neuroscience, 4th ed. Kandel ER, Schwarz JH & Jessel TM, Ed. McGraw-Hill Professional, Toronto, 2000. W 71 Appendix A Figure A. 1 Example of Measuring Strong Receptive Field Signal Spatiotemporal receptive field profiles and their corresponding signal maps were obtained and plotted as described in the Methods. The results for neuron 18, which had one of the strongest RF signals, are plotted in the top portion, and for its virtual LNP counterpart in the lower portion. 72 Figure A. 2 Example of Measuring Weak Receptive Field Signal Spatiotemporal receptive field profiles and their corresponding signal maps were obtained and plotted as described in the Methods. The results for neuron 27, which had one of the weaker RF signals, is plotted in the top portion, and for its virtual LNP counterpart in the lower portion. 73 Table A. 1 Receptive Field Signal Strength The measures of receptive field strength as described in Methods sections 2.6 and 2.7 are listed here for fifteen actual area 18 simple cells, and their corresponding virtual LNP cells. The numbers identifying the units are in the left hand column, which are listed in order of RF signal strength, determined as was described in section 3.2. The total RF signal, ‘Total Signal’, the amplitude of the best-fitting spatial Gabor function, ‘G Amp.’ and the peak RF value, ‘RF Max’ are listed for the actual area 18 cells. The total RF signal and peak RF values are also listed for the corresponding virtual cells. Correlation coefficients were computed between the actual and the virtual spatio-temporal RF profiles, ‘st-RF’, and also between the actual and the virtual RF signal maps, ‘ROI’, in the rightmost columns. The extreme values of each group are listed at the bottom; typed in boldface for the correlation coefficients because these describe the accuracy and precision of the m-sequence method of RF estimation given Poisson spike variability. Actual RF Signal Strength Virtual RF Signal Strength CorrCoef: Actual * Virtual Unit # Total Signal G Amp. RF Max Total Signal RF Max st-RF ROI 26 467 22.1 24.4 196 8.4 0.68 0.81 18 495 18.7 21.4 251 7.4 0.76 0.91 25 260 18.9 26.5 68 8.8 0.52 0.70 22 392 13.6 19.9 97 5.9 0.59 0.70 10 238 13.8 18.0 42 6.0 0.54 0.53 5 248 9.9 16.8 102 5.9 0.69 0.87 0 130 15.6 17.8 10 5.4 0.38 0.36 2 199 8.1 15.2 54 4.3 0.53 0.69 24 120 8.8 12.1 9 4.6 0.31 0.16 23 124 7.9 9.1 38 3.3 0.63 0.75 16 106 7.1 9.0 23 2.6 0.48 0.65 9 98 6.3 10.4 24 2.7 0.47 0.70 11 103 6.7 8.8 17 2.8 0.49 0.54 17 65 6.9 9.0 7 3.9 0.40 0.36 6 97 5.6 8.8 8 2.7 0.34 0.37 Min 65 5.6 8.8 7 2.6 0.31 0.16 Max 495 22.1 26.5 251 8.8 0.76 0.91 74 Figure A. 3 Correlation Coefficient between Real & Simulated RFs The correlation coefficients between the actual and virtual RFs that were listed in Table A.1 are plotted here, as a function of the cells’ rank number. The correlation coefficients of the st-RF profiles are plotted in the upper portion, and for the RF signal maps in the lower portion. Results are plotted for nineteen putative simple cells, ranked in order of their RF signal strength, which was determined as was described in section 3.2. The encircled values correspond to those that were also circled in Figure 3.4. RF results for the one encircled in red were plotted in Figure A.1, and for the one encircled in thick black in Figure A.2. The other encircled values correspond to those cells that were rejected from subsequent analysis. Dashed lines are the lines of least-square error fit to the data. Rank vs CorrCoef (RealRF *SimulatedRF) 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0 5 10 15 20 Rank # R F C or rC oe f Rank vs CorrCoef (RealROI *SimulatedROI) 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 0 5 10 15 20 Rank # R O I C or rC oe f 75 Cat 15, tr 7c Actual Unit # Theta Gain Sigma Chi Sq DOF Chi Sq Prob 26 27 0.05 15 279 236 0.029 18 18 0.08 13 265 234 0.078 25 27 0.04 13 268 231 0.050 22 18 0.03 13 260 233 0.10 10 28 0.05 16 276 234 0.0320 5 18 0.08 16 452 240 3.4 x 10e-15 0 34 0.06 18 234 236 0.53 2 21 0.05 18 254 237 0.21 24 28 0.03 19 356 240 1.5 x 10e-6 23 8 0.05 20 235 234 0.47 16 22 0.05 25 287 233 0.010 9 33 0.12 24 238 231 0.36 11 23 0.04 23 207 234 0.90 17 47 0.06 32 271 239 0.075 6 31 0.04 27 242 238 0.41 Min 8 0.03 13 207 231 3.4 x 10e-15 Max 47 0.12 32 366 240 0.90 Cat 15, tr 7c Virtual Unit # Theta Gain Sigma Chi Sq DOF Chi Sq Prob 26 24 0.03 15 275 237 0.044 18 21 0.06 18 295 228 0.0019 25 32 0.02 16 311 241 0.0016 22 28 0.04 19 245 241 0.42 10 34 0.02 25 279 232 0.020 5 27 0.06 24 231 237 0.59 0 41 0.02 26 281 242 0.044 2 34 0.05 27 223 236 0.72 24 54 0.02 33 242 231 0.30 23 99 0.14 77 241 234 0.37 16 53 0.04 68 231 237 0.60 9 49 0.06 42 244 236 0.35 11 -14 0.008 0.038 550 239 0.00000 17 60 0.03 45 230 231 0.51 6 42 0.02 39 274 238 0.055 Min 21 0.02 15 223 228 0.0016 Max 99 0.14 77 550 242 0.72 Table A. 2 Optimal Parameter Values and Goodness-of-Fit of Static Nonlinearity The three parameter values (θ, ‘Theta’, A, ‘Gain’, and σn , ‘Sigma’) of the best-fitting static nonlinearity functions, N(L) was as described in equation 2.3, are listed here in the upper table for the fifteen simple cells, and for their virtual counterparts in the lower table. The corresponding χ2 values, ‘Chi Sq’ the numbers of degrees of freedom, ‘DOF’ and the probability of goodness-of-fit ‘Chi Sq Prob’ are also listed. The numbers identifying the units are in the left hand column, which are listed in order of RF signal strength, determined as was described in section 3.2. 76 Figure A. 4 Plots of Smoothed Model-Predicted vs. Observed Response to M-Sequence Non-parametrically smoothed scatter plots of the LN model – predicted, N(L) vs. the observed, R spike firing rates in response to the m-sequence are shown here for same two representative cells; for neuron 18 in part A), and for neuron 26 in part B). Correlation coefficients are included in the insets. Dashed lines are the line of equality, to help in visualizing where data deviates from equality. MSEQ, Actual N26 MSEQ, Actual N18 B) A) 77 Figure A. 5 Plots of Smoothed Virtual Model-Predicted vs. Observed Response to M-Sequence Non-parametrically smoothed scatter plots of the LN model – predicted, N(L) vs. the observed, R spike firing rates in response to the m-sequence are shown here for the virtual LNP cell 18 in part A), and for cell 26 in part B). These are the virtual LNP counterparts of the two representative cells for which previous results were plotted. Correlation coefficients are included in the insets. Dashed lines are the line of equality B) A) MSEQ, Virtual N26 MSEQ, Virtual N18 78 Appendix B All of the RF profiles that were measured were well fit by space-time separable spatio-temporal Gabor functions, RF(x,y,t), as defined previously by DeAngelis et al. (1999): ( ) ( ) )(,,, tHyxGktyxRF ⋅⋅= Equation B. 1 Where k is an overall scaling factor. The spatial profile, G(x,y), described in equation B.2, is modeled as the product of a one-dimensional Gabor function running perpendicular to the cell’s preferred orientation, and a one-dimensional Gaussian function running along the axis of the cell’s preferred orientation. The temporal profile, H(t), in equation B.3, is modeled as a Gabor function of time: ( ) ( ) ( )( ) ( )vGuuGyxG vu ⋅⋅= ϕcos, Equation B. 2 ( ) ( ) ( )( )sst ttGtH ϕcos⋅= Equation B. 3 where u and v represent pixel position in a Cartesian reference frame that is centred at the cell’s RF centre, (x0, y0) and rotated to lie along the cell’s axis of preferred orientation, ψ. ts is skewed time, for reproducing uneven zero-crossings of the temporal profile as are consistently observed in cat area 17 and 18 neurons (DeAngelis et al., 1999). The G•(•) terms represent the Gaussian portions, and the φ•(•) terms represent the phase angles within the sinusoid portions of the Gabor functions, as listed in the following equations: ( )          ⋅−= 2 2 1exp u u w uuG Equation B. 4 79 ( )          ⋅−= 2 2 1exp v v w vvG Equation B. 5 ( )           −⋅−= 2 0 2 1exp t s st w tttG Equation B. 6 The w• parameters determine the widths of the Gaussian envelopes, in units of pixels for the u and v dimensions, and in seconds for the time dimension. ( ) ( ) Ψ⋅−+Ψ⋅−= sincos 00 yyxxu Equation B. 7 ( ) uuu ufu φπϕ +⋅= 2 Equation B. 8 ( ) ( ) Ψ⋅−−Ψ⋅−= cossin 00 yyxxv Equation B. 9 where x and y represent position in image pixel coordinates, starting from the upper left corner of the image. ( )tts ⋅⋅= βπ arctan2 Equation B. 10 80 ( ) ( ) tstst ttft φπϕ +−⋅= 02 Equation B. 11 where β determines the degree of skew in the temporal profile. fu and ft are the neuron’s preferred spatial and temporal frequencies, in cycles per degree and cycles per second, respectively. φ u and φ t are the initial spatial and temporal phase angle shifts, respectively; both are in degrees. t0 is the temporal origin of the temporal profile, in seconds. The expression of the spatial Gabor function that is fit to the ‘X-Y’ RF plots as shown in Figure 3.1 is obtained by setting the temporal component, H(t) equal to 1, in equation B.1 above, to give the first equality in equation B.12, below. The other values are the same as defined above. It was found that all the ‘X-Y’ RF plots could be well fit only once another parameter, a relative phase shift, δфu, was added; this results in the spatial Gabor function described as the sum of two identical spatial Gabor functions with the second having a freely-varying phase shift, δфu, relative to the first, and the modified expression from equation B.2 expressed in the second equality below. ( ) ( ) ( ) ( )( ) ( )( )( ) ( )vGuuuGyxGkyxRF vuu ⋅++⋅=⋅= δφϕϕ coscos,, Equation B. 12 Equation B. 12 has nine free parameters in all: θ, x0 , y0 , wu , wv , fu , фu , δфu , and k, as were defined above for the x and y, or equivalently, the rotated u and v spatial dimensions. The free parameter values were adjusted using the downhill Nelder-Mead simplex algorithm that was included in the python scipy.optimize.fmin package to find the optimal values giving the best fit of the function with the ‘X-Y’ RF plots of the nineteen cells that were shown in Figure 3.1. The resulting values are listed in Table B.1, below. 81 Unit # θ x0 y0 wu wv fu Φu δΦu k 26 76 19 13 3.4 3.1 0.22 -84 2 22.1 18 77 18 15 3.8 3.1 0.22 -95 2 18.7 25 86 18 16 3.3 3.0 0.29 119 1 18.9 22 -14 20 17 2.7 4.0 0.25 91 -7 13.6 10 7 20 21 3.8 4.4 0.21 176 -4 13.8 5 99 19 15 3.7 2.4 0.21 -50 1 9.9 0 79 19 16 3.1 3.4 0.29 -42 -3 15.6 2 97 20 17 2.8 4.5 0.22 241 7 7.7 24 2 20 18 4.9 4.7 0.20 243 -7 8.8 23 113 18 18 2.0 1.3 0.03 87 2 7.8 16 47 19 15 3.0 3.6 0.28 -55 -2 7.1 9 -21 18 19 3.5 4.1 0.22 224 33 6.3 11 138 20 16 4.1 2.3 0.23 -61 1 6.7 17 59 18 14 3.6 3.3 0.27 103 1 6.9 6 102 19 13 4.6 3.6 0.18 -95 1 5.6 8 -29 20 16 3.5 5.5 0.26 200 -1 10.6 27 -42 19 18 3.7 3.4 0.25 175 -0.3 7.53 4 170 20 19 2.9 3.8 0.30 -10 -1 8.76 19 14 19 18 4.2 4.0 0.21 200 -3 6.90 Min 18 13 2.0 1.3 0.03 5.55 Max 20 21 4.9 4.7 0.29 22.1 Table B. 1 Parameter Values of Best-Fit Spatial Gabor Functions The optimal parameter values of expression B.12 giving the best fit to the ‘X-Y’ RF plots of the nineteen cells that were shown in Figure 3.1 are listed above. The left-hand column lists the unit number, and the cells are listed in order of their signal strength, just as in the previous tables. The resulting values of all nine parameter values defined above are listed for each of the cells. The first fifteen cells were included in subsequent analysis for testing the accuracy of the simple cell functional model for predicting responses to natural scenes. The next four rows represent those cells that were excluded from subsequent analysis on the basis of insufficient RF signal in distinct excitatory and inhibitory regions, and correspond to the cells that were marked with asterisks in Figure 3.1. The extremes of the ranges of values that were obtained are listed at the bottom of the columns, where relevant. The average values of the centre coordinates, x0 and y0, and the standard deviations, wu and wv, of the Gaussian envelopes are listed here in pixels at the natural scene resolution. These four values were used to determine the dimensions of the pink ellipse drawn in Figure 4.4. The half-lengths of the major and minor axes of the ellipse in Figure 4.4 were taken to be equal to twice the standard deviations of the Gaussian envelopes along the u and v directions, wu and wv respectively, converted into degrees. The preferred orientation, θ and absolute and relative phase angles, фu and δфu , respectively are all listed in degrees. The preferred spatial frequency, fu is listed in cycles per degree.