With modern statistical methods, we have fast and computationally tractable schemes to fit models of neural encoding and decoding to experimental data. A key insight is that, for a suitably chosen model class, the log-likelihood of the data under the model is a concave function of the model parameters, i.e., there are no non-global local maxima. Because of this, numerical methods of gradient ascent are bound to lead to the global maximum.
Generalized Linear Models (GLMs) are the representative of this model class. Importantly, a large ensemble of generalized integrate-and-fire models, in particular the Spike Response Model (SRM) with escape noise, belongs to the family of GLMs. As we have seen in previous chapters, the SRM can account for a large body of electrophysiological data and firing patterns such as adaptation, burst firing, time-dependent firing threshold, hyperpolarizing spike-afterpotential, etc. The link from SRM to GLM implies that there are systematic and computationally fast methods to fit biologically plausible neuron models to data.
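The practical consequence of concavity can be illustrated with a minimal sketch. It assumes a Poisson GLM with exponential nonlinearity fitted to synthetic spike counts; the data, the parameter values, and the helper names (`log_likelihood`, `gradient`, `fit`) are illustrative assumptions, not the SRM fitting procedure of the previous chapters. Gradient ascent is started from two different initial conditions and should end at the same parameter estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (all values are illustrative assumptions).
T = 5000                                            # number of time bins
X = np.column_stack([np.ones(T), rng.normal(size=(T, 3))])  # bias + 3 stimulus features
k_true = np.array([-1.0, 0.5, -0.3, 0.2])           # hypothetical "true" parameters
n = rng.poisson(np.exp(X @ k_true))                 # spike counts per bin

def log_likelihood(k):
    """Mean Poisson log-likelihood per bin, up to a constant independent of k."""
    u = X @ k
    with np.errstate(over="ignore"):
        return np.mean(n * u - np.exp(u))

def gradient(k):
    """Gradient of the mean log-likelihood with respect to k."""
    return X.T @ (n - np.exp(X @ k)) / T

def fit(k0, n_iter=500):
    """Gradient ascent with backtracking: every accepted step increases the likelihood."""
    k = np.asarray(k0, dtype=float)
    ll = log_likelihood(k)
    for _ in range(n_iter):
        g = gradient(k)
        step = 1.0
        # Halve the step until a sufficient-increase (Armijo) condition holds.
        while step > 1e-12:
            k_new = k + step * g
            ll_new = log_likelihood(k_new)
            if ll_new >= ll + 1e-4 * step * (g @ g):
                k, ll = k_new, ll_new
                break
            step *= 0.5
    return k

k_a = fit(np.zeros(4))                  # start at the origin
k_b = fit([-2.0, 1.0, 1.0, -1.0])       # start far from the origin
print("estimate from start A:", np.round(k_a, 3))
print("estimate from start B:", np.round(k_b, 3))
```

Because the log-likelihood is concave in `k`, both runs climb to the same maximum; for a non-concave model the two runs could end in different local maxima.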
Interestingly, once neuron models are phrased in the language of statistics, the problems of coding and stimulus design can be formulated in a single unified framework. In the following chapter we will see that the problem of decoding can also be analyzed in the same statistical framework.
An early application of maximum likelihood approaches to neuronal data can be found in Brillinger (69). The application of the framework of Generalized Linear Models to the field of neuroscience has been made popular by Truccolo et al. (521); Pillow et al. (399). A review of Generalized Linear Models can be found in Dobson and Barnett (129).
The influential book of Rieke (437) gives a broad introduction to the field of neural coding. The time-rescaling theorem was exploited in (73) to develop useful goodness-of-fit methods for spike trains. Spike train metrics were introduced in Victor and Purpura (535, 534), but comparisons of spike trains in terms of PSTHs and other features had been commonly used before (392; 393; 174; 321; 138; 169). Many other spike train distances have also been proposed (260; 531; 410; 234; 458; 357), which can be cast in the general framework of a vector space as outlined in Schrauwen and Campenhout (457), Paiva et al. (376) and Naud et al. (357); see also Paiva et al. (377, 378); Park et al. (388). Non-linear functions of the spike trains can also be used to relate to different features of the spiking process such as the interval distribution or the presence of definite firing patterns (535; 410; 513; 279; 280; 133).
Concave function and non-global optima.
(i) Suppose a function $G(x)$ has a global maximum at location $x_0$. Suppose that $f(G)$ is a strictly increasing function of $G$ (i.e., $df/dG > 0$).
Show that $f(G(x))$ has a maximum at $x_0$. Is it possible that $f(G(x))$ has further maxima as a function of $x$?
(ii) A strictly concave function $G$ can be defined as a curve with negative curvature, $d^2G/dx^2 < 0$, for all $x$. Show that such a function can have at most one maximum.
(iii) Give an example of a concave function which does not have a maximum. Give an example of a function $G$ which has a global maximum, but is not concave. Give an example of a function $G$ which is concave and has a global maximum.
Sum of concave functions.
Consider a quadratic function $f_k(x) = 1 - (x - \vartheta_k)^2$.
(i) Show that $f_k$ is a concave function of $x$ for any choice of parameter $\vartheta_k$.
(ii) Show that $f_1(x) + f_2(x)$ is a concave function.
(iii) Show that $\sum_k b_k\, f_k(x)$ with $b_k \geq 0$ is a concave function.
(iv) Repeat the steps (ii) and (iii) for a family of functions $f_k$ which are concave, but not necessarily quadratic.
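Comparing PSTHs and spike train similarity measures. Experimentally the PSTH is constructed from a set of $N$ spike trains, $S_i(t)$, measured from repeated presentations of the same stimulus. The ensemble average of the recorded spike trains,
$$\frac{1}{N}\sum_{i=1}^{N} S_i(t),$$
is typically convolved with a Gaussian function $h_G(t) = (2\pi\sigma^2)^{-1/2}\exp\left(-t^2/2\sigma^2\right)$ with $\sigma$ around 5 ms, such that
$$A(t) = \left(h_G * \frac{1}{N}\sum_{i=1}^{N} S_i\right)(t)$$
is a smoothed PSTH. Suppose that two sets of experimental spike trains were recorded in two different conditions, resulting in two smoothed PSTHs $A_1(t)$ and $A_2(t)$.
a) Show that the sum of the squared error, $\int \left(A_1(t) - A_2(t)\right)^2 dt$, can be written as a distance $D^2$ between sets of spike trains with the kernel $K(t,t') = h_G(t)\,h_G(t')$.
b) Recall that the correlation coefficient between datasets $x$ and $y$ is
$$c = \frac{\mathrm{cov}(x,y)}{\sqrt{\mathrm{cov}(x,x)\,\mathrm{cov}(y,y)}}.$$
Show that the correlation coefficient between the two smoothed PSTHs can be written as an angular separation between the sets of spike trains with kernel $K(t,t') = h_G(t)\,h_G(t')$.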
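The quantities compared in this exercise can be computed numerically. The following is a minimal sketch, assuming spike trains binned at 1 ms, a Gaussian of 5 ms standard deviation, and synthetic spike trains drawn from hypothetical sinusoidal rates; the function names (`simulate`, `smoothed_psth`, `squared_error`, `correlation`) are illustrative. It does not replace the analytical derivation asked for above.

```python
import numpy as np

rng = np.random.default_rng(1)
dt = 0.001                     # bin width in seconds (assumed)
n_bins, n_rep = 1000, 20       # 1 s of data, 20 repetitions per condition (assumed)
t = np.arange(n_bins) * dt

def simulate(rate_hz):
    """Binary spike trains (n_rep x n_bins): one Bernoulli draw per bin."""
    return (rng.random((n_rep, n_bins)) < rate_hz * dt).astype(float)

def smoothed_psth(spikes, sigma=0.005):
    """Trial-averaged spike density (Hz) convolved with a normalized Gaussian."""
    psth = spikes.mean(axis=0) / dt
    half = int(round(4 * sigma / dt))            # kernel support: +/- 4 sigma
    tk = np.arange(-half, half + 1) * dt
    kernel = np.exp(-tk**2 / (2 * sigma**2))
    kernel /= kernel.sum()                       # unit area in discrete time
    return np.convolve(psth, kernel, mode="same")

def squared_error(a, b):
    """Time-integrated squared difference between two smoothed PSTHs."""
    return np.sum((a - b) ** 2) * dt

def correlation(a, b):
    """Correlation coefficient cov(a,b) / sqrt(cov(a,a) cov(b,b))."""
    da, db = a - a.mean(), b - b.mean()
    return (da @ db) / np.sqrt((da @ da) * (db @ db))

# Hypothetical conditions: same rate profile twice, and one in antiphase.
rate_a = 30 + 25 * np.sin(2 * np.pi * 2 * t)
rate_b = 30 - 25 * np.sin(2 * np.pi * 2 * t)
A1 = smoothed_psth(simulate(rate_a))
A2 = smoothed_psth(simulate(rate_a))
A3 = smoothed_psth(simulate(rate_b))

print("same condition:      SSE =", squared_error(A1, A2), " c =", correlation(A1, A2))
print("different condition: SSE =", squared_error(A1, A3), " c =", correlation(A1, A3))
```

Two sets recorded under the same condition should give a smaller squared error and a larger correlation coefficient than two sets recorded under different conditions; the residual difference in the first case reflects trial-to-trial variability at a finite number of repetitions.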
Victor and Purpura metric.
Consider the minimum cost $C$ required to transform a spike train $S_i$ into another spike train $S_j$ if the only transformations available are
- Removing a spike has a cost of one.
- Adding a spike has a cost of one.
- Shifting a spike by a distance $d$ has a cost $q\,d$, where $q$ is a parameter defining temporal precision.
The cost $C$ defines a metric that measures the dissimilarity between spike train $S_i$ and spike train $S_j$. The smaller $C$, the more alike the spike trains are in terms of spike timing.
a) For $q = 0$ units of cost per second, show that $C$ becomes the difference in number of spikes in spike trains $S_i$ and $S_j$.
b) For $q$ greater than four times the maximum firing frequency (i.e., the inverse of the shortest observed interspike interval), show that $C$ can be written as a distance $D^2$ with kernel $K(t,t') = h_t(t)\,\delta(t')$ and triangular function $h_t(t) = (1 - |t|\,q/2)\,\Theta(1 - |t|\,q/2)$, where $\delta$ is the Dirac delta function and $\Theta$ is the Heaviside function.
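The minimum cost over the three elementary transformations can be computed with a standard dynamic-programming recursion, in the same way as an edit distance between strings. The sketch below is one possible implementation, assuming spike trains are given as lists of spike times in seconds; the function name `victor_purpura` and the example spike times are illustrative.

```python
def victor_purpura(a, b, q):
    """Minimum cost to transform spike train a into spike train b.

    a, b : sequences of spike times (seconds)
    q    : cost per unit time of shifting a spike (1/seconds)
    """
    a, b = sorted(a), sorted(b)
    # G[i][j] = cost of transforming the first i spikes of a
    #           into the first j spikes of b.
    G = [[0.0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        G[i][0] = float(i)            # remove i spikes
    for j in range(len(b) + 1):
        G[0][j] = float(j)            # add j spikes
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            G[i][j] = min(
                G[i - 1][j] + 1.0,                                # remove spike i of a
                G[i][j - 1] + 1.0,                                # add spike j of b
                G[i - 1][j - 1] + q * abs(a[i - 1] - b[j - 1]),   # shift spike
            )
    return G[-1][-1]

# Hypothetical example spike trains.
train_i = [0.10, 0.20, 0.50]
train_j = [0.12, 0.55]

for q in (0.0, 10.0, 1e6):
    print("q =", q, " cost =", victor_purpura(train_i, train_j, q))
```

With `q = 0` shifts are free and only the difference in spike count is penalized; for very large `q` shifting is never cheaper than removing and re-adding a spike, so spike trains without coincident spikes are separated by the total number of spikes.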
© Cambridge University Press. This book is in copyright. No reproduction of any part of it may take place without the written permission of Cambridge University Press.