In the Izhikevich numerical model, the inputs to a neuron are represented by the term I in the dynamical equation for v. In the example code provided, "purely random inputs" to a neuron (called "thalamic input", given the analogy of cortical neuron modeling) are drawn from a Gaussian random variable at each time step. If the amplitude of the Gaussian random variable is held constant, the number of spikes induced by the random input is highly dependent on the time step of integration.
(I observed this phenomenon as I was trying to use decreasing time steps in integration, to show that the simulated traces would converge to a stationary value.)
I think a reasonable way to handle the unwanted variation in the strength of the random-input effect is to pre-synthesize the random input traces at the outset, and interpolate the result at whatever integration time step is specified.
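A minimal numpy sketch of this idea (the grid spacing and amplitude below are placeholder values, and `thalamic_input` is my own name):

```python
import numpy as np

rng = np.random.default_rng(0)

T = 100.0          # total simulated time (ms)
dt_base = 0.5      # resolution at which the noise trace is pre-synthesized (ms)
amp = 5.0          # amplitude of the Gaussian "thalamic" input

# Pre-synthesize the random input once, on a fixed time grid.
t_base = np.arange(0.0, T + dt_base, dt_base)
I_base = amp * rng.standard_normal(t_base.shape)

def thalamic_input(t_query):
    """Linearly interpolate the frozen noise trace at arbitrary times."""
    return np.interp(t_query, t_base, I_base)

# Any integration step size now sees the *same* underlying input signal.
t_fine = np.arange(0.0, T, 1.0 / 32.0)
I_fine = thalamic_input(t_fine)
```

Because the underlying noise trace is frozen, integrators with different step sizes see the same input signal, so any residual differences between runs come from the integration scheme rather than from resampled noise.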
Thursday, October 31, 2013
Monday, October 28, 2013
Monday, October 21, 2013
Friday, October 18, 2013
Basic question #5: Can we improve the numerical stability of the Izhikevich model?
Indeed, by decreasing the integration step size from 0.5 ms (the Izhikevich default) down to 1/32 ms, the peak of the action potential stabilizes at ~30 mV, which is the physiologically realistic value:
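In sketch form, the convergence check looks like this. I substitute a constant drive I = 10 for the random input so the comparison is deterministic; the integration is plain forward Euler with the regular-spiking parameters from the paper:

```python
import numpy as np

def simulate(dt, T=200.0, I=10.0):
    """Forward-Euler trace of one regular-spiking Izhikevich neuron."""
    a, b, c, d = 0.02, 0.2, -65.0, 8.0   # regular-spiking parameters
    v, u = -65.0, b * (-65.0)
    trace = []
    for _ in range(int(round(T / dt))):
        if v >= 30.0:                     # spike detected on previous step: reset
            v, u = c, u + d
        v += dt * (0.04 * v**2 + 5.0 * v + 140.0 - u + I)
        u += dt * a * (b * v - u)
        trace.append(v)
    return np.array(trace)

peak_coarse = simulate(0.5).max()        # large overshoot above 30 mV
peak_fine = simulate(1.0 / 32.0).max()   # peak much closer to 30 mV
```

With dt = 0.5 ms the recorded peak overshoots the nominal ~30 mV value by a wide margin, because the last Euler step before the reset is huge; at dt = 1/32 ms the overshoot shrinks dramatically.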
Only then (with well-behaved action potential amplitudes) is it reasonable to compute the "analog" cross-correlation between two traces:
On the other hand, it occurred to me that I need to think carefully about the scaling of the random input ("thalamic input") as the step size is changed. Presumably, there is a nice theoretic justification for the appropriate scaling. We should also verify that, with correct scaling, the firing rate does not depend (too strongly) on the integration step size.
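My working hypothesis for the appropriate scaling (to be checked against the literature): if a fresh Gaussian value of amplitude A is added to I at every step, forward Euler contributes an increment dt*A*xi, whose variance per unit time is dt*A^2. Keeping the dynamics invariant therefore requires A proportional to 1/sqrt(dt), as in the Euler-Maruyama treatment of white noise. A quick numerical check of that bookkeeping:

```python
import numpy as np

rng = np.random.default_rng(1)

def per_ms_input_variance(dt, amp, n_windows=10_000):
    """Variance, per 1-ms window, of the Euler increments dt*I contributed by a
    fresh Gaussian input of amplitude `amp` drawn at every integration step."""
    steps_per_ms = round(1.0 / dt)
    increments = dt * amp * rng.standard_normal((n_windows, steps_per_ms))
    return increments.sum(axis=1).var()

dt0, amp0 = 0.5, 5.0                          # reference step and amplitude
v_ref = per_ms_input_variance(dt0, amp0)
# With the amplitude rescaled as amp0 * sqrt(dt0/dt), the injected noise power
# per millisecond is independent of the step size:
dt = 1.0 / 32.0
v_scaled = per_ms_input_variance(dt, amp0 * np.sqrt(dt0 / dt))
# Holding the amplitude fixed at amp0 instead would shrink the per-ms variance
# by a factor of dt/dt0 (here 1/16).
```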
Thursday, October 17, 2013
Basic question #4: Robustness of the FDR network inference to larger (but unconnected) graphs
The basic motivation for using the FDR method for inference is to automatically distinguish the significant edges in a given network, in a conservative way.
As a test of the FDR procedure, I will revisit the case of N=9, M=9. (It has since occurred to me that I was not careful about checking for repeated edges in the underlying directed-edge construction. Furthermore, it is possible to get n_i \to n_j and n_j \to n_i, in which case the "true" number of inferable undirected edges will be less than the nominal M.)
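For reference, here is the kind of check the edge construction should have had (a sketch; `sample_directed_edges` is my own helper, sampling without replacement over all ordered pairs):

```python
import random

def sample_directed_edges(n_nodes, m_edges, seed=0):
    """Sample m distinct directed edges: no self-loops, no repeated edges."""
    rng = random.Random(seed)
    all_pairs = [(i, j) for i in range(n_nodes) for j in range(n_nodes) if i != j]
    return rng.sample(all_pairs, m_edges)

edges = sample_directed_edges(9, 9)
# Reciprocal pairs n_i -> n_j and n_j -> n_i collapse into a single undirected
# edge, so the "true" undirected edge count can fall below the nominal M:
undirected = {frozenset(e) for e in edges}
```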
Wednesday, October 16, 2013
Basic question #3: Simulation of neural signals
Up to this point, my experiments have involved randomly generated noise. However, the intent of this work is to look for network connectivity in recordings of neuron populations. More specifically, I wish to develop a "null model" based on neuronal simulations.
Based on a brief literature review, it appears that Izhikevich's numerical code for neuron simulation has nonzero traction. (For instance, Kramer's simulated neural data is based on the Izhikevich model). More information about the Izhikevich model can be obtained here. Izhikevich seems to be an interesting guy, and I'm eager to find out more about his work.
On the other hand, in the longer run, it's probably good to test the network inference framework on multiple simulation models, so that we are not misled by idiosyncrasies of a particular numerical model.
A first stab at generating a few Izhikevich neurons:
Now, to get at the model in a bit more detail, I would like to see the traces in a more controlled setting: evolution of two neurons where neuron 1 has an excitatory synapse on neuron 2. Here's a "first pass" simulation of two neurons with a specific nonzero coupling between them. I also plot the sample cross-correlation between the two temporal traces:
Here are some thoughts on Izhikevich's numerical model:
- The paper that describes the model is pretty sparse on "biological intuition."
- It seems to me that the >100 mV spike amplitudes I see in my "local implementation" (which is basically copied and pasted from the Izhikevich paper) are a numerical glitch. Two observations: (1) if I step the numerical model at time steps other than 0.5 ms (specifically, at finer steps), the numerical trajectories become divergent; (2) the bottom panel of Fig. 3 in the paper shows a spike train of fairly regular amplitude, which appears to be under 100 mV judging from the noise characteristics.
I think getting a handle on the numerical simulation will be important. I suspect that the pulse amplitudes -- which suffer from numerical noise (as I claim above) -- are throwing off the cross-correlation metric.
Another possibility is to use the "digital" firing traces (i.e. did a cell fire at a particular time or not) instead of the "analog" (voltage) time traces.
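To make the "digital" option concrete, here is a toy sketch (the voltage traces are placeholder noise with artificially planted spikes; only the mechanics matter):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
v1 = rng.normal(-65.0, 5.0, n)   # stand-ins for two voltage traces
v2 = rng.normal(-65.0, 5.0, n)
v1[::100] = 35.0                 # plant some artificial spikes
v2[5::100] = 35.0                # the same train, shifted by 5 samples

# "Digital" traces: 1 wherever the spike criterion (v >= 30) was met, else 0.
s1 = (v1 >= 30.0).astype(float)
s2 = (v2 >= 30.0).astype(float)

# Cross-correlate the binary trains; spike amplitudes can no longer influence
# the metric -- only spike timing does.
xc = np.correlate(s1 - s1.mean(), s2 - s2.mean(), mode="full")
lags = np.arange(-(n - 1), n)
best_lag = lags[np.argmax(xc)]
```

The planted 5-sample shift shows up as the location of the cross-correlation peak, and the amplitude of each "spike" is irrelevant by construction.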
Next, my observations on how cell-to-cell connectivity is implemented.
Numerical implementation of synapses in the Izhikevich model:
In the Izhikevich numerical model, each neuron has two state variables (v, u) that evolve according to the following differential equations:
- v' = 0.04v^2 + 5v + 140 - u + I
- u' = a\cdot(bv - u)
where v represents the membrane potential of that neuron. In the DE for v', the term I is a forcing term that represents the inputs to that neuron. In particular, in Izhikevich's example code the forcing term is the sum of two contributions: a random term (denoted "thalamic input", so presumably we're modeling cortical neurons), and a term that derives from the connection matrix S_{ij}.
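As I read the example code, the per-step assembly of the forcing term looks roughly like this. This is a simplified all-excitatory sketch with my own parameter choices (Izhikevich's published script splits the population into excitatory and inhibitory groups with different noise amplitudes):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 10                                  # small all-excitatory population (my choice)
S = 0.5 * rng.random((N, N))            # S[i, j]: strength of the synapse j -> i
v = np.full(N, -65.0)
u = 0.2 * v                             # u = b*v at rest, with b = 0.2
n_spikes = 0
for _ in range(1000):                   # 1-ms steps, as in the published script
    fired = v >= 30.0                   # neurons that spiked on the previous step
    n_spikes += int(fired.sum())
    v[fired] = -65.0                    # reset: v <- c
    u[fired] = u[fired] + 8.0           # reset: u <- u + d
    # Forcing term I: random "thalamic" drive plus summed synaptic input
    I = 5.0 * rng.standard_normal(N) + S[:, fired].sum(axis=1)
    v += 0.04 * v**2 + 5.0 * v + 140.0 - u + I
    u += 0.02 * (0.2 * v - u)           # a = 0.02, b = 0.2 (regular spiking)
```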
How to scale the connection strength S_{ij}?
In the numerical model, we can set any arbitrary value for S_{ij}. Clearly, our ability to distinguish the cross-correlation between two neurons will be aided by choosing a large value of the connection strength. However, it is important to use values of S_{ij} that are biologically relevant.
Unfortunately, the fact that the differential equation is given in "dimensionless units" (even though v is supposed to represent voltage on the mV scale and time on the ms scale), together with the paper's lack of biological detail, makes it hard to determine a reasonable range of S_{ij} values.
Instead, I propose to perform the following experiment. I will hook up a two-neuron system where neuron 2 (N2) drives neuron 1 (N1) through a coupling term S_{1,2}=\alpha. I will remove the "thalamic" input from N1, so that N1 is driven exclusively by N2 (otherwise, it is difficult to isolate the effect of N2). I will characterize \alpha by the amplitude of the excitatory post-synaptic potential (EPSP).
Based on this simulation, I find that the (positive) magnitude of \alpha roughly equals the amplitude of the EPSP. According to Wikipedia, the amplitude of PSPs can be "as low as 0.4 mV to as high as 20 mV". With S_{1,2}=20, action potentials in N2 are able to drive action potentials in N1. So, it appears that the range S_{ij} \leq 20 should be biologically feasible.
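A compact version of the proposed experiment (the 1-ms pulse convention for delivering S_{1,2}, and the trial value alpha = 5, are my own choices):

```python
import numpy as np

def izh_step(v, u, I, dt=0.5):
    """One forward-Euler step of a regular-spiking Izhikevich neuron."""
    v_new = v + dt * (0.04 * v**2 + 5.0 * v + 140.0 - u + I)
    u_new = u + dt * 0.02 * (0.2 * v - u)
    return v_new, u_new

alpha = 5.0                        # trial coupling S_{1,2} (hypothetical value)
dt, n_steps = 0.5, 1000
v1, u1 = -70.0, -14.0              # N1 at its resting point, thalamic input removed
v2, u2 = -65.0, -13.0
pulse_steps = 0
v1_trace = []
for _ in range(n_steps):
    if v2 >= 30.0:                 # N2 spiked: reset it, deliver a 1-ms pulse to N1
        v2, u2 = -65.0, u2 + 8.0
        pulse_steps = round(1.0 / dt)
    I1 = alpha if pulse_steps > 0 else 0.0
    pulse_steps = max(0, pulse_steps - 1)
    v1, u1 = izh_step(v1, u1, I1)
    v2, u2 = izh_step(v2, u2, 10.0)   # constant drive keeps N2 spiking tonically
    v1_trace.append(v1)
epsp = max(v1_trace) + 70.0        # EPSP amplitude relative to N1's rest (-70 mV)
```

Under these conventions the measured EPSP comes out on the order of alpha, consistent with the reading above, while N1 itself never fires.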
Tuesday, October 15, 2013
Basic question #2: Simple instance of FDR inference
The intent in this post is to basically replicate (and understand) the simple "pink noise data" simulation experiment described in Kramer2009. Some remarks:
- Kramer's independent noise signals are colored (with 1/f^{0.33} spectrum), whereas I will do my initial experiment with white noise.
- Kramer's experiment uses a coupling strength of \alpha=0.4. That is, if node 1 connects to node 2, the signals are x_1[t] = w_1[t] and x_2[t] = w_2[t] + 0.4 w_1[t] where w_i are independent noise signals. So, the optimal lag in the cross-correlation is expected to be zero.
I followed the FDR procedure as described in Kramer2009. As with the "pink noise data" experiment, I used a network of N=9 nodes and M=9 edges. For each inference, some parameters of interest are:
- Number of false inferred edges (this is the quantity that the FDR method intends to control -- in expectation);
- Number of inferred edges (relative to the true edge count M).
Here is a single inference run, showing the ordered p-values, using the "extremum" method (e.g. Fig. 2(c)) with "FDR level" of q=0.1:
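For concreteness, here is the selection step as I implemented it, in the standard Benjamini-Hochberg form (as far as I can tell, Kramer's "extremum" variant differs in how the p-values are computed, not in this selection rule; `fdr_select` is my own name):

```python
import numpy as np

def fdr_select(p_values, q=0.1):
    """Benjamini-Hochberg selection: declare the k smallest p-values significant,
    where k is the largest index with p_(k) <= q * k / m."""
    p = np.sort(np.asarray(p_values, dtype=float))
    m = p.size
    ok = np.nonzero(p <= q * np.arange(1, m + 1) / m)[0]
    if ok.size == 0:
        return 0.0, 0              # no edges declared
    k = ok[-1] + 1
    return p[k - 1], k             # (p-value threshold, number of declared edges)

# Toy run: three strong candidate edges, three weak ones.
threshold, n_edges = fdr_select([0.001, 0.002, 0.004, 0.2, 0.5, 0.9], q=0.1)
```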
Next, I ran N_{inf}=10^4 instances of the inference problem, to see the general distribution of inference results.
Here is the distribution of num_inferred_edges:
Recall that the true number of edges is M=9. So, it appears that the current FDR procedure tends to under-report the expected number of edges (the mode appears to be 7). This is consistent with the FDR process being somewhat "conservative" in its estimate of edges.
It is important to keep in mind that there will be variations in the number of inferred edges. The time series data does not unambiguously reveal the presence or absence of a coupling.
Here is the distribution of num_false_edges = num_inferred_edges - num_correct:
Based on appearance, I attempted to model the distribution as a geometric distribution -- which is apparently not quite right.
Actually, the more "appropriate" horizontal scale for the above distribution is the proportion of false positives, since the FDR level q is "an upper bound on the expected proportion of false positives among all declared edges in [the] inferred network" [Kramer2009]. Here is that distribution:
I find a mean false edge proportion of 0.07 in my N_{inf}=10^4 dataset, which is below the q=0.1 bound as advertised.
One more thing. We might look at the variation in false detection rate conditioned on the number of inferred edges.
Initial agenda: the warm-up
Here are some warm-up tasks:
- Run through a single complete instance of Kramer's FDR (false discovery rate) mechanism for a small random network with correlated noise. (Basically, replicate Section IV.A of Kramer2009.)
- Consider the null distribution of C_{ij} for colored noise, and as a function of the signal length n.
- Also, examine the distribution of s_{ij} = \mathrm{max}_\tau|C_{ij}(\tau)| whose CDF P[z] is given in Eq. 4 of Kramer2009.
- Run through many instances of the inference on many small random networks, to verify if the expected number of spurious edges is consistent on multiple instances of the inference task (where ground truth is known).
- I may want to explore the parameter space a bit, considering the accuracy of the inference as a function of the coupling strength \alpha, as well as the general validity of FDR.
- Acquaint myself with Izhikevich's synthetic neuron simulation routine, for the generation of simulated neural data. Note that Izhikevich's script is referred to in a few computational papers on network inference.
- Acquaint myself with the "frequency domain bootstrap" method in Kramer2009, which will be relevant for experimental cases where one cannot simply analytically estimate or simulate a distribution.
I am also interested in coupling measures that are not the sample cross-correlation (many are listed in Kramer2009). In particular, I am interested in asymmetric measures that can yield directed dependencies. On this thought, a concept that is repeatedly mentioned in the literature is "Granger causality."
Basic question #1: Distribution of sample cross-correlations
To get started, I will examine some basic claims found in Kramer et al.'s "Network inference with confidence from multivariate time series." The paper proposes a computational methodology which, according to the authors, allows one to reconstruct the connectivity network with confidence (i.e., an estimate of the correctness of the inferred graph, as quantified by the "false discovery rate"). This approach is clearly an improvement over ad hoc threshold-based methods, and greatly appeals to me.
Kramer defines the sample cross correlation (Eq. 1) as:
C_{ij}[l] = \frac{1}{\hat{\sigma}_i\hat{\sigma}_j(n-|l|)} \sum_{t=1}^{n-|l|} (x_i[t]-\bar{x}_i)(x_j[t+l]-\bar{x}_j)
Actually, Kramer's normalization of (n-2l) doesn't make sense to me for two reasons: (1) there's a spurious factor of 2 which does not capture the number of overlap terms in the case of finite signals, and (2) l should have the absolute value applied. I find that my corrected formula is consistent with Eq. 2 in Kramer2009, whereas the original definition is not.
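The corrected definition is easy to implement and sanity-check (a numpy sketch; `sample_xcorr` is my name). A useful consistency check is that with the (n-|l|) normalization, the autocorrelation at zero lag comes out exactly 1:

```python
import numpy as np

def sample_xcorr(x, y, lag):
    """C_{ij}[l] with the corrected 1/(sigma_i sigma_j (n - |l|)) normalization."""
    n = len(x)
    l = abs(lag)
    xd, yd = x - x.mean(), y - y.mean()
    if lag >= 0:
        s = np.sum(xd[: n - l] * yd[l:])   # overlap terms x_i[t] x_j[t+l]
    else:
        s = np.sum(xd[l:] * yd[: n - l])
    return s / (x.std() * y.std() * (n - l))

rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = rng.standard_normal(500)
c_auto = sample_xcorr(x, x, 0)   # autocorrelation at zero lag: exactly 1
```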
Kramer also cites the following (Bartlett's) estimate for the variance of C_{ij}[l] when signals x_i and x_j are uncoupled:
var(C_{ij}[l]) = \frac{1}{n-|l|}\sum_{\tau=-n}^n C_{ii}[\tau]C_{jj}[\tau].
(Again, I added the absolute value.)
The corresponding mean is clearly E\left[C_{ij}[l]\right]=0.
For this inaugural post, I would like to verify the Bartlett estimate. I begin with two uncorrelated white Gaussian noise signals, x_1[t] and x_2[t].
Here is the simulated distribution of C_{ij}[l] at a few values of l, and a normal distribution whose variance is given by the Bartlett estimate. Not bad at all!:
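A sketch of such a simulation. For white noise, C_ii[0] = C_jj[0] = 1 dominates the Bartlett sum, so the predicted variance reduces to approximately 1/(n-|l|):

```python
import numpy as np

rng = np.random.default_rng(42)
n, trials, lag = 1000, 2000, 5

def xcorr_at(x, y, l):
    """Sample cross-correlation at (non-negative) lag l, (n-|l|) normalization."""
    xd, yd = x - x.mean(), y - y.mean()
    return np.sum(xd[: len(x) - l] * yd[l:]) / (x.std() * y.std() * (len(x) - l))

# Repeatedly draw independent white-noise pairs and record C_ij[lag].
vals = [
    xcorr_at(rng.standard_normal(n), rng.standard_normal(n), lag)
    for _ in range(trials)
]
empirical_var = np.var(vals)
bartlett_var = 1.0 / (n - lag)   # white-noise simplification of the Bartlett sum
```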
Next, Kramer asks us to consider the Fisher transformation of the C_{ij}, as follows:
C_{ij}^F[\tau] = \frac{1}{2}\log\frac{1+C_{ij}[\tau]}{1-C_{ij}[\tau]}.
Oh, this bit is trivial. The Fisher transform maps (-1, 1) \to (-\infty, \infty), so C_{ij}^F is better described by the normal distribution than C_{ij}. I checked the correspondence in the above experiment after the C_{ij} values underwent a Fisher transform. The agreement with the Bartlett-estimated distribution is still good (the transform does little to change the small C_{ij} values above).
Next, let me consider the distribution of C_{ij} in the case of an actual correlation between x_1 and x_2. I will follow Kramer's example and define the two signals as follows: x_1 = w_1 and x_2 = w_2 + \alpha w_1 (\alpha = 0.4), where w_i are independent WGN instances. As I understand it, Kramer's framework does not require estimating the distribution of C_{ij} under the alternative hypothesis ("H1: Coupling"), but I want to see the distribution since I already have the scripts written:
Note that the correlated C_{ij} at \alpha=0.4 would all be extremely improbable (very low p-value) under the null distribution. So, correlation at \alpha=0.4 is highly detectable, whereas \alpha \approx 0.1 would be harder to distinguish, judging from the null-hypothesis distribution.
Interestingly, it does not appear to be the case that the distribution is simply the normal distribution with a mean at the coupling value \alpha and the Bartlett estimate of the variance. The distribution of C_{ij} (left panel in above figure) is definitely not well described by this candidate distribution; the distribution of C_{ij}^F fares better (right panel) but there is still a systematic offset.
[2013 10 16]: Actually, I wonder if the mean of the Bartlett estimate normal needs to be inverse Fisher-transformed...
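Another candidate explanation for the offset (my own calculation, worth double-checking): with x_2 = w_2 + \alpha w_1, the population cross-correlation at zero lag is \alpha/\sqrt{1+\alpha^2} \approx 0.371 rather than \alpha = 0.4, because the normalization carries \sigma_2 = \sqrt{1+\alpha^2}. A quick check:

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, n, trials = 0.4, 1000, 500
c0 = []
for _ in range(trials):
    w1, w2 = rng.standard_normal(n), rng.standard_normal(n)
    x1, x2 = w1, w2 + alpha * w1          # Kramer-style coupled pair
    x1d, x2d = x1 - x1.mean(), x2 - x2.mean()
    c0.append(np.sum(x1d * x2d) / (x1.std() * x2.std() * n))
mean_c0 = np.mean(c0)
expected = alpha / np.sqrt(1.0 + alpha**2)   # about 0.371, not alpha = 0.4
```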