
Collaborators on CCB projects should look for greater detail here
My primary research interest is in developing mathematical methods for dimensionality reduction and system identification of nonlinear systems. I view these problems as fundamentally related. In most cases, the output side of a predictive model is a relatively simple function. In neuroscience, this is often nothing more than a set of action potential times in one output neuron, $y(t)=\sum_{i=1}^{n}\delta(t-\tau_i)$, where $\tau_i$ is the time of the $i$-th action potential (or occasionally in several neurons, which does not greatly complicate the output). In most cases this represents a massively smaller space than the set of all inputs that might be expected to regulate the output.
It is therefore reasonable to consider many complex predictive models in terms of a very simple predictive model operating on a driver potential, and a more complex process of calculating the driver potential. The driver potential will generally be a function with bandwidth similar to that of the desired output, but deterministically related to the values of the input space. If it is well constructed, the driver potential represents a form of compression or dimensionality reduction on the input space. In particular, it is the reduction that best preserves the elements of the input space salient to predicting the output in question. Given a solution to this problem, construction of a predictive model is often trivial.
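As a minimal sketch of this two-stage view, the Python below reduces a high-dimensional stimulus to a scalar driver potential and then applies a very simple output model. The linear projection, the threshold-crossing spike rule, and all names and parameter values (`driver_potential`, `spike_times`, `weights`, `threshold`) are assumptions chosen for illustration; the argument above requires only that the driver potential be a deterministic function of the input.

```python
import random

def driver_potential(stimulus, weights):
    """Reduce each high-dimensional stimulus frame to one scalar.

    A linear projection is an assumption made for illustration only.
    """
    return [sum(w * x for w, x in zip(weights, frame)) for frame in stimulus]

def spike_times(potential, threshold, dt):
    """Simple output model: one spike at each upward threshold crossing."""
    times = []
    for i in range(1, len(potential)):
        if potential[i - 1] < threshold <= potential[i]:
            times.append(i * dt)
    return times

random.seed(0)
n_dims, n_steps, dt = 50, 200, 0.001

# Rich input: 50 values per time step.
stimulus = [[random.gauss(0.0, 1.0) for _ in range(n_dims)]
            for _ in range(n_steps)]

# Hypothetical salient subspace: only the first 5 dimensions matter.
weights = [1.0 if k < 5 else 0.0 for k in range(n_dims)]

u = driver_potential(stimulus, weights)       # one value per time step
spikes = spike_times(u, threshold=2.0, dt=dt)  # a short list of spike times
```

In this toy setting all of the modeling difficulty sits in choosing `weights`; once the driver potential is available, the output stage is a few lines.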
In the context of sensory neuroscience, there is reason to believe that a system will have been selected to represent as much useful information as possible, given the output bandwidth. If this is the case, then the driver potential used in a reasonable model of such a system provides not just any dimensionality reduction on the input space, but a reduction to the most behaviorally relevant and biologically salient components.
Attempts to rigorously represent the process of sensory neural coding often phrase the coding problem in terms of seeking a (generally probabilistic) relationship between points in the (entire) input space, and the neuron output space. A common test of a putative model of this relationship is to attempt to reconstruct the input given the output. Failure of a given model to produce a good reconstruction of the input is often taken as evidence for insufficiency of that model. It is important to remember, however, that in most coding problems the output space is of much smaller dimension (or bandwidth) than the input space, as described above. In this situation, even a perfect model of a perfectly reliable encoder can't be expected to reconstruct the entire input space.
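A toy example makes this limit concrete. The sketch below assumes a noiseless linear encoder that maps a three-dimensional input to a single output value, and a minimum-norm reconstruction from that value; both are illustrative assumptions, not a model of any particular neuron. The reconstruction reproduces the output exactly when re-encoded, yet it cannot recover the input components the encoder ignores.

```python
def encode(x, w):
    """Perfectly reliable (noiseless) encoder: one output value per input."""
    return sum(wi * xi for wi, xi in zip(w, x))

def reconstruct(y, w):
    """Minimum-norm input consistent with output y.

    This is the pseudoinverse solution for a single-row linear encoder.
    """
    norm_sq = sum(wi * wi for wi in w)
    return [y * wi / norm_sq for wi in w]

w = [1.0, 0.0, 0.0]      # hypothetical encoder: only the first dimension is salient
x = [2.0, -3.0, 5.0]     # the input actually presented
x_alt = [2.0, 7.0, -1.0]  # a different input with the same salient component

y = encode(x, w)
x_hat = reconstruct(y, w)

# Squared reconstruction error over the whole input space.
error = sum((a - b) ** 2 for a, b in zip(x, x_hat))

print(y, x_hat, error)                    # 2.0 [2.0, 0.0, 0.0] 34.0
print(encode(x_hat, w) == y)              # True: the model of the encoder is exact
print(encode(x_alt, w) == encode(x, w))   # True: the two inputs are indistinguishable
```

The large error here reflects the dimension of the output, not any flaw in the model of the encoder.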
If we knew exactly what the salient stimulus space was for a given sensory system, then we could restrict our inputs to this space, and expect much better (and more informative) predictions from reconstruction models. In general, however, we do not know the salient input space. Often a major component of neural coding research is in fact to find out what this space is. Since we can't determine whether a stimulus feature is salient to the system if we never present it, the conservative strategy when searching for a salient stimulus space is to begin with extremely rich stimuli that err whenever possible in the direction of presenting too many stimulus features rather than too few. This approach can be effective, but it compounds the problem of constructing a concise model of a neural code, or making an accurate stimulus reconstruction. For this reason, the dimensionality reduction component of the system identification problem is also an essential part of describing a neural coding scheme.
In fact, an effective system identification model of a sensory neural system can be viewed as a highly compressed expression of the neural code. This expression may be more attainable, and also much more tractable to use, than the more usual idea of a codebook or general mapping from equivalence classes in the input space to equivalence classes in the output space. A proponent of the codebook, however, could argue the reverse: a good codebook is a system identification, and a good set of equivalence classes on the input is a dimensionality reduction.
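The correspondence between the two views can be sketched in a few lines of Python. Here a hypothetical reduction (`reduce_input`) and a hypothetical output rule (`respond`) are both invented for illustration: inputs that share a reduced value fall into the same equivalence class, and the codebook is simply the map from each class to its output.

```python
from collections import defaultdict

def reduce_input(x):
    """Hypothetical reduction: only the sum of the first two components matters."""
    return x[0] + x[1]

def respond(u):
    """Hypothetical output model: 1 (spike) if the reduced value exceeds 1."""
    return 1 if u > 1 else 0

inputs = [(0, 0, 5), (0, 0, 9), (1, 1, 5), (2, 0, 7), (0, 1, 3)]

# Equivalence classes on the input: inputs with the same reduced value.
classes = defaultdict(list)
for x in inputs:
    classes[reduce_input(x)].append(x)

# Codebook: one output per equivalence class.
codebook = {u: respond(u) for u in classes}

print(dict(classes))
# {0: [(0, 0, 5), (0, 0, 9)], 2: [(1, 1, 5), (2, 0, 7)], 1: [(0, 1, 3)]}
print(codebook)
# {0: 0, 2: 1, 1: 0}
```

Read in one direction, the reduction defines the classes; read in the other, labeling the classes defines a reduction.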