June 15, 2012

If you want to design robots able to interact with the real world in a useful way, you will eventually bump into the problem of implementing robust object recognition, where by "robust" I mean able to recognize objects irrespective of (or at least tolerant to variation in) distance from the object, its orientation, illumination conditions, and so on.

October 22, 2010
A recent poster by Los Alamos National Laboratory researchers led by Steven Brumby, titled "Visual Cortex on a Chip: Large-scale, real-time functional models of mammalian visual cortex on a GPGPU", shows another interesting application of graphics processing units (GPUs) to computational neuroscience. What is GPGPU? General-purpose computing on graphics processing units, i.e. using graphics hardware for computation beyond rendering.
February 2, 2010
One of the major themes in the SyNAPSE project is developing chips that can learn meaningful information, and preserve it over time. In other words: memristors can learn, but we need to ensure that they are stably learning something useful for the system they are embedded in.
Some help with this technological problem comes from neuroscience. How the cerebral cortex can develop stable memories while incorporating new information throughout an organism's lifetime has been a central question for many research groups. The talk posted on Neurdon describes one of these approaches.
July 16, 2009
I've thought a bit about how modelers approach brain areas whose functions are still not well constrained by robust neurophysiological data. By this, I mean that there is simply not enough data to say, in plain terms, what a particular brain area does. In visual cortex, this applies to pretty much all areas beyond V1, namely V2, V3, V4, posterior IT (ITp), and anterior IT (ITa), which form a loose hierarchy (in the order listed), plus whatever areas of the temporal lobe may be 'visual', e.g. entorhinal cortex. These words may sound a bit harsh, or even better, like flame-bait. Yet when a major computationalist publishes an article titled "How Close Are We to Understanding V1?" (to be read in the accusatory sense), and one takes into account that V1 is supposed to be the one area neuroscience figured out decades ago, well, that changes things.
June 28, 2009
I'm a 4th-year PhD student in the Institute of Cognitive Science at The University of Louisiana at Lafayette. When I entered the program, I was mostly interested in AI and evolutionary algorithms; I wanted to evolve a Go-playing program. But my interests shifted, especially in my first year when I read Jeff Hawkins' On Intelligence. I thought it was great stuff, and I liked two things central to his framework: 1) the temporal aspect of cognition, and 2) the crucial role of feedback. He made a convincing case that every modality and skill is essentially a matter of learning and processing sequences. So that's where I started focusing my attention.
June 9, 2009
First, a hearty welcome to Ethan; you're starting to make this whole enterprise a little less incestuous! Anyway, your recent post raises a number of interesting issues regarding inferotemporal cortex (IT), most prominently: how does IT learn to do what we think it does?
I'd first like to address what we think IT does, which is a step I find myself skipping quite a lot (awful scientist am I!). Based on a number of classical studies that compared lesions of IT with lesions of parietal cortex, for example, it was determined that IT mediated some form of visual discrimination and perhaps limited 'size constancy', or at least was a key pathway for whatever area in fact does this (see here, here, and here, for instance). The presumption, based on newer electrophysiology in macaque TE and TEO (analogous to anterior IT, ITa, and posterior IT, ITp, respectively), is that IT performs some sort of hashing to signal the presence of an object across sizes, retinal translations, clutter conditions, whatever.
June 4, 2009
Max asked me to post some information about how time could act as a 'supervising' learning signal to create invariant representations in IT (particularly in reference to Jim DiCarlo's work in this area). Since I am lazy, the post below is a modified section of the background from my thesis proposal; hopefully it's not too boring.
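As a concrete (and entirely toy) illustration of the idea, here is a minimal sketch of a Földiák-style trace rule, one classic way time can act as a supervisor: the Hebbian update is driven by a temporally smoothed activity trace, so views that occur close together in time (presumably the same object) get bound to the same feature. All names and parameter values here are mine, not from DiCarlo's work.

```python
import numpy as np

rng = np.random.default_rng(0)

def trace_rule(w, x_seq, eta=0.05, delta=0.8):
    """Foldiak-style trace rule: a Hebbian update driven by a temporally
    smoothed activity trace, so stimuli appearing close together in time
    are bound to the same weight vector."""
    y_trace = 0.0
    for x in x_seq:
        y = w @ x                            # instantaneous response
        y_trace = delta * y_trace + (1 - delta) * y
        w = w + eta * y_trace * x            # Hebbian step with the trace
        w = w / np.linalg.norm(w)            # keep the weights bounded
    return w

# Toy world: two "objects"; object A is shown as a temporal sequence
# of noisy views (the same object at slightly different positions, say).
obj_a = np.array([1.0, 1.0, 0.0, 0.0])
obj_b = np.array([0.0, 0.0, 1.0, 1.0])
views_a = [obj_a + 0.1 * rng.standard_normal(4) for _ in range(20)]

w = np.ones(4) / 2.0                         # unit-norm starting weights
w = trace_rule(w, views_a)

# After training on temporally contiguous views of A, the unit prefers A.
print(w @ obj_a > w @ obj_b)
```

Because the trace carries activity across successive views, the unit ends up responding to what the views have in common (object identity) rather than to any single snapshot, which is the core of the temporal-continuity argument.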
March 12, 2009
Most researchers presume that the meat of visual object recognition occurs in inferotemporal cortex (IT), though there is nothing near a consensus on how this is done (i.e., ahem, how the meat is prepared). Some claim that the firing of IT cells, in particular cells in anterior IT (ITa), represents categories of objects. That is, a cell might fire for cats and another for dogs, responding in the same way to different retinal images from one category. This simplistic view seems approximately correct given the volume of data amassed over the past 30 years in monkey electrophysiology, but the evidence remains frustratingly indirect. Only a few things are certain: (1) ITa cells love "complex" objects (i.e. something more complicated than an oriented bar), and (2) they appear to have large receptive fields relative to striate cortex. How these characteristics lead to the formation of category representations in IT is a mystery, and it will probably stay that way until we find better ways to look at IT cells, perhaps using two-photon calcium imaging. Current electrophysiological methods can record from tens of nearby cells at most, and imaging methods don't have the resolution to tell us what particular cells are doing at the millisecond time scale.
February 23, 2009
Riesenhuber and Poggio supplied a seminal model of object recognition in 1999. It derived a lot of its power from sheer simplicity: with just a few mathematical operations, it seemed to model the entirety of the ventral stream, the pathway of the brain dedicated to processing "what" information, i.e. information about the identity of an object. It starts with a layer of Gaussian-tuned 'simple' or S cells, which respond to particular line orientations; a particular S cell might respond to a diagonal line in a particular spot in an image. Then all S cells of the same orientation feed into a 'complex' or C cell, which takes on the response of the maximally activated S cell. In CS terms, the C cells compute a max over a local neighborhood.
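The S-to-C step can be sketched in a few lines of NumPy. This is my own toy rendition of the two operations (filter shapes, tuning width, and pooling size are arbitrary choices of mine, not the paper's):

```python
import numpy as np

def s_layer(image, filters, sigma=1.0):
    """S cells: Gaussian tuning of each local patch to a template.
    The response peaks at 1.0 when the patch matches the filter exactly."""
    fh, fw = filters.shape[1:]
    H, W = image.shape
    out = np.zeros((len(filters), H - fh + 1, W - fw + 1))
    for k, f in enumerate(filters):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                patch = image[i:i+fh, j:j+fw]
                out[k, i, j] = np.exp(-np.sum((patch - f) ** 2) / (2 * sigma ** 2))
    return out

def c_layer(s_maps, pool=2):
    """C cells: each C unit takes the MAX over a local neighborhood of
    S cells sharing a preferred orientation, trading precise position
    information for translation tolerance."""
    k, H, W = s_maps.shape
    out = np.zeros((k, H // pool, W // pool))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            out[:, i, j] = s_maps[:, i*pool:(i+1)*pool, j*pool:(j+1)*pool].max(axis=(1, 2))
    return out

# Toy demo: a vertical and a horizontal 2x2 edge template...
filters = np.array([[[1., 0.], [1., 0.]],   # vertical
                    [[1., 1.], [0., 0.]]])  # horizontal
image = np.zeros((6, 6))
image[:, 2] = 1.0                            # ...and a vertical line input
s = s_layer(image, filters)
c = c_layer(s)

# The vertical channel dominates the horizontal one after pooling.
print(c[0].max() > c[1].max())
```

The Gaussian tuning makes an S cell fire hardest when its patch matches the template; the max makes the C cell indifferent to exactly where in its neighborhood the match occurred, which is where the model's translation tolerance comes from.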
February 10, 2009
Humans are remarkably good at identifying the same face across illuminations, positions, deformations, and depths. The same face can even be identified through fences, glass, and water. The number of possible contexts a face can appear in is effectively infinite, yet we identify it instantaneously. For whatever reason, we are really good at identifying objects, but researchers have struggled to make computers even semi-competent at it. One of the more valiant efforts is Yann LeCun's use of convolutional nets, but their primary successes have been in controlled situations. Any reasonable person in the field would agree that any human can wipe the floor with even the best algorithm running on the best supercomputer (programmed by the best programmer in the best department in the best state in the best country!). So what gives?
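Part of what gives has a simple geometric face: two images of the same object can be nearly orthogonal as pixel vectors, so any matcher working in raw pixel space fails before it starts. A toy demonstration of my own (not from the post), contrasting raw pixels with a deliberately translation-invariant feature:

```python
import numpy as np

# Two retinal views of the "same" object: a bright square, shifted 3 px.
img = np.zeros((10, 10))
img[2:5, 2:5] = 1.0
shifted = np.roll(img, 3, axis=1)   # circular shift along columns

def cosine(a, b):
    return (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b))

# In raw pixel space the two views do not overlap at all...
cos_pixels = cosine(img, shifted)

# ...but a translation-invariant feature (Fourier magnitude) sees them as
# identical, since a circular shift only changes the phase spectrum.
mag = lambda x: np.abs(np.fft.fft2(x))
cos_invariant = cosine(mag(img), mag(shifted))

print(cos_pixels, cos_invariant)   # ~0.0 vs ~1.0
```

The Fourier-magnitude trick only buys invariance to circular translation, of course; illumination, pose, and deformation each break it, which is a small taste of why the general problem is so hard.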