
The latest and greatest

Jeff Markowitz | February 23, 2009

Riesenhuber and Poggio supplied a seminal model of object recognition in 1999. It derived much of its power from sheer simplicity. With just a few mathematical operations it seemed to model the entirety of the ventral stream, the pathway in the brain dedicated to processing “What” information, i.e. information about the identity of an object. It starts with a layer of Gaussian-tuned ‘simple’ or S cells, which respond to particular line orientations. That is, a particular S cell might respond to a diagonal line in a particular spot in an image. Then, S cells of the same orientation feed to a ‘complex’ or C cell, whose response equals that of the maximally activated S cell. In CS terms, each C cell takes a max over a local neighborhood.
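As a rough illustration of the two operations just described, here is a minimal Python sketch using only NumPy. The function names `s_response` and `c_response`, the 3x3 diagonal template, and the tuning width `sigma` are assumptions made for this example, not details taken from Riesenhuber and Poggio's implementation.

```python
import numpy as np

def s_response(patch, template, sigma=1.0):
    """Gaussian-tuned S cell: peaks at 1.0 when the patch equals its template."""
    dist2 = np.sum((patch - template) ** 2)
    return np.exp(-dist2 / (2.0 * sigma ** 2))

def c_response(s_values):
    """C cell: reports the response of its most strongly activated S cell."""
    return np.max(s_values)

# Hypothetical 3x3 template for a diagonal line.
diagonal = np.eye(3)

# Three S cells with the same tuning, each looking at a different patch:
# a blank patch, a matching diagonal, and the opposite diagonal.
patches = [np.zeros((3, 3)), np.eye(3), np.fliplr(np.eye(3))]
s_values = np.array([s_response(p, diagonal) for p in patches])

# The C cell pools over the three S cells with a max.
c_value = c_response(s_values)
print(s_values, c_value)
```

Only the S cell whose patch matches the template responds fully, and the C cell simply passes that strongest response along.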

This key operation results in model cells that demonstrate invariance to position. They respond in the same way to a diagonal line anywhere in the input image. For instance, a group of diagonally-tuned S cells will respond differently to a diagonal line at different parts of the input image, but the maximum over all these cells will not change as long as a diagonal line is present. So, after a few more stages of S and C cells, the model generates an invariant representation of a complex object, which is then fed to a classifier, perhaps a support vector machine (SVM). The SVM determines whether features in the invariant representation correspond to a particular category. This type of classifier presumably models the function of the prefrontal cortex, which has been implicated in categorization through work in Earl Miller’s lab. Even if the model skimmed over some of the finer details of the ventral stream, it presented a nice, easily understandable theory that you could implement in a matter of hours.
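The position invariance argument can be checked with a small Python sketch (NumPy only). Everything here is an assumption made for illustration: the helper names `s_layer` and `image_with_diagonal`, the 8x8 image, the 3x3 template, and `sigma`. It also pools over the whole image in one step, whereas the model builds invariance up gradually over several S and C stages.

```python
import numpy as np

def s_layer(image, template, sigma=1.0):
    """Responses of Gaussian-tuned S cells, one per image position."""
    k = template.shape[0]
    rows = image.shape[0] - k + 1
    cols = image.shape[1] - k + 1
    out = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            patch = image[r:r + k, c:c + k]
            dist2 = np.sum((patch - template) ** 2)
            out[r, c] = np.exp(-dist2 / (2.0 * sigma ** 2))
    return out

def image_with_diagonal(row, col, size=8, k=3):
    """Blank image with a k-pixel diagonal line whose corner sits at (row, col)."""
    img = np.zeros((size, size))
    img[row:row + k, col:col + k] = np.eye(k)
    return img

template = np.eye(3)  # hypothetical diagonal-line template

# The same diagonal line placed at two different positions.
s_a = s_layer(image_with_diagonal(0, 0), template)
s_b = s_layer(image_with_diagonal(4, 3), template)

# The S-cell activity patterns differ, but the max over them does not.
same_s_pattern = np.array_equal(s_a, s_b)
c_a, c_b = s_a.max(), s_b.max()
print(same_s_pattern, c_a, c_b)
```

The S layer tells the two images apart, while the pooled C response is identical for both, which is the sense in which the max operation buys invariance to position.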

Serre and colleagues have taken over the model, with the latest developments documented in a 2007 PNAS article (cited below). It has performed some very impressive technical feats like recognizing actions in a video feed, and, after nearly a decade, the model has surprisingly retained most of its original structure. In fact, aside from adopting the use of Gabor filters, it seems to use most of the same essential computations. But along with these computations comes a bit of theoretical baggage. Like most models of object recognition, this model works under the presumption that cells become more and more invariant progressing through the ventral stream. In other words, the receptive fields of IT cells should be much larger than those of V1 cells (even the notion of IT cells having receptive fields is tricky, since IT has no retinotopy). This is fine as a rough approximation, but recent data in humans and monkeys cast some doubt on the specifics of this hypothesis. The ventral stream may construct an invariant representation, but it’s unclear if it does so according to the principles of Serre et al.’s model (still, this shouldn’t detract from its technical accomplishments).

Of course, at the moment, it looks like we’re woefully short on alternatives.

PNAS, 2007. DOI: 10.1073/pnas.0700622104

(Image from Flickr user Chuckumentary)


Tags
object recognition, poggio, riesenhuber, serre