
A Real Test for Object Recognition

Jeff Markowitz | February 10, 2009

Humans are remarkably good at identifying the same face across illuminations, positions, deformations, and depths. The same face can even be identified through fences, glass, and water. The number of possible contexts in which a face can appear is effectively infinite, yet we can identify it almost instantaneously. For whatever reason, we are really good at identifying objects, but researchers have struggled to make computers even semi-competent at it. One of the more valiant efforts is Yann LeCun’s use of convolutional nets, but their primary successes are in controlled situations. Any reasonable person in the field would agree that any human can wipe the floor with even the best algorithm running on the best supercomputer (programmed by the best programmer in the best department in the best state in the best country!). So what gives?

A recent article from Pinto, Cox and DiCarlo points to a fundamental flaw in the current approach: the metric. Most algorithms for object recognition are built with the famed Caltech 101 or Caltech 256 database in mind. At first glance, these datasets seem to be perfectly natural tests for anything that purports to recognize objects. Particular objects, e.g. cockroaches, are presented in a panoply of contexts. So, if my algorithm can recognize a cockroach on a tree, on a piece of bark, and on a white background, it’s doing its job, right? Well, it turns out that using only natural images, a recent craze in image processing, allows algorithms to leverage statistics in the image that are not part of the object itself. This is fine in itself, and such statistics ought to be used. Yet Pinto et al. demonstrate that even a decidedly “stupid” algorithm can perform as well as the latest and greatest when Caltech 101 is the metric.

More precisely, they used a bank of linear filters to grossly approximate the function of V1, along with an off-the-shelf support vector machine library. That is a crude model: neuroscientists have implicated, at the very least, V1, V2, V4, IT, PFC, and perhaps parietal cortex in object recognition, and this one stops at the first stage. What this demonstrates, among other things, is that the Caltech 101 database is not as hard as it seems. If this simple null model can perform on par with the state of the art, then either object recognition algorithms have gone nowhere or the metric is all wrong. I side with the latter view, as do Pinto et al., I presume. Still, I would venture to guess that more than a few algorithms get by thanks to the severely lacking metric. Pinto et al.’s proposed dataset of precisely and parametrically varied ray-traced images helps to fill the gaps of Caltech 101/256. Namely, an algorithm cannot “cheat” by using information from the environment or the particularly artistic lighting of the photographer. Instead, it must recognize an object across all views, rotations, scales, and illuminations, and in noise. Thankfully, their null model completely chokes on this dataset, but I’d like to see how more souped-up models fare.
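
To make the idea of a “stupid” null model concrete, here is a minimal sketch of a linear filter bank feeding a linear SVM. It is not Pinto et al.'s actual model. The Gabor filters, the toy striped images, every function name, and the hand-rolled hinge-loss trainer (a stand-in for an off-the-shelf SVM library) are my own assumptions for illustration.

```python
import numpy as np

def gabor_kernel(size, theta, wavelength, sigma):
    """Cosine grating under a Gaussian window (an assumed V1-like filter)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * x_rot / wavelength)
    return kernel - kernel.mean()  # zero mean: ignores uniform brightness

def filter_bank(n_orientations=4, size=9, wavelength=4.0, sigma=2.0):
    """One kernel per orientation, evenly spaced over 180 degrees."""
    thetas = np.arange(n_orientations) * np.pi / n_orientations
    return [gabor_kernel(size, t, wavelength, sigma) for t in thetas]

def v1_like_features(image, bank):
    """Filter with each kernel (circular convolution via FFT), rectify,
    then average over the image: one number per orientation."""
    image_f = np.fft.fft2(image)
    feats = []
    for kernel in bank:
        kernel_f = np.fft.fft2(kernel, s=image.shape)  # zero-padded kernel
        response = np.real(np.fft.ifft2(image_f * kernel_f))
        feats.append(np.abs(response).mean())
    return np.array(feats)

def make_grating(label, rng, size=32, noise=0.5):
    """Toy image: stripes varying along columns (label -1) or rows (label +1),
    with random phase and additive Gaussian noise."""
    rows, cols = np.mgrid[0:size, 0:size]
    coord = cols if label < 0 else rows
    phase = rng.uniform(0.0, 2.0 * np.pi)
    return np.sin(2.0 * np.pi * coord / 4.0 + phase) + noise * rng.normal(size=(size, size))

def train_linear_svm(X, y, lam=1e-3, lr=0.01, epochs=300):
    """Linear SVM fit by full-batch subgradient descent on the hinge loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        violators = y * (X @ w + b) < 1.0  # samples inside the margin
        grad_w = lam * w - (y[violators, None] * X[violators]).sum(axis=0) / len(y)
        grad_b = -y[violators].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def run_demo(seed=0, n_train=40, n_test=40):
    """Train on toy gratings and return held-out accuracy."""
    rng = np.random.default_rng(seed)
    bank = filter_bank()

    def dataset(n):
        y = np.where(np.arange(n) % 2 == 0, -1.0, 1.0)
        X = np.array([v1_like_features(make_grating(label, rng), bank) for label in y])
        return X, y

    X_train, y_train = dataset(n_train)
    X_test, y_test = dataset(n_test)
    w, b = train_linear_svm(X_train, y_train)
    return float(np.mean(np.sign(X_test @ w + b) == y_test))

if __name__ == "__main__":
    print("held-out accuracy on toy gratings:", run_demo())
```

The toy task is deliberately easy, which is rather the point: a pipeline this shallow knows nothing about objects, so a benchmark it can do well on is not measuring much. Swapping the gratings for real image features would be the obvious next step, and the pooling here throws away all spatial layout, so treat it as a cartoon of the approach.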

PLoS Computational Biology, 2008. DOI: 10.1371/journal.pcbi.0040027

(Image from Flickr user Max Kiesler)


One Response to “A Real Test for Object Recognition”

  1. 730prof says:
    February 16, 2009 at 8:42 pm

    Your thesis is at odds with the content of the first (default) stream at: http://web.mit.edu/serre/www/InTheNews.htm (the Feb 19th entry at CN730-2009 website). Be prepared to defend yourself!
