Monday, July 29, 2019

We See Shapes, DL Sees Textures

An interesting observation. What are its ultimate implications for AI?

Where We See Shapes, AI Sees Textures, in Quanta Magazine
To researchers’ surprise, deep learning vision algorithms often fail at classifying images because they mostly take cues from textures, not shapes.

To make deep learning algorithms use shapes to identify objects, as humans do, researchers trained the systems with images that had been “painted” with irrelevant textures. The systems’ performance improved, a result that may hold clues about the evolution of our own vision.

Jordana Cepelewicz    Staff Writer

When you look at a photograph of a cat, chances are that you can recognize the pictured animal whether it’s ginger or striped — or whether the image is black and white, speckled, worn or faded. You can probably also spot the pet when it’s shown curled up behind a pillow or leaping onto a countertop in a blur of motion. You have naturally learned to identify a cat in almost any situation. In contrast, machine vision systems powered by deep neural networks can sometimes even outperform humans at recognizing a cat under fixed conditions, but images that are even a little novel, noisy or grainy can throw off those systems completely.

A research team in Germany has now discovered an unexpected reason why: While humans pay attention to the shapes of pictured objects, deep learning computer vision algorithms routinely latch on to the objects’ textures instead.

This finding, presented at the International Conference on Learning Representations in May, highlights the sharp contrast between how humans and machines “think,” and illustrates how misleading our intuitions can be about what makes artificial intelligences tick. It may also hint at why our own vision evolved the way it did. ... 

Deep learning algorithms work by, say, presenting a neural network with thousands of images that either contain or do not contain cats. The system finds patterns in that data, which it then uses to decide how best to label an image it has never seen before. The network’s architecture is modeled loosely on that of the human visual system, in that its connected layers let it extract increasingly abstract features from the image. But the system makes the associations that lead it to the right answer through a black-box process that humans can only try to interpret after the fact. “We’ve been trying to figure out what leads to the success of these deep learning computer vision algorithms, and what leads to their brittleness,” said Thomas Dietterich, a computer scientist at Oregon State University who was not involved in the new study. ...
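As a rough illustration of "finding patterns in labeled data", here is a toy sketch in Python. It is not the researchers' experiment and not a deep network: each "image" is reduced to two made-up numbers, a shape cue and a texture cue, and a two-weight logistic model is fit by gradient descent. The feature values, the function names, and the assumption that the texture cue is the stronger signal are all invented for this example. In training the two cues always agree; the model is then shown an input where they disagree.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=200, lr=0.5):
    """Fit a two-weight logistic model by full-batch gradient descent."""
    w_shape, w_texture = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        g_shape = g_texture = 0.0
        for shape, texture, label in examples:
            # Error between the true label and the predicted probability.
            err = label - sigmoid(w_shape * shape + w_texture * texture)
            g_shape += err * shape
            g_texture += err * texture
        w_shape += lr * g_shape / n
        w_texture += lr * g_texture / n
    return w_shape, w_texture

def predict(weights, shape, texture):
    """Return the model's probability that the input is a cat."""
    w_shape, w_texture = weights
    return sigmoid(w_shape * shape + w_texture * texture)

# Hypothetical training set: (shape cue, texture cue, label), 1 = cat.
# Assumption: both cues always agree, but texture is the larger signal.
training_set = [(0.5, 1.0, 1)] * 50 + [(-0.5, -1.0, 0)] * 50
weights = train(training_set)

familiar_cat = predict(weights, 0.5, 1.0)    # cat shape, cat texture
cue_conflict = predict(weights, 0.5, -1.0)   # cat shape, non-cat texture

print("weights (shape, texture):", weights)
print("familiar cat looks like a cat:", familiar_cat > 0.5)
print("cue-conflict input looks like a cat:", cue_conflict > 0.5)
```

Under these assumptions the model puts more weight on the texture cue, so the cat-shaped input with the wrong texture is labeled "not a cat". Nothing in the training data told it that shape was the thing that mattered.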