
Tuesday, July 16, 2019

How Much Knowledge Has Been Created?

We explored this early on with image tagging in the enterprise.  And while we developed lots of specific use cases, nothing proved as broadly usable as we had hoped.

The data that trains AI increasingly calls into question AI
After 10 years of ImageNet, AI researchers are digging into the details of test sets and some are asking just how much knowledge has really been created with machine learning.
By Tiernan Ray in ZDNet

It's been 10 years since two landmark data sets appeared in the world of machine learning, ImageNet and CIFAR-10: collections of pictures that have been used to train untold numbers of deep learning neural networks for computer vision. The venerable age of the data has prompted some AI researchers to ask what goes on in those data sets, and what their longevity means for machine learning in the bigger picture.

As a result, 2019 may mark the year the data indicted some of the fundamental beliefs about AI.

Researchers in machine learning are getting much more specific and rigorous about understanding how the choice of data affects the success of neural networks.

And the results are somewhat unsettling. Recent work suggests at least some of the success of neural networks, including state-of-the-art deep learning models, is tied to small, idiosyncratic elements of the data used to train those networks.

Exhibit A is a study put out in February and revised in June by Benjamin Recht and colleagues at UC Berkeley, with the amusing title "Do ImageNet Classifiers Generalize to ImageNet?"

They tried to reconstruct ImageNet, in a sense, by duplicating the original curation process: gathering images from Flickr, then having people on Amazon's Mechanical Turk service look at the images and assign labels.
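As a rough illustration of that curation step, here is a minimal Python sketch of an agreement filter of the kind such a replication relies on: a candidate image is kept only if enough workers agreed it matches the proposed label. The function name, data layout, and 0.7 threshold are illustrative assumptions, not the paper's actual pipeline (the paper tunes this "selection frequency" carefully).

```python
def aggregate_annotations(annotations, min_agreement=0.7):
    """Keep a candidate image only if enough workers agreed it fits the label.

    annotations: dict mapping image_id -> list of worker votes
                 (True if the worker said the image matches the label).
    min_agreement: fraction of positive votes required (a hypothetical
                   threshold standing in for the paper's selection frequency).
    """
    accepted = []
    for image_id, votes in annotations.items():
        if votes and sum(votes) / len(votes) >= min_agreement:
            accepted.append(image_id)
    return accepted

# Example: three candidate images, each voted on by five workers.
votes = {
    "img_001": [True, True, True, True, False],    # 0.8 agreement -> kept
    "img_002": [True, False, False, True, False],  # 0.4 agreement -> dropped
    "img_003": [True, True, True, True, True],     # 1.0 agreement -> kept
}
print(aggregate_annotations(votes))  # ['img_001', 'img_003']
```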

[Image caption: The original screen from back in 2009 instructing Amazon Mechanical Turk workers to pick images that fit with labels. It kicked off a decade of development of more and more advanced computer vision neural networks.]

The goal was to create a new "test" set of images, a set that's like the original group of pictures, but never seen before, to see how well all the models that have been developed on ImageNet in the past decade generalize to new data.
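A minimal PyTorch sketch of that kind of evaluation, assuming the original and replicated test sets are already available as datasets; the dataset and model names here are placeholders, not the paper's code:

```python
import torch
from torch.utils.data import DataLoader
# from torchvision.models import resnet50   # e.g. model = resnet50(pretrained=True)

def top1_accuracy(model, loader, device="cpu"):
    """Fraction of images whose highest-scoring class matches the label."""
    model.eval().to(device)
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1)
            correct += (preds == labels.to(device)).sum().item()
            total += labels.size(0)
    return correct / total

# Hypothetical datasets: `original_test` is the standard test split;
# `new_test` is a freshly collected replication set. Any drop from
# acc_old to acc_new measures failure to generalize beyond the
# original test images.
# acc_old = top1_accuracy(model, DataLoader(original_test, batch_size=64))
# acc_new = top1_accuracy(model, DataLoader(new_test, batch_size=64))
# print(f"accuracy drop: {acc_old - acc_new:.3f}")
```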

The results were mixed. The various deep learning image recognition models that followed one another in time, such as the classic "AlexNet" and, later, more sophisticated networks such as "VGG" and "Inception," still showed improvement from generation to generation. In fact, on this new test set, levels of improvement were actually amplified. ... "
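To make "amplified" concrete, here is a small sketch with entirely hypothetical accuracy numbers (not the paper's results) showing the arithmetic: each successive model gains at least as much on the new test set as on the original one.

```python
# Hypothetical top-1 accuracies, invented for illustration only.
results = {
    "AlexNet":   {"original": 0.57, "new": 0.44},
    "VGG":       {"original": 0.72, "new": 0.61},
    "Inception": {"original": 0.78, "new": 0.69},
}

names = list(results)
for prev, curr in zip(names, names[1:]):
    gain_old = results[curr]["original"] - results[prev]["original"]
    gain_new = results[curr]["new"] - results[prev]["new"]
    # "Amplified" improvement: the generation-to-generation gain is
    # at least as large on the new test set as on the original one.
    print(f"{prev} -> {curr}: +{gain_old:.2f} original, +{gain_new:.2f} new")
```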
