
Thursday, July 23, 2020

Image GPT

Been looking at GPT from OpenAI, and at their site found Image GPT. There are a number of links to technical papers in the pieces below:

OpenAI first described GPT-3 in a research paper published in May. But last week it began drip-feeding the software to selected people who requested access to a private beta. For now, OpenAI wants outside developers to help it explore what GPT-3 can do, but it plans to turn the tool into a commercial product later this year, offering businesses a paid-for subscription to the AI via the cloud. ... 

Image GPT
We find that, just as a large transformer model trained on language can generate coherent text, the same exact model trained on pixel sequences can generate coherent image completions and samples. By establishing a correlation between sample quality and image classification accuracy, we show that our best generative model also contains features competitive with top convolutional nets in the unsupervised setting.

Introduction
Unsupervised and self-supervised learning, or learning without human-labeled data, is a longstanding challenge of machine learning. Recently, it has seen incredible success in language, as transformer models like BERT, GPT-2, RoBERTa, T5, and other variants have achieved top performance on a wide array of language tasks. However, the same broad class of models has not been successful in producing strong features for image classification. Our work aims to understand and bridge this gap.

Transformer models like BERT and GPT-2 are domain agnostic, meaning that they can be directly applied to 1-D sequences of any form. When we train GPT-2 on images unrolled into long sequences of pixels, which we call iGPT, we find that the model appears to understand 2-D image characteristics such as object appearance and category. This is evidenced by the diverse range of coherent image samples it generates, even without the guidance of human provided labels. As further proof, features from the model achieve state-of-the-art performance on a number of classification datasets and near state-of-the-art unsupervised accuracy[1] ...
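
To make the "unrolled into long sequences of pixels" idea concrete, here is a minimal sketch of flattening a small image into a 1-D sequence and building the next-pixel prediction pairs a sequence model would train on. This is an illustration, not OpenAI's code: raster (row-major) order, grayscale integer pixels, and the function names `unroll`, `next_pixel_pairs`, and `reroll` are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical image format for this sketch: rows of grayscale intensities.
Image = List[List[int]]


def unroll(image: Image) -> List[int]:
    """Flatten a 2-D image into a 1-D sequence in raster (row-major) order."""
    return [pixel for row in image for pixel in row]


def next_pixel_pairs(sequence: List[int]) -> List[Tuple[List[int], int]]:
    """Build (context, target) pairs: each pixel is predicted from all pixels before it."""
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]


def reroll(sequence: List[int], width: int) -> Image:
    """Reshape a 1-D pixel sequence back into rows of the given width."""
    if width <= 0 or len(sequence) % width != 0:
        raise ValueError("sequence length must be a positive multiple of width")
    return [sequence[i:i + width] for i in range(0, len(sequence), width)]


# A tiny 2x2 image stands in for a real picture.
image = [[0, 64], [128, 255]]
sequence = unroll(image)
pairs = next_pixel_pairs(sequence)
```

Once the image is a plain sequence, nothing in the training setup needs to know it was ever 2-D, which is the sense in which the excerpt calls these models domain agnostic. `reroll` is only there to turn a generated sequence back into a viewable grid.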
