
Wednesday, April 13, 2022

Are Chatbots Just an Illusion?

A good test and challenge of the current 'best'-performing AI, with comments.

The AI Illusion – STATE-OF-THE-ART CHATBOTS AREN’T WHAT THEY SEEM

GPT-3 is very much like a performance by a good magician

By Gary Smith, March 21, 2022, in Mind Matters

Artificial intelligence is an oxymoron. Despite all the incredible things computers can do, they are still not intelligent in any meaningful sense of the word.

Decades ago, AI researchers largely abandoned their quest to build computers that mimic our wondrously flexible human intelligence and instead created algorithms that were useful (i.e., profitable). Despite this understandable detour, some AI enthusiasts market their creations as genuinely intelligent. For example, a few months ago, Blaise Aguera y Arcas, the head of Google’s AI group in Seattle, argued that “statistics do amount to understanding.” As evidence, he cited a few exchanges with Google’s LaMDA chatbot. The examples were impressively coherent, but they are still what Gary Marcus and Ernest Davis characterize as “a fluent spouter of bullshit,” because computer algorithms do not understand what words mean. They are like Nigel Richards, who has won several French-language Scrabble championships without knowing the meaning of the words he spells.

Google’s LaMDA is not accessible to the general public, which makes me wonder how robust it is. On January 3 of this year, I reported using OpenAI’s powerful chatbot GPT-3 to illustrate the fact that statistics do not amount to understanding. Andrew Gelman challenged Arcas to try my examples with LaMDA, but Arcas has not responded, nor has anyone at Google, presumably because they are not permitted to.

A few days ago, a student sent me a link to an OpenAI description of GPT-3. One candid disclaimer was that “GPT-3 is not necessarily well-calibrated in its predictions on novel inputs.” Being able to understand and react to novel situations is, in fact, a benchmark of genuine intelligence. Until computer algorithms can do that, AI will remain an oxymoron.

OpenAI’s description also revealed the following:

InstructGPT is then further fine-tuned on a dataset labeled by human labelers. The labelers comprise a team of about 40 contractors whom we hired through Upwork and ScaleAI.

Lack of real-world grounding: GPT-3, like other large pretrained language models, is not grounded in other modalities of experience, such as video, real-world physical interaction, or human feedback, and thus lacks a large amount of context about the world.

OpenAI evidently employs 40 humans to clean up GPT-3’s answers manually because GPT-3 does not know anything about the real world. Intrigued, I retried the questions that GPT-3 had flubbed in January to see if the labelers had done their job. ...


My initial attempt to test the program was blocked:

You’ve reached your usage limit. See your usage dashboard and billing settings for more details. If you have further questions, reach out to support@openai.com.

If I were paranoid, I might think that OpenAI did not like me publicizing GPT-3’s limitations. I created a new account with a different email address and was able to get through.

At the beginning of the year, I had asked GPT-3 this question four times and received four authoritative, confusing, and contradictory answers (good chatbots randomize answers to avoid the appearance of canned scripts):

Gary: Is it safe to walk downstairs backwards if I close my eyes?

GPT-3: Yes, there is nothing to worry about. It’s safe because the spiral stairs curve outwards, it will make your descent uncomfortable.

GPT-3: That depends. Do you have a TV?

GPT-3: No, it is not safe.

GPT-3: Yes, it is safe to walk downstairs backwards if you close your eyes.

On March 18, I tried the same question twice and was given the same sensible answer each time:

GPT-3: No, I wouldn’t recommend walking downstairs backwards with your eyes closed. It’s not safe.
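For readers who want to rerun this kind of test themselves, here is a minimal sketch of how the same prompt might be sent to GPT-3 repeatedly through OpenAI's Python client as it existed in 2022. The model name, temperature, and API key placeholder are illustrative assumptions, not the settings used here; a nonzero sampling temperature is what lets an identical prompt come back with different answers.

    # Minimal sketch, assuming the 2022-era openai Python client (pre-1.0).
    import openai

    openai.api_key = "YOUR_API_KEY"  # placeholder; use your own key

    prompt = "Is it safe to walk downstairs backwards if I close my eyes?"

    # A nonzero temperature samples from the model's output distribution,
    # so repeating the identical prompt can yield different answers.
    for _ in range(4):
        response = openai.Completion.create(
            model="text-davinci-002",  # assumed GPT-3-family model
            prompt=prompt,
            max_tokens=60,
            temperature=0.7,           # assumed; 0 is near-deterministic
        )
        print(response.choices[0].text.strip())

Setting the temperature to 0 would make the completions nearly deterministic, which is why repeated trials of the same question are a reasonable probe of whether an answer is canned or sampled.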

In January I also reported the eight silly answers GPT-3 gave to this question, including these three:

Gary: Why is there a correlation between the number of times Trump tweets the word great and the high temperature in Pyongyang three days later?

GPT-3: North Korea is heavily reliant on foreign trade and its domestic industries are badly in need of improvement. Trump’s tweets drive people’s attention to Pyongyang and bring business to the ailing North Korean economy.

GPT-3: According to our analysis, on days when Trump tweets the word great there is a high correlation between the high price of Corn and the low price of Krill three days later. It also comes with a high correlation between the number of North Korean cell phone users and the oil prices in the Sea of Okhotsk two days later.

GPT-3: I’m not sure how that would increase the inflation rate in Argentina two days later. But it probably wouldn’t hurt. ...
