ChatGPT's Work Lacks Transparency and That Is a Problem
Photo: ChatGPT logo in an illustration from March 31, 2023
by Carter C. Price, May 8, 2023
After getting a set of questions about the COVID-19 pandemic and putting together my responses, I thought I would see how ChatGPT would do.
While ChatGPT could not provide concrete data or citations to back up its points, even with more prompting, and it missed some nuance, its answer is a decent first pass at a summary of the consensus view on the key takeaways from the pandemic. And that is the problem.
At their core, ChatGPT and other Large Language Models (LLMs) estimate the most likely next word, phrase, or sentence to follow a prompt from a user. To make these estimates, LLMs are trained on millions or even billions of texts, including recent news stories, articles, and other writing. When asked about lessons learned from the COVID-19 pandemic, ChatGPT draws on this vast corpus to predict the most likely set of sentences in response.
To that end, the response can be seen as an amalgamation of the training pool of writing on COVID-19, text on lessons learned, and the general rules of language from the full corpus. This works well at a high level, but when prompted for more details about the specifics of the points presented, an LLM may not have the appropriate details in its corpus and cannot necessarily predict the best information to provide. This can make the content of responses to follow-up questions underwhelming or even false (when asked for sources, ChatGPT produced three references that appear to be fictitious).
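To make that prediction mechanic concrete, here is a toy sketch of next-word prediction using simple bigram counts over a tiny made-up corpus. It is purely illustrative, and the corpus and function names are invented for this example; real LLMs like ChatGPT use neural networks over subword tokens rather than word counts. But the core idea, predicting the most likely continuation from training text, is the same, and so is the failure mode when the training text has nothing relevant to offer.

```python
# Toy illustration of next-token prediction (NOT how ChatGPT is built).
# A bigram model: count which word most often follows each word in training.
from collections import Counter, defaultdict

# A tiny invented "training corpus" for demonstration purposes only.
corpus = [
    "public health infrastructure matters",
    "public health funding matters",
    "public trust in health agencies matters",
]

# For each word, count the words seen immediately after it.
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1

def most_likely_next(word):
    """Return the most frequent continuation seen in training, if any."""
    counts = next_word_counts[word]
    return counts.most_common(1)[0][0] if counts else None

print(most_likely_next("public"))    # -> 'health' (most common in training)
print(most_likely_next("pandemic"))  # -> None: nothing in the corpus to draw on
```

The second call illustrates the article's point: when the training pool lacks the relevant specifics, the model has no good basis for a continuation. A real LLM, unlike this sketch, will still produce fluent text in that situation, which is exactly how plausible-sounding but fictitious details can emerge.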
The first point regarding public health infrastructure is not necessarily inaccurate but is by no means obvious or a settled question. Many of the poorest countries with relatively weak health infrastructure fared quite well due in large part to their populations being much younger than those in wealthier nations.
Even when comparing only rich nations, there was substantial variation in COVID-19 outcomes and in health spending. While public health infrastructure was important in the COVID-19 response, it is not the clear takeaway from international comparisons that ChatGPT's phrasing suggests (there may be a stronger case within the United States). The other points made by ChatGPT are stronger, but not without their shortcomings.
With these content critiques in mind, ChatGPT's five points still provide a useful starting point for a summary of key takeaways. However, the fact that the response is coherently written and, at first glance, quite reasonable is itself a problem because of the lack of transparency.
When someone reads a research study or a newspaper article, facts are typically sourced, and those sources have (hopefully) been verified. There have been high-profile cases of journalists fabricating material, but those cases are high profile precisely because they are so rare. The same cannot be said for LLMs at present.
In a world with LLMs, there is a growing need for modernized data literacy. Basic numeracy is useful when reading a statistical analysis, but it is not sufficient for understanding how to treat outputs from LLMs and other modern AI. Developers need to be more transparent about their algorithms and data sources so that people can assess the inherent sources of bias or problems with the approach.
Users of LLMs may find them a handy shortcut for drafting material, but they should be wary of factual statements the models make and should read the output with a careful and critical eye. While LLMs like ChatGPT have many uses, providing deep commentary or useful policy analysis is not yet one of them.
Here's how ChatGPT handled the assignment of writing about the pandemic: ...