Google Publishes Technique for AI Language Model Self-Improvement
by Anthony Alford, Director, Development at Genesys Cloud Services
Researchers at Google and the University of Illinois at Urbana-Champaign (UIUC) have published a technique called Language Model Self-Improved (LMSI), which fine-tunes a large language model (LLM) on a dataset generated by that same model. Using LMSI, the researchers improved the performance of the LLM on six benchmarks and set new state-of-the-art accuracy records on four of them.
The team began with a pre-trained 540B parameter PaLM model. The model was given as input questions from an unlabeled training dataset, along with chain-of-thought prompts. The model generated answers for these questions, which were then used along with the inputs as a fine-tuning training dataset. The fine-tuned model was then evaluated on a suite of benchmark datasets for three different natural language processing (NLP) tasks: arithmetic reasoning, commonsense reasoning, and natural language inference. On four of the benchmarks---ARC-c, OpenBookQA, ANLI-A2 and ANLI-A3---the model outperformed previous records. According to the Google team:
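In outline, the data-generation step pairs each unlabeled question with the model's own chain-of-thought answer, and that self-generated dataset becomes the fine-tuning set. The following Python sketch illustrates the loop under stated assumptions: model.generate and model.fine_tune are hypothetical stand-ins for a real LLM API, the exemplar text is illustrative, and details of the actual PaLM pipeline, such as sampling multiple reasoning paths and filtering answers, are omitted.

```python
# Minimal sketch of the self-training loop described above (not the actual
# PaLM/LMSI code). `model.generate` and `model.fine_tune` are hypothetical
# stand-ins for whatever LLM API is available.

from dataclasses import dataclass
from typing import List

# One hand-written chain-of-thought exemplar prepended to every question.
COT_EXEMPLAR = (
    "Q: A robe takes 2 bolts of blue fiber and half that much white fiber. "
    "How many bolts does it take in total?\n"
    "A: It takes 2 / 2 = 1 bolt of white fiber, so 2 + 1 = 3 bolts in total. "
    "The answer is 3.\n\n"
)

@dataclass
class TrainingExample:
    prompt: str       # unlabeled question with the CoT exemplar prepended
    completion: str   # the model's own chain-of-thought answer

def build_self_training_set(model, unlabeled_questions: List[str]) -> List[TrainingExample]:
    """Have the model answer unlabeled questions and keep the (prompt, answer) pairs."""
    examples = []
    for question in unlabeled_questions:
        prompt = COT_EXEMPLAR + f"Q: {question}\nA:"
        answer = model.generate(prompt)      # hypothetical generation call
        examples.append(TrainingExample(prompt=prompt, completion=answer))
    return examples

def self_improve(model, unlabeled_questions: List[str]):
    """Fine-tune the model on the dataset it generated for itself."""
    dataset = build_self_training_set(model, unlabeled_questions)
    return model.fine_tune(dataset)          # hypothetical fine-tuning call
```

The key design point is that no human-written labels enter the loop: the only supervision comes from answers the pre-trained model produces for itself.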
We hope our simple approach and strong empirical results could encourage more future work by the community to investigate optimal performances of pretrained LLMs without additional human supervision.... As part of our future work, we plan to combine large-scale generated data from our approach and existing supervised data, to further improve the performance of LLMs.
Chain-of-thought (CoT) prompting augments the input question given to a language model by prepending an example question, the reasoning steps used to arrive at its answer, and the answer itself. InfoQ recently covered Google's PaLM model, which, when used with CoT prompting, achieves state-of-the-art few-shot performance on several reasoning benchmarks. Given this few-shot performance, the LMSI researchers wanted to investigate PaLM's performance when fine-tuned on additional datasets. ...
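To make the prompting format concrete, here is a small illustrative sketch of assembling a chain-of-thought prompt; the exemplar question, the wording, and the function name are hypothetical examples, not the exact prompts used in the paper.

```python
# Illustrative chain-of-thought prompt: an example question, the reasoning
# steps, and the final answer are prepended before the new question.

def chain_of_thought_prompt(new_question: str) -> str:
    exemplar = (
        "Q: There are 15 trees in the grove. After the workers plant more trees, "
        "there will be 21 trees. How many trees did the workers plant?\n"
        "A: There are 15 trees originally and 21 trees afterwards, "
        "so the workers planted 21 - 15 = 6 trees. The answer is 6.\n\n"
    )
    return exemplar + f"Q: {new_question}\nA:"

# Example usage: the model sees the worked example before the new question,
# which encourages it to produce step-by-step reasoning of its own.
print(chain_of_thought_prompt("A train travels 90 miles in 1.5 hours. What is its average speed?"))
```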