Linguists are building and using AI language models, as commented on here in UPenn's Language Log.
GLM-130B: An Open Bilingual Pre-Trained Language Model
January 25, 2023 @ 9:10 am · Filed by Victor Mair under Artificial intelligence, Computational linguistics
Description of a General Language Model (GLM) project based at Tsinghua University in Beijing, but with users and collaborators around the world.
Homepage (August 4, 2022)
This prospectus is difficult for outsiders to understand because of the large number of unexplained acronyms, abbreviations, initialisms, and other insider terminology.
GLM-130B is an open bilingual (English & Chinese) bidirectional dense model with 130 billion parameters, pre-trained using the General Language Model (GLM) algorithm. It is designed to support inference tasks with the 130B parameters on a single A100 (40G * 8) or V100 (32G * 8) server. As of July 3rd, 2022, GLM-130B has been trained on over 400 billion text tokens (200B each for Chinese and English) and exhibits the following unique features:
Bilingual: supports both English and Chinese.
Performance (EN): better than GPT-3 175B (+5.0%), OPT-175B (+6.5%), and BLOOM-176B (+13.0%) on LAMBADA and slightly better than GPT-3 175B (+0.9%) on MMLU. … (much more)
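The hardware requirement quoted above comes down to simple memory arithmetic. Here is a rough sketch of why an 8-GPU server of that size is needed; it counts only the model weights (activations and other inference-time buffers add further overhead), and the byte-per-parameter figures are standard numeric precisions, not numbers taken from the GLM-130B documentation.

```python
# Back-of-the-envelope estimate of the GPU memory needed just to hold the
# 130B weights at common precisions. A sketch, not the project's own sizing.

PARAMS = 130e9  # 130 billion parameters

def weight_memory_gib(bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GiB."""
    return PARAMS * bytes_per_param / 1024**3

for precision, bytes_per_param in [("FP32", 4), ("FP16", 2), ("INT8", 1)]:
    print(f"{precision}: ~{weight_memory_gib(bytes_per_param):.0f} GiB for weights")

# FP16 weights alone come to roughly 242 GiB, which is why inference is pitched
# at an 8-GPU server: 8 x A100-40G offers ~320 GiB and 8 x V100-32G ~256 GiB
# of aggregate device memory.
```

On the V100 configuration the FP16 weights fit with little headroom, so activations and caches have to stay small; the A100 configuration leaves considerably more room.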