
Wednesday, February 01, 2023

GLM-130B: An Open Bilingual Pre-Trained Language Model

Linguists building and using AI language models, commented on here in UPenn's Language Log.

GLM-130B: An Open Bilingual Pre-Trained Language Model

January 25, 2023 @ 9:10 am · Filed by Victor Mair under Artificial intelligence, Computational linguistics

Description of the General Language Model (GLM) project based at Tsinghua University in Beijing, but with users and collaborators around the world.

Homepage (August 4, 2022)

This prospectus is difficult for outsiders to understand because of its many unexplained acronyms, abbreviations, initialisms, and other insiders' terminology.

GLM-130B is an open bilingual (English & Chinese) bidirectional dense model with 130 billion parameters, pre-trained using the General Language Model (GLM) algorithm. It is designed to support inference tasks with the 130B parameters on a single A100 (40G * 8) or V100 (32G * 8) server. As of July 3rd, 2022, GLM-130B has been trained on over 400 billion text tokens (200B each for Chinese and English) and exhibits the following unique features:

Bilingual: supports both English and Chinese.

Performance (EN): better than GPT-3 175B (+5.0%), OPT-175B (+6.5%), and BLOOM-176B (+13.0%) on LAMBADA and slightly better than GPT-3 175B (+0.9%) on MMLU.  ...  (much more)
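The hardware claim in the excerpt (inference on a single 8x A100 40G or 8x V100 32G server) can be sanity-checked with simple arithmetic on weight storage. The Python sketch below is not taken from the GLM-130B repository; it is a minimal back-of-the-envelope check under stated assumptions: weights-only memory, an illustrative FP16 precision for the A100 server and INT8 for the V100 server, and no accounting for activations, KV cache, or framework overhead.

# Back-of-the-envelope sketch (not from the GLM-130B codebase): does the weight
# storage for 130B parameters fit on the server configurations quoted above?
# Assumptions for illustration only: weights-only memory, even sharding across
# GPUs, FP16 on the A100 server and INT8 on the V100 server.

def weights_per_gpu_gb(n_params: float, bytes_per_param: float, n_gpus: int) -> float:
    """Per-GPU share of the model weights in GB (10^9 bytes), assuming even sharding."""
    return n_params * bytes_per_param / n_gpus / 1e9

configs = [
    # (label, parameter count, bytes per parameter, GPU count, per-GPU memory in GB)
    ("A100 40G x 8, FP16 weights", 130e9, 2.0, 8, 40.0),
    ("V100 32G x 8, INT8 weights", 130e9, 1.0, 8, 32.0),
]

for label, params, bytes_pp, gpus, vram_gb in configs:
    share = weights_per_gpu_gb(params, bytes_pp, gpus)
    verdict = "fits" if share < vram_gb else "does not fit"
    print(f"{label}: {share:.1f} GB of weights per GPU "
          f"({verdict} in {vram_gb:.0f} GB, before activations and overhead)")

Under these assumptions, FP16 weights for 130 billion parameters come to roughly 260 GB in total, or about 32.5 GB per GPU on the eight-way A100 server; on the V100 server, FP16 weights alone would not quite fit, which is why the sketch uses a lower-precision format there.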
