Research

Lingli Chinese Large Language Model

May 29, 2024

The Lingli Chinese Large Language Model is built on TencentPreTrain, the world's first modular pre-training framework for large models, which has 100,000 downloads per month and was published at ACL 2023, a top natural language processing conference. Lingli is the first open-source Chinese large model released at the 7B, 13B, and 30B scales.

Its core technologies include: (1) Chinese pre-training, instruction fine-tuning, and question answering on LLaMA; (2) cross-lingual transfer learning through adaptive data sampling; (3) training that moves from English to parallel Chinese-English corpora; and (4) adaptive adjustment of the training ratio to address knowledge forgetting and transfer.
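
The idea behind items (2) and (4) can be illustrated with a minimal sketch. It is not Lingli's or TencentPreTrain's actual implementation: the corpus names, the loss values, the `step_size` parameter, and the multiplicative-weights update rule are all assumptions chosen for illustration. The sketch raises the sampling share of corpora whose held-out loss is higher, so a language the model handles poorly is seen more often while the others remain in the mix.

```python
import math
import random


def update_ratios(ratios, losses, step_size=0.5):
    """Shift sampling probability toward corpora with higher held-out loss.

    Hypothetical rule (an assumption, not taken from the source):
    new_ratio_i is proportional to ratio_i * exp(step_size * loss_i).
    """
    scaled = {
        name: ratios[name] * math.exp(step_size * losses[name])
        for name in ratios
    }
    total = sum(scaled.values())
    return {name: value / total for name, value in scaled.items()}


def sample_corpus(ratios, rng):
    """Pick the corpus that supplies the next training batch."""
    names = list(ratios)
    weights = [ratios[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Illustrative starting mix and made-up held-out losses (assumptions).
ratios = {"english": 0.8, "parallel": 0.1, "chinese": 0.1}
losses = {"english": 1.9, "parallel": 2.4, "chinese": 3.1}

new_ratios = update_ratios(ratios, losses)
rng = random.Random(0)
next_corpus = sample_corpus(new_ratios, rng)
```

In this sketch the Chinese share grows and the English share shrinks after one update, but English is never dropped entirely, which is one simple way to limit forgetting of what the base model already knows.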
