Lecture Title
BigCode – Training large language models in an open and responsible way
Speaker
Leandro von Werra
Talk Summary
In this presentation, Leandro will share several accomplishments of the BigCode project, an open scientific collaboration working on the responsible development and use of LLMs for code generation. These include:
* The StarCoder models: 15.5B-parameter models with an 8K context length, fill-in-the-middle, and multi-query attention.
* The Stack: 6.4 TB of permissively licensed source code with an inspection tool and an opt-out mechanism.
* Novel insights on the Chinchilla scaling laws, suggesting we haven’t reached the limit of training smaller LLMs for longer.
Speaker Bio
Leandro von Werra is a machine learning engineer in the open source and research teams at Hugging Face. He is the creator of TRL, a popular Python library that combines transformers with reinforcement learning. He also co-leads the BigCode project, which aims to develop large language models for code in an open and responsible way, with models such as StarCoder.
Time: 13.00 – 14.30
Date: Tuesday 14 November
Location: Livestreamed
The new I-X seminar series invites leaders in artificial intelligence from industry and academia to discuss the latest advancements in AI theory and applications. Bringing together researchers working on the fundamental aspects of AI and those using it in applications such as health, materials sciences, biology, and security, the seminars aim to spark collaborations around key developments in the field.