Talk Title
Large Language Models: Foundations & Frontiers

Talk Summary
This is a two-part talk on Large Language Models (LLMs), with an hour's break in between, intended for non-specialists. I intend to explain the current paradigm and its main unsolved problems, and to draw links to potential interdisciplinary research areas.

Part One
12.00 – 13.00

In the first talk I’ll introduce the core architecture of modern autoregressive Transformer-based LLMs (including unpacking what those terms mean!), covering both the formal algorithms and common intuitions for why they work. I won’t assume any knowledge of deep learning or other approaches to language modelling. My goal is that, by the end of this part, you will be able to implement a transformer-based language model yourself.
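
(For the curious, below is a minimal NumPy sketch of causal self-attention – the central operation inside an autoregressive Transformer. It is an illustration only, not material taken from the talk, and all names and shapes are assumptions made for the example.)

import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, Wq, Wk, Wv):
    # x: (seq_len, d_model); Wq, Wk, Wv: (d_model, d_head) projection matrices.
    q, k, v = x @ Wq, x @ Wk, x @ Wv            # queries, keys, values per token
    scores = q @ k.T / np.sqrt(k.shape[-1])      # scaled dot-product similarities
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)     # causal mask: no attending to future tokens
    return softmax(scores) @ v                   # attention-weighted sum of values

# Tiny usage example with random weights.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))                     # 5 tokens, d_model = 16
Wq, Wk, Wv = (rng.normal(size=(16, 8)) for _ in range(3))
out = causal_self_attention(x, Wq, Wk, Wv)       # shape (5, 8)

Stacking this operation with position-wise feed-forward layers, residual connections and a next-token prediction head gives the basic autoregressive Transformer.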

Link to join online for Part One

(Break 13.00 – 14.00)

Part Two
14.00 – 15.00

In the second talk I’ll cover an (opinionated) selection of key research topics in LLMs: scaling laws, evaluation, interpretability and alignment. Scaling laws are empirically observed regularities in model performance as the compute invested in training grows, and they underpin the success of ‘scaling up’ in language models. Evaluation is – perhaps unsurprisingly – the measurement of model performance along multiple axes. Interpretability is the study of how LLMs work: what algorithms they learn to implement internally, and how these vary between models. Finally, alignment concerns how to make models ‘do what we want’. In practice this typically means creating helpful assistants using Reinforcement Learning from Human Feedback and related methods, which I’ll describe.
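
(As an illustration of the kind of regularity a scaling law describes – a common form from the literature, not necessarily the one the talk will use – training loss $L$ is often modelled as a power law in training compute $C$:

$L(C) \approx L_{\infty} + a\,C^{-\alpha}$

where $L_{\infty}$ is an irreducible loss term and $a, \alpha > 0$ are constants fitted to experiments, so predictable improvements come from multiplicative increases in compute.)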

Link to join online for Part Two

Speaker Bio – Tom McGrath
Tom McGrath did his PhD at Imperial, supervised by Nick Jones, before moving to DeepMind as a research scientist. He has worked on LLMs, interpretability, and embodied deep reinforcement learning. His main interest is in understanding deep learning systems, especially reinforcement learning agents. Tom is particularly interested in what we can learn from very capable agents, how they learn, and whether they represent their domain in ways we understand. He is also interested in applications of interpretability to science and safety.

Time: 12.00 – 15.00
Date: Tuesday 5 December
Location: Hybrid Event | Online and in I-X Meeting Room 1, Level 5
Translation and Innovation Hub (I-HUB)
84 Wood Lane
Imperial College, White City Campus

Any questions, please contact Eileen Boyce (e.boyce@imperial.ac.uk) or Lauren Burton (lauren.burton@imperial.ac.uk).

Getting here