In the last couple of years generative AI, and Large Language Models (LLMs) especially, have captured the public's attention. We've all used ChatGPT and friends by now. But how do they operate?
In this talk I aim to provide a good mental model of what is happening in the background when you interact with an LLM. I'll give a broad overview of how LLMs work, and then dive a bit into the details of GPT-2, the precursor to the GPT models we use today. In particular, I'll discuss the transformer architecture, which is the foundational technology driving the current generative AI revolution.
This will be a broadly accessible talk, aimed at a general STEM audience.
Electromagnetism is a deeply geometric physical phenomenon, an aspect that is hidden in its standard presentation. The standard presentation is efficient and effective if you want to teach people how to build radios and motors. But it turns out to be essential to understand the underlying geometry of electromagnetism when studying its relationship with general relativity, a manifestly geometric theory.
In this three-part sequence of lectures we present an unhelpful introduction to electromagnetism: unhelpful in the sense that it is useless for building the intuition needed to construct motors, radios, and solar panels. Instead, it highlights the very simple geometric ideas and principles that give rise to electromagnetism. The talks will be accessible to people with a modest mathematics background (calculus III) who have a passing interest in physics, as well as to people who know far more physics than I do and would like to see a mathematician grapple with gauge theory.
The lectures: