@wsl8297: UC's Open Course on Reinforcement Learning for LLMs uses a 'theory + practice' approach to thoroughly explain key AI training techniques from the ground up, helping you systematically build a complete framework spanning from RL to LLM training. Comprehensive curriculum paired with complete resources: lecture slides, full videos, and practical exercises are all provided so you can start implementing right away…
Summary
Ernest K. Ryu, Assistant Professor at UCLA, offers the open course 'Reinforcement Learning for Large Language Models,' which covers key LLM training techniques such as RLHF, PPO, and DPO through a blend of theory and practice, backed by slides, videos, and exercises. The course gives developers and researchers a systematic learning path from foundational algorithms to hands-on practice.
Cached at: 05/09/26, 03:42 AM
UCLA open course: Reinforcement Learning for Large Language Models adopts a “theory + practice” approach to explain key AI training techniques from the ground up, helping you systematically build a complete framework spanning from reinforcement learning to LLM training. The course offers broad coverage and extensive supporting materials: lecture slides, complete video recordings, and hands-on exercises are all included, so you can apply what you learn right away.
Course Link: http://ernestryu.com/courses/RL-LLM.html
What you will learn:
- Core of Deep Reinforcement Learning: MDP foundations and key algorithms such as policy gradient, A3C, and PPO
- Fundamentals of LLMs: Foundations and evolution of NLP, language modeling, and RNNs
- End-to-End RLHF Breakdown: Training methodologies and implementation strategies based on human feedback
- Reinforcement Learning with Verifiable Rewards (RLVR): Safer and more robust training paradigms
- Hands-on Practice: Jupyter notebook code examples and assignments for learning by doing
Taught by an Assistant Professor in the UCLA Department of Mathematics, with full video lectures available on YouTube. The curriculum is rigorous and highly recommended for anyone looking to truly master the integration of “RL + LLM training”.
Reinforcement Learning of Large Language Models
Source: https://ernestryu.com/courses/RL-LLM.html
Lecture slides
- Chapter 0: Prologue (https://ernestryu.com/courses/RL-LLM/chapter0.pdf).
- Chapter 1: Deep Reinforcement Learning (https://ernestryu.com/courses/RL-LLM/chapter1.pdf).
- Chapter 2: Large Language Models (https://ernestryu.com/courses/RL-LLM/chapter2.pdf).
- Chapter 3: Reinforcement Learning of Large Language Models (https://ernestryu.com/courses/RL-LLM/chapter3.pdf).
Lecture videos
- Chapter 0: Prologue (https://youtu.be/q9972BRoXzQ).
- Chapter 1.1: MDP foundations, imitation learning, and value iteration (https://youtu.be/R2oT9Tcv0eU).
- Chapter 1.2: Deep policy evaluation (https://youtu.be/KwNs7AT3UcY).
- Chapter 1.3: Deep policy gradient methods (A3C) (https://youtu.be/iWOJpNr-kcI).
- Chapter 1.4: Deep policy gradient methods (PPO, GRPO) (https://youtu.be/qzaX7DBloZc).
- Chapter 1.5: AlphaGo, test-time compute, and expert iteration (https://youtu.be/8ZnVAu1tlYw).
- Chapter 2.1: NLP foundations, language modeling, RNNs (https://youtu.be/dhu_ZYUsBnw).
- Chapter 2.2: Transformers I (BERT, GPT-1) (https://youtu.be/q5Sl4bO-wBk).
- Chapter 2.3: Transformers II (modern transformer updates and sampling methods) (https://youtu.be/88HtzKoSzSE).
- Chapter 2.4: In-context learning and instruction fine-tuning (https://youtu.be/dPHrgBv4c9s).
- Chapter 3.1: Reinforcement learning from human feedback (PPO, DPO) (https://youtu.be/IijXgwZJarU).
- Chapter 3.2: Reinforcement learning with verifiable rewards (RLVR) (https://youtu.be/QKgVPTC_M1Q).
Course Information
Instructor
Ernest K. Ryu (http://www.math.snu.ac.kr/~ernestryu/), Assistant Professor of Mathematics, UCLA
Prerequisites
Students are expected to have basic familiarity with deep learning at the level of image classification. No prior experience with reinforcement learning (RL) or large language models (LLMs) is assumed. For the deep RL lectures, students should be familiar with conditional expectations and the tower property (law of total expectation).
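For readers checking the probability prerequisite, the tower property can be stated as below. This is a standard identity and is not quoted from the course materials. The second line is one illustration of how it tends to appear in RL; the symbols G_0 (return from the initial state), s_0 (initial state), and V^π (value function of policy π) are notation assumed here, not taken from the source.

```latex
% Tower property (law of total expectation), for an integrable random variable X:
\mathbb{E}[X] = \mathbb{E}\big[\,\mathbb{E}[X \mid Y]\,\big]

% Illustrative RL use (assumed notation): the expected return is the
% expected value of the initial state.
\mathbb{E}[G_0]
  = \mathbb{E}_{s_0}\big[\,\mathbb{E}[G_0 \mid s_0]\,\big]
  = \mathbb{E}_{s_0}\big[\,V^{\pi}(s_0)\,\big],
\qquad V^{\pi}(s) := \mathbb{E}[G_0 \mid s_0 = s]
```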