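@li9292: How do you get into OpenAI? Just master the following: 1. Stanford's "Language Modeling from Scratch" course: http://cs336.stanford.edu/spring2025/ 2. After building breadth, she went deep on concepts one at a time, using blogs, papers, chats with ChatGP…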
Summary
The tweet recommends Stanford's CS336 course and a set of learning resources as a preparation path for joining OpenAI.
Cached: 2026/06/23 06:01
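How do you get into OpenAI? Just master the following:
- Stanford's "Language Modeling from Scratch" course: http://cs336.stanford.edu/spring2025/
- After building breadth, she went deep on concepts one at a time, using blogs, papers, chats with ChatGPT and Claude, and from-scratch implementations.
- Questions about implementing and debugging a Transformer come up often in interviews. Turn it into muscle memory: http://github.com/stanford-cs336…
- Grind Leetcode: http://leetcode.com/studyplan/leet…
- Other learning resources she shared:
  a. Self-attention & Transformers: http://web.stanford.edu/class/cs224n/r…
  b. The Illustrated GPT-2: http://jalammar.github.io/illustrated-gp…
  c. Backpropagation: http://cs231n.github.io/optimization-2/
  d. A primer on policy gradients for language models: http://ivison.id.au/2026/02/09/pol…
  e. A lightweight guide to understanding GRPO and reinforcement learning principles: http://gitlostmurali.com/blog/grpo-intr…
  f. How to Scale Your Model: http://jax-ml.github.io/scaling-book/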
Is this something I can read for free?
Stanford CS336 | Language Modeling from Scratch (Spring 2025 Archive)
Source: https://cs336.stanford.edu/spring2025/
This is the archived website for the Spring 2025 offering of CS336. The latest offering is here.
Content
What is this course about?
Language models serve as the cornerstone of modern natural language processing (NLP) applications and open up a new paradigm of having a single general purpose system address a range of downstream tasks. As the field of artificial intelligence (AI), machine learning (ML), and NLP continues to grow, possessing a deep understanding of language models becomes essential for scientists and engineers alike. This course is designed to provide students with a comprehensive understanding of language models by walking them through the entire process of developing their own. Drawing inspiration from operating systems courses that create an entire operating system from scratch, we will lead students through every aspect of language model creation, including data collection and cleaning for pre-training, transformer model construction, model training, and evaluation before deployment.
Prerequisites
- Proficiency in Python: The majority of class assignments will be in Python. Unlike most other AI classes, students will be given minimal scaffolding. The amount of code you will write will be at least an order of magnitude greater than for other classes. Therefore, being proficient in Python and software engineering is paramount.
- Experience with deep learning and systems optimization: A significant part of the course will involve making neural language models run quickly and efficiently on GPUs across multiple machines. We expect students to have a strong familiarity with PyTorch and know basic systems concepts like the memory hierarchy.
- College Calculus, Linear Algebra (e.g. MATH 51, CME 100): You should be comfortable understanding matrix/vector notation and operations.
- Basic Probability and Statistics (e.g. CS 109 or equivalent): You should know the basics of probabilities, Gaussian distributions, mean, standard deviation, etc.
- Machine Learning (e.g. CS221, CS229, CS230, CS124, CS224N): You should be comfortable with the basics of machine learning and deep learning.
Note that this is a 5-unit class. This is a very implementation-heavy class, so please allocate enough time for it.
Coursework
Assignments
- Assignment 1: Basics [leaderboard]
  - Implement all of the components (tokenizer, model architecture, optimizer) necessary to train a standard Transformer language model.
  - Train a minimal language model.
- Assignment 2: Systems [leaderboard]
  - Profile and benchmark the model and layers from Assignment 1 using advanced tools, optimize Attention with your own Triton implementation of FlashAttention2.
  - Build a memory-efficient, distributed version of the Assignment 1 model training code.
- Assignment 3: Scaling
  - Understand the function of each component of the Transformer.
  - Query a training API to fit a scaling law to project model scaling.
- Assignment 4: Data [leaderboard]
  - Convert raw Common Crawl dumps into usable pretraining data.
  - Perform filtering and deduplication to improve model performance.
- Assignment 5: Alignment and Reasoning RL
  - Apply supervised finetuning and reinforcement learning to train LMs to reason when solving math problems.
  - Optional Part 2: implement and apply safety alignment methods such as DPO.
All (currently tentative) deadlines are listed in the schedule.
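As a rough illustration of what "fitting a scaling law" in Assignment 3 can mean, the sketch below fits a power law `loss = a * compute**b` by ordinary least squares in log-log space and extrapolates it to a larger budget. This is only one common approach and is not taken from the course handout. The function name `fit_power_law`, the data points, and the constants 20 and -0.05 are all hypothetical; in the assignment the points would come from querying the training API rather than from a formula.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b, done as a line fit in log-log space."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    # Slope of the best-fit line log(y) = log(a) + b * log(x)
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Hypothetical (compute, loss) pairs, generated from loss = 20 * compute**-0.05
# purely for illustration; real points would be measured training runs.
compute = [1e15, 1e16, 1e17, 1e18]
loss = [20.0 * c ** -0.05 for c in compute]

a, b = fit_power_law(compute, loss)

# Extrapolate the fitted curve to a budget larger than any observed point.
predicted = a * (1e19) ** b
print(f"a={a:.3f}, b={b:.4f}, predicted loss at 1e19 FLOPs={predicted:.3f}")
```

Because the synthetic points lie exactly on a power law, the fit recovers the generating constants; with real, noisy runs the residuals indicate how far the extrapolation can be trusted.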
GPU compute for self-study
If you are following along at home, you can access GPU compute from a cloud provider to complete the assignments. Here are a few options (prices for a single H100 80GB GPU on June 6, 2025):
- RunPod: $1.99-$2.99/hour (RunPod Pricing)
- Lambda Labs: $2.49-$3.29/hour (Lambda Labs Pricing)
- Paperspace: $2.24/hour (Paperspace Pricing)
- Together: $2.85/hour, minimum 8 GPUs (Together Instant GPU Cluster Pricing)
For convenience and to save money, we recommend debugging correctness of your implementation on CPU first and then using GPU(s) (with the count recommended in the assignments) for completing training runs (A1, A4, A5) or benchmarking GPU operations (A2).
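To get a feel for what a training run might cost at the listed prices, here is a small back-of-the-envelope sketch. The hourly prices are the June 6, 2025 figures quoted above. The helper `estimate_cost`, the 10-hour run length, and the reading of "minimum 8 GPUs" as "billed for at least 8 GPUs" are assumptions made for illustration, not provider policy.

```python
# Hourly USD price for one H100 80GB, stored as (low, high).
# Providers with a single listed price repeat the same value.
PRICES = {
    "RunPod": (1.99, 2.99),
    "Lambda Labs": (2.49, 3.29),
    "Paperspace": (2.24, 2.24),
    "Together": (2.85, 2.85),
}

# Assumption: a provider minimum means you pay for at least this many GPUs.
MIN_GPUS = {"Together": 8}

def estimate_cost(provider, hours, gpus=1):
    """Return the (low, high) USD cost of running `gpus` GPUs for `hours` hours."""
    low, high = PRICES[provider]
    billed_gpus = max(gpus, MIN_GPUS.get(provider, 1))
    return (low * hours * billed_gpus, high * hours * billed_gpus)

# Hypothetical 10-hour single-GPU run on each provider.
for name in PRICES:
    low, high = estimate_cost(name, hours=10)
    print(f"{name}: ${low:.2f} to ${high:.2f}")
```

Under these assumptions the 8-GPU minimum dominates the comparison for small runs, which is one more reason to debug on CPU before renting GPUs.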
Honor code
Like all other classes at Stanford, we take the student Honor Code seriously. Please respect the following policies:
- Collaboration: Study groups are allowed, but students must understand and complete their own assignments, and hand in one assignment per student. If you worked in a group, please put the names of the members of your study group at the top of your assignment. Please ask if you have any questions about the collaboration policy.
- AI tools: Prompting LLMs such as ChatGPT is permitted for low-level programming questions or high-level conceptual questions about language models, but using it directly to solve the problem is prohibited. We strongly encourage you to disable AI autocomplete (e.g., Cursor Tab, GitHub CoPilot) in your IDE when completing assignments (though non-AI autocomplete, e.g., autocompleting function names is totally fine). We have found that AI autocomplete makes it much harder to engage deeply with the content.
- Existing code: Implementations for many of the things you will implement exist online. The handouts we’ll give will be self-contained, so that you will not need to consult third-party code for producing your own implementation. Thus, you should not look at any existing code unless otherwise specified in the handouts.
Submitting coursework
- All coursework is submitted via Gradescope by the deadline. Do not submit your coursework via email.
- If anything goes wrong, please ask a question in Slack or contact a course assistant.
- You can submit as many times as you’d like until the deadline: we will only grade the last submission.
- Partial work is better than not submitting any work.
Late days
- Each student has 6 late days to use. A late day extends the deadline by 24 hours.
- You can use up to 3 late days per assignment.
Regrade requests
If you believe that the course staff made an objective error in grading, you may submit a regrade request on Gradescope within 3 days after the grades are released.
Sponsor
We would like to thank Together AI for sponsoring the compute for this class.
Alisa Liu (@alisawuffles): I’m joining OpenAI next week!🥹 The job search turned out to be really challenging but also super rewarding, so I wrote a small blog to share what I learned along the way and hopefully make the process a little less mysterious for the next person.