@wsl8297: Discovered a deep learning paper reading project on GitHub: paper-reading. The author, Mu Li (known by the nickname "Mu Shen"), reads classic and new deep learning papers paragraph by paragraph and records video explanations; the project has been updated for over 3 years. GitHub: https://github.com/mli/paper-reading...

X AI KOLs Timeline Tools

Summary

Mu Li's deep learning paper reading project on GitHub includes in-depth reading videos of major papers such as GPT-4, Llama 3.1, and Sora. Each video runs about an hour, making the series suitable for AI researchers and developers who want a deep understanding of classic papers.


Cached at: 05/15/26, 10:59 AM

Deep Learning Paper Reading Project: paper-reading

Discovered a deep learning paper reading project on GitHub: paper-reading. The author, Mu Li (known by the nickname "Mu Shen"), does paragraph-by-paragraph deep readings of classic and new deep learning papers, recording video explanations; the project has been updated for over 3 years. GitHub: https://github.com/mli/paper-reading… The project includes in-depth reading videos for major papers such as GPT-4, Llama 3.1, Sora, DALL·E 2, InstructGPT, Whisper, and Chain of Thought. Each video is about an hour of deep explanation, breaking the paper down paragraph by paragraph. Videos are published simultaneously on Bilibili and YouTube, and the project also includes series such as multimodal paper overviews and CLIP improvement-work overviews. Beyond paper reading, the author shares research ideas for the era of large models, research methodology, and more. Suitable for AI researchers and developers who want to understand classic papers deeply and keep up with cutting-edge progress.

mli/paper-reading

Source: https://github.com/mli/paper-reading

Deep Learning Paper Reading

Recorded Papers

Date | Title | Duration | Video
1/10/25 | OpenAI Sora (https://openai.com/index/video-generation-models-as-world-simulators/) Part 1 (including Movie Gen and HunyuanVideo) | 1:04:18 | bilibili (https://www.bilibili.com/video/BV1VdcxesEAt/?share_source=copy_web&vd_source=5d037e935914fc22e2e978cdccf5cdfe)
9/04/24 | Llama 3.1 Paper Reading · 5. Model Training Process | 10:41 | bilibili (https://www.bilibili.com/video/BV1c8HbeaEXi)
8/28/24 | Llama 3.1 Paper Reading · 4. Training Infrastructure | 25:04 | bilibili (https://www.bilibili.com/video/BV1b4421f7fa)
8/13/24 | Llama 3.1 Paper Reading · 3. Model | 26:14 | bilibili (https://www.bilibili.com/video/BV1Q4421Z7Tj)
8/05/24 | Llama 3.1 Paper Reading · 2. Pre-training Data (https://arxiv.org/pdf/2407.21783) | 23:37 | bilibili (https://www.bilibili.com/video/BV1u142187S5)
7/31/24 | Llama 3.1 Paper Reading · 1. Introduction | 18:53 | bilibili (https://www.bilibili.com/video/BV1WM4m1y7Uh)
3/30/23 | GPT-4 (https://openai.com/research/gpt-4) | 1:20:38 | bilibili (https://www.bilibili.com/video/BV1vM4y1U7b5)
3/23/23 | Four Research Ideas in the Era of Large Models | 1:06:29 | bilibili (https://www.bilibili.com/video/BV1oX4y1d7X6)
3/10/23 | Anthropic LLM (https://arxiv.org/pdf/2204.05862.pdf) | 1:01:51 | bilibili (https://www.bilibili.com/video/BV1XY411B7nM)
1/20/23 | HELM (https://arxiv.org/pdf/2211.09110.pdf) Comprehensive Language Model Evaluation | 1:23:37 | bilibili (https://www.bilibili.com/video/BV1z24y1B7uX)
1/11/23 | Multimodal Paper Overview · Part 2 | 1:03:29 | bilibili (https://www.bilibili.com/video/BV1fA411Z772)
12/29/22 | InstructGPT (https://arxiv.org/pdf/2203.02155.pdf) | 1:07:10 | bilibili (https://www.bilibili.com/video/BV1hd4y187CR)
12/19/22 | Neural Corpus Indexer (https://arxiv.org/pdf/2206.02743.pdf) Document Retrieval | 55:47 | bilibili (https://www.bilibili.com/video/BV1Se411w7Sn)
12/12/22 | Multimodal Paper Overview · Part 1 | 1:12:27 | bilibili (https://www.bilibili.com/video/BV1Vd4y1v77v)
11/14/22 | OpenAI Whisper (https://cdn.openai.com/papers/whisper.pdf) In-depth Reading | 1:12:16 | bilibili (https://www.bilibili.com/video/BV1VG4y1t74x)
11/07/22 | Before Talking About OpenAI Whisper, I Made a Little Video Editing Tool | 23:39 | bilibili (https://www.bilibili.com/video/BV1Pe4y1t7de)
10/23/22 | Chain of Thought (https://arxiv.org/pdf/2201.11903.pdf) Paper, Code, and Resources | 33:21 | bilibili (https://www.bilibili.com/video/BV1t8411e7Ug)
9/17/22 | CLIP Improvement Work Overview (Part 2) | 1:04:26 | bilibili (https://www.bilibili.com/video/BV1gg411U7n4)
9/2/22 | CLIP Improvement Work Overview (Part 1) | 1:14:43 | bilibili (https://www.bilibili.com/video/BV1FV4y1p7Lm)
7/29/22 | ViLT (https://arxiv.org/pdf/2102.03334.pdf) Paper In-depth Reading | 1:03:26 | bilibili (https://www.bilibili.com/video/BV14r4y1j74y)
7/22/22 | Reasons, Evidence, and Warrants [The Craft of Research (https://press.uchicago.edu/ucp/books/book/chicago/C/bo23521678.html) · 4] | 44:14 | bilibili (https://www.bilibili.com/video/BV1SB4y1a75c)
7/15/22 | How to Tell a Good Story, Arguments in a Story [The Craft of Research (https://press.uchicago.edu/ucp/books/book/chicago/C/bo23521678.html) · 3] | 43:56 | bilibili (https://www.bilibili.com/video/BV1WB4y1v7ST)
7/8/22 | DALL·E 2 (https://arxiv.org/pdf/2204.06125.pdf) Paragraph-by-Paragraph Reading | 1:27:54 | bilibili (https://www.bilibili.com/video/BV17r4y1u77B)
7/1/22 | Understanding the Importance of the Problem [The Craft of Research (https://press.uchicago.edu/ucp/books/book/chicago/C/bo23521678.html) · 2] | 1:03:40 | bilibili (https://www.bilibili.com/video/BV11S4y1v7S2/)
6/24/22 | Connecting with Readers [The Craft of Research (https://press.uchicago.edu/ucp/books/book/chicago/C/bo23521678.html) · 1] | 45:01 | bilibili (https://www.bilibili.com/video/BV1hY411T7vy/)
6/17/22 | ZeRO (https://arxiv.org/pdf/1910.02054.pdf) Paragraph-by-Paragraph Reading | 52:21 | bilibili (https://www.bilibili.com/video/BV1tY411g7ZT/)
6/10/22 | DETR (https://arxiv.org/pdf/2005.12872.pdf) Paragraph-by-Paragraph Reading | 54:22 | bilibili (https://www.bilibili.com/video/BV1GB4y1X72R/)
6/3/22 | Megatron-LM (https://arxiv.org/pdf/1909.08053.pdf) Paragraph-by-Paragraph Reading | 56:07 | bilibili (https://www.bilibili.com/video/BV1nB4y1R7Yz/)
5/27/22 | GPipe (https://proceedings.neurips.cc/paper/2019/file/093f65e080a295f8076b1c5722a46aa2-Paper.pdf) Paragraph-by-Paragraph Reading | 58:47 | bilibili (https://www.bilibili.com/video/BV1v34y1E7zu/)
5/5/22 | Pathways (https://arxiv.org/pdf/2203.12533.pdf) Paragraph-by-Paragraph Reading | 1:02:13 | bilibili (https://www.bilibili.com/video/BV1xB4y1m7Xi/)
4/28/22 | Video Understanding Paper Overview (https://arxiv.org/pdf/2012.06567.pdf) (Part 2) | 1:08:32 | bilibili (https://www.bilibili.com/video/BV11Y411P7ep/)
4/21/22 | Parameter Server (https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-li_mu.pdf) Paragraph-by-Paragraph Reading | 1:37:40 | bilibili (https://www.bilibili.com/video/BV1YA4y197G8/)
4/14/22 | Video Understanding Paper Overview (https://arxiv.org/pdf/2012.06567.pdf) (Part 1) | 51:15 | bilibili (https://www.bilibili.com/video/BV1fL4y157yA/)
3/31/22 | I3D (https://arxiv.org/pdf/1705.07750.pdf) Paper In-depth Reading | 52:31 | bilibili (https://www.bilibili.com/video/BV1tY4y1p7hq/)
3/24/22 | Stanford 2022 AI Index Report (https://aiindex.stanford.edu/wp-content/uploads/2022/03/2022-AI-Index-Report_Master.pdf) In-depth Reading | 1:19:56 | bilibili (https://www.bilibili.com/video/BV1s44y1N7eu/)
3/17/22 | AlphaCode (https://storage.googleapis.com/deepmind-media/AlphaCode/competition_level_code_generation_with_alphacode.pdf) Paper In-depth Reading | 44:00 | bilibili (https://www.bilibili.com/video/BV1ab4y1s7rc/)
3/10/22 | OpenAI Codex (https://arxiv.org/pdf/2107.03374.pdf) Paper In-depth Reading | 47:58 | bilibili (https://www.bilibili.com/video/BV1iY41137Zi/) zhihu (https://www.zhihu.com/zvideo/1490959755963666432)
3/3/22 | GPT (https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf), GPT-2 (https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf), GPT-3 (https://arxiv.org/abs/2005.14165) In-depth Reading | 1:29:58 | bilibili (https://www.bilibili.com/video/BV1AF411b7xQ/)
2/24/22 | Two-Stream (https://proceedings.neurips.cc/paper/2014/file/00ec53c4682d36f5c4359f4ae7bd7ba1-Paper.pdf) Paragraph-by-Paragraph Reading | 52:57 | bilibili (https://www.bilibili.com/video/BV1mq4y1x7RU/)
2/10/22 | CLIP (https://openai.com/blog/clip/) Paragraph-by-Paragraph Reading | 1:38:25 | bilibili (https://www.bilibili.com/video/BV1SL4y1s7LQ/) zhihu (https://www.zhihu.com/zvideo/1475706654562299904)
2/6/22 | Have You Ever Complained (or Been Told) That a Paper Isn't Novel (https://perceiving-systems.blog/en/post/novelty-in-science) Enough? | 14:11 | bilibili (https://www.bilibili.com/video/BV1ea41127Bq/) zhihu (https://www.zhihu.com/zvideo/1475719090198876161)
1/23/22 | AlphaFold 2 (https://www.nature.com/articles/s41586-021-03819-2.pdf) In-depth Reading | 1:15:28 | bilibili (https://www.bilibili.com/video/BV1oR4y1K7Xr/) zhihu (https://www.zhihu.com/zvideo/1469132410537717760)
1/18/22 | How to Judge the Value of (Your Own) Research Work | 9:59 | bilibili (https://www.bilibili.com/video/BV1oL411c7Us/) zhihu (https://www.zhihu.com/zvideo/1475716940051869696)
1/15/22 | Swin Transformer (https://arxiv.org/pdf/2103.14030.pdf) In-depth Reading | 1:00:21 | bilibili (https://www.bilibili.com/video/BV13L4y1475U/) zhihu (https://www.zhihu.com/zvideo/1466282983652691968)
1/7/22 | Guiding Mathematical Intuition (https://www.nature.com/articles/s41586-021-04086-x.pdf) | 52:51 | bilibili (https://www.bilibili.com/video/BV1YZ4y1S72j/) zhihu (https://www.zhihu.com/zvideo/1464060386375299072)
1/5/22 | AlphaFold 2 Preview | 03:28 | bilibili (https://www.bilibili.com/video/BV1Eu411U7Te/)
12/20/21 | Contrastive Learning Paper Survey | 1:32:01 | bilibili (https://www.bilibili.com/video/BV19S4y1M7hm/) zhihu (https://www.zhihu.com/zvideo/1460828005077164032)
12/15/21 | MoCo (https://arxiv.org/pdf/1911.05722.pdf) Paragraph-by-Paragraph Reading | 1:24:11 | bilibili (https://www.bilibili.com/video/BV1C3411s7t9/) zhihu (https://www.zhihu.com/zvideo/1454723120678936576)
12/9/21 | How to Find Research Ideas | 15:34 | bilibili (https://www.bilibili.com/video/BV1qq4y1z7F2/)
12/8/21 | MAE (https://arxiv.org/pdf/2111.06377.pdf) Paragraph-by-Paragraph Reading | 47:04 | bilibili (https://www.bilibili.com/video/BV1sq4y1q77t/) zhihu (https://www.zhihu.com/zvideo/1452458167968251904)
11/29/21 | ViT (https://arxiv.org/pdf/2010.11929.pdf) Paragraph-by-Paragraph Reading | 1:11:30 | bilibili (https://www.bilibili.com/video/BV15P4y137jb/) zhihu (https://www.zhihu.com/zvideo/1449195245754380288)
11/18/21 | BERT (https://arxiv.org/abs/1810.04805) Paragraph-by-Paragraph Reading | 45:49 | bilibili (https://www.bilibili.com/video/BV1PL411M7eQ/) zhihu (https://www.zhihu.com/zvideo/1445340200976785408)
11/9/21 | GAN (https://papers.nips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf) Paragraph-by-Paragraph Reading | 46:16 | bilibili (https://www.bilibili.com/video/BV1rb4y187vD/) zhihu (https://www.zhihu.com/zvideo/1442091389241159681)
11/3/21 | A Beginner-Friendly, Figure-Rich Explanation of Graph Neural Networks (https://distill.pub/2021/gnn-intro/) (GNN/GCN) | 1:06:19 | bilibili (https://www.bilibili.com/video/BV1iT4y1d7zP/) zhihu (https://www.zhihu.com/zvideo/1439540657619087360)
10/27/21 | Transformer (https://arxiv.org/abs/1706.03762) Paragraph-by-Paragraph Reading (references mentioned in the video: 1) | 1:27:05 | bilibili (https://www.bilibili.com/video/BV1pu411o7BE/) zhihu (https://www.zhihu.com/zvideo/1437034536677404672)
10/22/21 | ResNet (https://arxiv.org/abs/1512.03385) Paragraph-by-Paragraph Reading | 53:46 | bilibili (https://www.bilibili.com/video/BV1P3411y7nn/) zhihu (https://www.zhihu.com/zvideo/1434795406001180672)
10/21/21 | ResNet (https://arxiv.org/abs/1512.03385): The Backbone of Computer Vision | 11:50 | bilibili (https://www.bilibili.com/video/BV1Fb4y1h73E/) zhihu (https://www.zhihu.com/zvideo/1434787226101751808)
10/15/21 | AlexNet (https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf) Paragraph-by-Paragraph Reading | 55:21 | bilibili (https://www.bilibili.com/video/BV1hq4y157t1/) zhihu (https://www.zhihu.com/zvideo/1432354207483871232)
10/14/21 | Rereading a Foundational Work of Deep Learning 9 Years Later: AlexNet (https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf) | 19:59 | bilibili (https://www.bilibili.com/video/BV1ih411J7Kz/) zhihu (https://www.zhihu.com/zvideo/1432155856322920448)
10/06/21 | How to Read a Paper | 06:39 | bilibili (https://www.bilibili.com/video/BV1H44y1t75x/) zhihu (https://www.zhihu.com/zvideo/1428973951632969728)

Footnotes:
1. Stanford 200+ page survey with 100+ authors (https://arxiv.org/abs/2108.07258)
2. New research on LayerNorm (https://arxiv.org/pdf/1911.07013.pdf)
3. Research on the role of Attention in Transformers (https://arxiv.org/abs/2103.03404)

All Papers

Includes papers already recorded and those to be covered later. The selection principle: influential deep learning papers from the last 10 years (must-reads), or interesting recent papers. Of course, there are too many important works from the past decade to cover one by one; when selecting, I lean toward papers not covered in my previous live courses (https://c.d2l.ai/zh-v2/).

Feel free to provide suggestions (requests) in the discussion area (https://github.com/mli/paper-reading/discussions).

Total papers: 67; recorded: 32. (Citation counts come from Semantic Scholar, because it provides an API (https://api.semanticscholar.org/api-docs/graph#operation/get_graph_get_paper) that lets the counts be fetched automatically rather than updated by hand.)
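As a side note on what such automatic fetching looks like, here is a minimal sketch against the Semantic Scholar Graph API paper endpoint linked above. The helper names `citation_url` and `parse_citation_count` are illustrative, not part of the paper-reading project:

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "https://api.semanticscholar.org/graph/v1/paper/"

def citation_url(paper_id: str, fields=("title", "citationCount")) -> str:
    # Build a Graph API request URL; `paper_id` may be an arXiv id
    # ("arXiv:1512.03385"), a DOI, or a Semantic Scholar paper hash.
    return f"{API_BASE}{quote(paper_id)}?fields={','.join(fields)}"

def parse_citation_count(body: str) -> int:
    # Pull the citation count out of the JSON response body.
    return int(json.loads(body)["citationCount"])

# Live lookup (network required; counts change over time):
# with urlopen(citation_url("arXiv:1512.03385")) as resp:
#     print(parse_citation_count(resp.read().decode("utf-8")))
```

Because the count comes back as plain JSON, a periodic job can rewrite the citation column of the tables below without any manual editing.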

Computer Vision - CNN

Year | Name | Description | Citations
2012 | AlexNet (https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf) | Foundational work of the deep learning boom | citation (https://www.semanticscholar.org/paper/ImageNet-classification-with-deep-convolutional-Krizhevsky-Sutskever/abd1c342495432171beb7ca8fd9551ef13cbd0ff)
2014 | VGG (https://arxiv.org/pdf/1409.1556.pdf) | Deeper networks using 3x3 convolutions | citation (https://www.semanticscholar.org/paper/Very-Deep-Convolutional-Networks-for-Large-Scale-Simonyan-Zisserman/eb42cf88027de515750f230b23b1a057dc782108)
2014 | GoogleNet (https://arxiv.org/pdf/1409.4842.pdf) | Deeper networks using parallel architectures | citation (https://www.semanticscholar.org/paper/Going-deeper-with-convolutions-Szegedy-Liu/e15cf50aa89fee8535703b9f9512fca5bfc43327)
2015 | ResNet (https://arxiv.org/pdf/1512.03385.pdf) | Residual connections essential for deep networks | citation (https://www.semanticscholar.org/paper/Deep-Residual-Learning-for-Image-Recognition-He-Zhang/2c03df8b48bf3fa39054345bafabfeff15bfd11d)
2017 | MobileNet (https://arxiv.org/pdf/1704.04861.pdf) | Small CNN suitable for mobile devices | citation (https://www.semanticscholar.org/paper/MobileNets%3A-Efficient-Convolutional-Neural-Networks-Howard-Zhu/3647d6d0f151dc05626449ee09cc7bce55be497e)
2019 | EfficientNet (https://arxiv.org/pdf/1905.11946.pdf) | CNN obtained through architecture search | citation (https://www.semanticscholar.org/paper/EfficientNet%3A-Rethinking-Model-Scaling-for-Neural-Tan-Le/4f2eda8077dc7a69bb2b4e0a1a086cf054adb3f9)
2021 | Non-deep networks (https://arxiv.org/pdf/2110.07641.pdf) | Achieving SOTA on ImageNet with shallow networks | citation (https://www.semanticscholar.org/paper/Non-deep-Networks-Goyal-Bochkovskiy/0d7f6086772079bc3e243b7b375a9ca1a517ba8b)

Computer Vision - Transformer

Year | Name | Description | Citations
2020 | ViT (https://arxiv.org/pdf/2010.11929.pdf) | Transformer enters CV | citation (https://www.semanticscholar.org/paper/An-Image-is-Worth-16x16-Words%3A-Transformers-for-at-Dosovitskiy-Beyer/7b15fa1b8d413fbe14ef7a97f651f47f5aff3903)
2021 | Swin Transformer (https://arxiv.org/pdf/2103.14030.pdf) | Hierarchical Vision Transformer | citation (https://www.semanticscholar.org/paper/Swin-Transformer%3A-Hierarchical-Vision-Transformer-Liu-Lin/c8b25fab5608c3e033d34b4483ec47e68ba109b7)
2021 | MLP-Mixer (https://arxiv.org/pdf/2105.01601.pdf) | Replacing self-attention with MLPs | citation (https://www.semanticscholar.org/paper/MLP-Mixer%3A-An-all-MLP-Architecture-for-Vision-Tolstikhin-Houlsby/2def61f556f9a5576ace08911496b7c7e4f970a4)
2021 | MAE (https://arxiv.org/pdf/2111.06377.pdf) | BERT version for CV | citation (https://www.semanticscholar.org/paper/Masked-Autoencoders-Are-Scalable-Vision-Learners-He-Chen/c1962a8cf364595ed2838a097e9aa7cd159d3118)

Generative Models

Year | Name | Description | Citations
2014 | GAN (https://papers.nips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf) | Pioneering work in generative models | citation (https://www.semanticscholar.org/pap

Similar Articles

@QingQ77: 'Dive into Deep Learning' is an excellent introductory book, but its update speed struggles to keep pace with the field's development. Since the Transformer, content like CLIP, Diffusion, vLLM, and more has proliferated. While online resources are abundant, they are highly fragmented—today you study Attention, tomorrow LoRA, the day after...


This project is a systematic deep learning notes repository covering PyTorch, Transformers, generative models, and more. It aims to address the fragmentation of learning materials and provides code implementations along with practical guides.

@VincentLogic: This video is essentially a 'must-watch' checklist for AI engineers! It clearly explains the 10 core papers that have shaped today's AI industry, ranging from the foundational Transformer architecture to LoRA fine-tuning, RAG, Agents, and even the latest MCP protocol. If you want to dive deeper into how…


This article recommends a video that systematically explains the 10 core papers shaping today's AI industry, covering Transformer, LoRA, RAG, Agents, and the MCP protocol, aiming to help engineers clarify the technological lineage.

@vista8: Last night I casually tested Knowly, developed by the @Ethan_Yang_AI team. I tried interpreting YouTube videos and arXiv papers, and the results were stunning, aside from a rather limited free quota and slightly slow vector processing. In both product interaction and interpretation quality it is no less impressive than NotebookLM. Its Chrome extension has only a few users yet has already been selected by Google as a featured pick, which speaks to its strength. Official site in comments https://t.co/62NkT3pO4G


Introduces Knowly, an AI tool that interprets YouTube videos and arXiv papers with impressive results. Its interaction and interpretation quality rival NotebookLM, and its Chrome extension has already been featured by Google. Drawbacks: a limited free quota and slightly slow vector processing.

@wsl8297: When learning AI, the scariest part is getting stuck at "understanding the theory" and freezing when it's time to write code — not knowing where to start, and unable to find decent practice projects. I unearthed a practical treasure trove on GitHub: AI-Project-Gallery. It collects 30+ high-quality AI projects, covering classic topics like house price prediction and disease classification, as well as hot applications like Gemini chatbot and document generator...


This post shares a curated GitHub repository containing over 30 practical AI projects, covering domains from regression to generative AI, with many end-to-end examples, suitable for learners and developers.

@nuannuan_share: If I wanted to land a $200K AI engineer job in 90 days, I wouldn't go back to school. I'd master these 10 GitHub repositories. 1. awesome-llm-apps — A production-grade AI guide covering RAG, agents, and multimodal apps with full code. 106K+ stars. Repo …


A Chinese social media post recommends 10 GitHub repositories, claiming that mastering them can help land a $200K AI engineer job within 90 days. The repos cover mainstream AI development frameworks and tools including LangChain, LangGraph, CrewAI, Ollama, and Qdrant.