@wsl8297: Discovered a deep learning paper reading project on GitHub: paper-reading. The author, Mu Shen, reads classic and new deep learning papers paragraph by paragraph and records video explanations; the project has been updated for over three years. GitHub: https://github.com/mli/paper-reading...
Summary
Mu Shen's deep learning paper reading project on GitHub includes in-depth reading videos of major papers such as GPT-4, Llama 3.1, and Sora. Each video runs about an hour and suits AI researchers and developers who want to understand classic papers in depth.
Deep Learning Paper Reading Project: paper-reading
Discovered a deep learning paper reading project on GitHub: paper-reading. The author, Mu Shen, does paragraph-by-paragraph deep readings of classic and new deep learning papers, recording video explanations that have been updated for over three years. GitHub: https://github.com/mli/paper-reading… The project includes in-depth reading videos for major papers like GPT-4, Llama 3.1, Sora, DALL·E 2, InstructGPT, Whisper, and Chain of Thought. Each video is about an hour of deep explanation that breaks the paper down paragraph by paragraph. Videos are posted simultaneously to Bilibili and YouTube, and the project includes series such as a multimodal paper overview and a CLIP improvement work overview. Beyond paper readings, the author also shares research ideas for the era of large models, research methodology, and more. Suitable for AI researchers and developers who want to understand classic papers deeply and keep up with cutting-edge progress.
mli/paper-reading
Source: https://github.com/mli/paper-reading
Deep Learning Paper Reading
Recorded Papers
| Date | Title | Duration | Video |
|---|---|---|---|
| 1/10/25 | OpenAI Sora (https://openai.com/index/video-generation-models-as-world-simulators/) Part 1 (including Movie Gen and HunyuanVideo) | 1:04:18 | bilibili (https://www.bilibili.com/video/BV1VdcxesEAt/?share_source=copy_web&vd_source=5d037e935914fc22e2e978cdccf5cdfe) |
| 9/4/24 | Llama 3.1 Paper Reading · 5. Model Training Process | 10:41 | bilibili (https://www.bilibili.com/video/BV1c8HbeaEXi) |
| 8/28/24 | Llama 3.1 Paper Reading · 4. Training Infrastructure | 25:04 | bilibili (https://www.bilibili.com/video/BV1b4421f7fa) |
| 8/13/24 | Llama 3.1 Paper Reading · 3. Model | 26:14 | bilibili (https://www.bilibili.com/video/BV1Q4421Z7Tj) |
| 8/5/24 | Llama 3.1 Paper Reading · 2. Pre-training Data (https://arxiv.org/pdf/2407.21783) | 23:37 | bilibili (https://www.bilibili.com/video/BV1u142187S5) |
| 7/31/24 | Llama 3.1 Paper Reading · 1. Introduction | 18:53 | bilibili (https://www.bilibili.com/video/BV1WM4m1y7Uh) |
| 3/30/23 | GPT-4 (https://openai.com/research/gpt-4) | 1:20:38 | bilibili (https://www.bilibili.com/video/BV1vM4y1U7b5) |
| 3/23/23 | Four Research Ideas in the Era of Large Models | 1:06:29 | bilibili (https://www.bilibili.com/video/BV1oX4y1d7X6) |
| 3/10/23 | Anthropic LLM (https://arxiv.org/pdf/2204.05862.pdf) | 1:01:51 | bilibili (https://www.bilibili.com/video/BV1XY411B7nM) |
| 1/20/23 | HELM (https://arxiv.org/pdf/2211.09110.pdf): Comprehensive Language Model Evaluation | 1:23:37 | bilibili (https://www.bilibili.com/video/BV1z24y1B7uX) |
| 1/11/23 | Multimodal Paper Overview · Part 2 | 1:03:29 | bilibili (https://www.bilibili.com/video/BV1fA411Z772) |
| 12/29/22 | InstructGPT (https://arxiv.org/pdf/2203.02155.pdf) | 1:07:10 | bilibili (https://www.bilibili.com/video/BV1hd4y187CR) |
| 12/19/22 | Neural Corpus Indexer (https://arxiv.org/pdf/2206.02743.pdf) for Document Retrieval | 55:47 | bilibili (https://www.bilibili.com/video/BV1Se411w7Sn) |
| 12/12/22 | Multimodal Paper Overview · Part 1 | 1:12:27 | bilibili (https://www.bilibili.com/video/BV1Vd4y1v77v) |
| 11/14/22 | OpenAI Whisper (https://cdn.openai.com/papers/whisper.pdf) In-depth Reading | 1:12:16 | bilibili (https://www.bilibili.com/video/BV1VG4y1t74x) |
| 11/7/22 | Before Talking About OpenAI Whisper, I Made a Little Video Editing Tool | 23:39 | bilibili (https://www.bilibili.com/video/BV1Pe4y1t7de) |
| 10/23/22 | Chain of Thought (https://arxiv.org/pdf/2201.11903.pdf): Paper, Code, and Resources | 33:21 | bilibili (https://www.bilibili.com/video/BV1t8411e7Ug) |
| 9/17/22 | CLIP Improvement Work Overview (Part 2) | 1:04:26 | bilibili (https://www.bilibili.com/video/BV1gg411U7n4) |
| 9/2/22 | CLIP Improvement Work Overview (Part 1) | 1:14:43 | bilibili (https://www.bilibili.com/video/BV1FV4y1p7Lm) |
| 7/29/22 | ViLT (https://arxiv.org/pdf/2102.03334.pdf) Paper In-depth Reading | 1:03:26 | bilibili (https://www.bilibili.com/video/BV14r4y1j74y) |
| 7/22/22 | Reasons, Evidence, and Warrants [The Craft of Research (https://press.uchicago.edu/ucp/books/book/chicago/C/bo23521678.html) · 4] | 44:14 | bilibili (https://www.bilibili.com/video/BV1SB4y1a75c) |
| 7/15/22 | How to Tell a Good Story, and Arguments Within a Story [The Craft of Research (https://press.uchicago.edu/ucp/books/book/chicago/C/bo23521678.html) · 3] | 43:56 | bilibili (https://www.bilibili.com/video/BV1WB4y1v7ST) |
| 7/8/22 | DALL·E 2 (https://arxiv.org/pdf/2204.06125.pdf) Paragraph-by-Paragraph Reading | 1:27:54 | bilibili (https://www.bilibili.com/video/BV17r4y1u77B) |
| 7/1/22 | Understanding the Importance of the Problem [The Craft of Research (https://press.uchicago.edu/ucp/books/book/chicago/C/bo23521678.html) · 2] | 1:03:40 | bilibili (https://www.bilibili.com/video/BV11S4y1v7S2/) |
| 6/24/22 | Connecting with Readers [The Craft of Research (https://press.uchicago.edu/ucp/books/book/chicago/C/bo23521678.html) · 1] | 45:01 | bilibili (https://www.bilibili.com/video/BV1hY411T7vy/) |
| 6/17/22 | ZeRO (https://arxiv.org/pdf/1910.02054.pdf) Paragraph-by-Paragraph Reading | 52:21 | bilibili (https://www.bilibili.com/video/BV1tY411g7ZT/) |
| 6/10/22 | DETR (https://arxiv.org/pdf/2005.12872.pdf) Paragraph-by-Paragraph Reading | 54:22 | bilibili (https://www.bilibili.com/video/BV1GB4y1X72R/) |
| 6/3/22 | Megatron-LM (https://arxiv.org/pdf/1909.08053.pdf) Paragraph-by-Paragraph Reading | 56:07 | bilibili (https://www.bilibili.com/video/BV1nB4y1R7Yz/) |
| 5/27/22 | GPipe (https://proceedings.neurips.cc/paper/2019/file/093f65e080a295f8076b1c5722a46aa2-Paper.pdf) Paragraph-by-Paragraph Reading | 58:47 | bilibili (https://www.bilibili.com/video/BV1v34y1E7zu/) |
| 5/5/22 | Pathways (https://arxiv.org/pdf/2203.12533.pdf) Paragraph-by-Paragraph Reading | 1:02:13 | bilibili (https://www.bilibili.com/video/BV1xB4y1m7Xi/) |
| 4/28/22 | Video Understanding Paper Overview (https://arxiv.org/pdf/2012.06567.pdf) (Part 2) | 1:08:32 | bilibili (https://www.bilibili.com/video/BV11Y411P7ep/) |
| 4/21/22 | Parameter Server (https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-li_mu.pdf) Paragraph-by-Paragraph Reading | 1:37:40 | bilibili (https://www.bilibili.com/video/BV1YA4y197G8/) |
| 4/14/22 | Video Understanding Paper Overview (https://arxiv.org/pdf/2012.06567.pdf) (Part 1) | 51:15 | bilibili (https://www.bilibili.com/video/BV1fL4y157yA/) |
| 3/31/22 | I3D (https://arxiv.org/pdf/1705.07750.pdf) Paper In-depth Reading | 52:31 | bilibili (https://www.bilibili.com/video/BV1tY4y1p7hq/) |
| 3/24/22 | Stanford 2022 AI Index Report (https://aiindex.stanford.edu/wp-content/uploads/2022/03/2022-AI-Index-Report_Master.pdf) In-depth Reading | 1:19:56 | bilibili (https://www.bilibili.com/video/BV1s44y1N7eu/) |
| 3/17/22 | AlphaCode (https://storage.googleapis.com/deepmind-media/AlphaCode/competition_level_code_generation_with_alphacode.pdf) Paper In-depth Reading | 44:00 | bilibili (https://www.bilibili.com/video/BV1ab4y1s7rc/) |
| 3/10/22 | OpenAI Codex (https://arxiv.org/pdf/2107.03374.pdf) Paper In-depth Reading | 47:58 | bilibili (https://www.bilibili.com/video/BV1iY41137Zi/) zhihu (https://www.zhihu.com/zvideo/1490959755963666432) |
| 3/3/22 | GPT (https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf), GPT-2 (https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf), GPT-3 (https://arxiv.org/abs/2005.14165) In-depth Reading | 1:29:58 | bilibili (https://www.bilibili.com/video/BV1AF411b7xQ/) |
| 2/24/22 | Two-Stream (https://proceedings.neurips.cc/paper/2014/file/00ec53c4682d36f5c4359f4ae7bd7ba1-Paper.pdf) Paragraph-by-Paragraph Reading | 52:57 | bilibili (https://www.bilibili.com/video/BV1mq4y1x7RU/) |
| 2/10/22 | CLIP (https://openai.com/blog/clip/) Paragraph-by-Paragraph Reading | 1:38:25 | bilibili (https://www.bilibili.com/video/BV1SL4y1s7LQ/) zhihu (https://www.zhihu.com/zvideo/1475706654562299904) |
| 2/6/22 | Have You Been Told (or Complained Yourself) That a Paper Isn't Novel Enough? (https://perceiving-systems.blog/en/post/novelty-in-science) | 14:11 | bilibili (https://www.bilibili.com/video/BV1ea41127Bq/) zhihu (https://www.zhihu.com/zvideo/1475719090198876161) |
| 1/23/22 | AlphaFold 2 (https://www.nature.com/articles/s41586-021-03819-2.pdf) In-depth Reading | 1:15:28 | bilibili (https://www.bilibili.com/video/BV1oR4y1K7Xr/) zhihu (https://www.zhihu.com/zvideo/1469132410537717760) |
| 1/18/22 | How to Judge the Value of (Your Own) Research Work | 9:59 | bilibili (https://www.bilibili.com/video/BV1oL411c7Us/) zhihu (https://www.zhihu.com/zvideo/1475716940051869696) |
| 1/15/22 | Swin Transformer (https://arxiv.org/pdf/2103.14030.pdf) In-depth Reading | 1:00:21 | bilibili (https://www.bilibili.com/video/BV13L4y1475U/) zhihu (https://www.zhihu.com/zvideo/1466282983652691968) |
| 1/7/22 | Guiding Mathematical Intuition (https://www.nature.com/articles/s41586-021-04086-x.pdf) | 52:51 | bilibili (https://www.bilibili.com/video/BV1YZ4y1S72j/) zhihu (https://www.zhihu.com/zvideo/1464060386375299072) |
| 1/5/22 | AlphaFold 2 Preview | 3:28 | bilibili (https://www.bilibili.com/video/BV1Eu411U7Te/) |
| 12/20/21 | Contrastive Learning Paper Survey | 1:32:01 | bilibili (https://www.bilibili.com/video/BV19S4y1M7hm/) zhihu (https://www.zhihu.com/zvideo/1460828005077164032) |
| 12/15/21 | MoCo (https://arxiv.org/pdf/1911.05722.pdf) Paragraph-by-Paragraph Reading | 1:24:11 | bilibili (https://www.bilibili.com/video/BV1C3411s7t9/) zhihu (https://www.zhihu.com/zvideo/1454723120678936576) |
| 12/9/21 | How to Find Research Ideas · 1 | 5:34 | bilibili (https://www.bilibili.com/video/BV1qq4y1z7F2/) |
| 12/8/21 | MAE (https://arxiv.org/pdf/2111.06377.pdf) Paragraph-by-Paragraph Reading | 47:04 | bilibili (https://www.bilibili.com/video/BV1sq4y1q77t/) zhihu (https://www.zhihu.com/zvideo/1452458167968251904) |
| 11/29/21 | ViT (https://arxiv.org/pdf/2010.11929.pdf) Paragraph-by-Paragraph Reading | 1:11:30 | bilibili (https://www.bilibili.com/video/BV15P4y137jb/) zhihu (https://www.zhihu.com/zvideo/1449195245754380288) |
| 11/18/21 | BERT (https://arxiv.org/abs/1810.04805) Paragraph-by-Paragraph Reading | 45:49 | bilibili (https://www.bilibili.com/video/BV1PL411M7eQ/) zhihu (https://www.zhihu.com/zvideo/1445340200976785408) |
| 11/9/21 | GAN (https://papers.nips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf) Paragraph-by-Paragraph Reading | 46:16 | bilibili (https://www.bilibili.com/video/BV1rb4y187vD/) zhihu (https://www.zhihu.com/zvideo/1442091389241159681) |
| 11/3/21 | A Beginner-Friendly, Richly Illustrated Explanation of Graph Neural Networks (https://distill.pub/2021/gnn-intro/) (GNN/GCN) | 1:06:19 | bilibili (https://www.bilibili.com/video/BV1iT4y1d7zP/) zhihu (https://www.zhihu.com/zvideo/1439540657619087360) |
| 10/27/21 | Transformer (https://arxiv.org/abs/1706.03762) Paragraph-by-Paragraph Reading (references mentioned in the video: see notes 1-3 below) | 1:27:05 | bilibili (https://www.bilibili.com/video/BV1pu411o7BE/) zhihu (https://www.zhihu.com/zvideo/1437034536677404672) |
| 10/22/21 | ResNet (https://arxiv.org/abs/1512.03385) Paragraph-by-Paragraph Reading | 53:46 | bilibili (https://www.bilibili.com/video/BV1P3411y7nn/) zhihu (https://www.zhihu.com/zvideo/1434795406001180672) |
| 10/21/21 | ResNet (https://arxiv.org/abs/1512.03385): The Backbone of Computer Vision | 11:50 | bilibili (https://www.bilibili.com/video/BV1Fb4y1h73E/) zhihu (https://www.zhihu.com/zvideo/1434787226101751808) |
| 10/15/21 | AlexNet (https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf) Paragraph-by-Paragraph Reading | 55:21 | bilibili (https://www.bilibili.com/video/BV1hq4y157t1/) zhihu (https://www.zhihu.com/zvideo/1432354207483871232) |
| 10/14/21 | Rereading a Foundational Work of Deep Learning 9 Years Later: AlexNet (https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf) | 19:59 | bilibili (https://www.bilibili.com/video/BV1ih411J7Kz/) zhihu (https://www.zhihu.com/zvideo/1432155856322920448) |
| 10/6/21 | How to Read a Paper | 6:39 | bilibili (https://www.bilibili.com/video/BV1H44y1t75x/) zhihu (https://www.zhihu.com/zvideo/1428973951632969728) |
Notes:
1. Stanford 200+ page survey with 100+ authors (https://arxiv.org/abs/2108.07258)
2. New research on LayerNorm (https://arxiv.org/pdf/1911.07013.pdf)
3. Research on the role of attention in Transformers (https://arxiv.org/abs/2103.03404)
All Papers
This list includes papers already recorded and papers to be covered later. The selection principle: influential (must-read) deep learning papers from the last ten years, or interesting recent work. Of course, there are far too many important works from these ten years to cover one by one, so when selecting I lean toward papers not covered in my previous live courses (https://c.d2l.ai/zh-v2/).
Feel free to provide suggestions (requests) in the discussion area (https://github.com/mli/paper-reading/discussions).
Total papers: 67, Recorded: 32. (Citation counts use Semantic Scholar because it provides an API (https://api.semanticscholar.org/api-docs/graph#operation/get_graph_get_paper) that lets the counts be fetched automatically, without manual updates.)
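As a rough illustration of that automated lookup (a minimal sketch, assuming Python with `requests`; the helper name and example arXiv ID are illustrative choices, not the repository's actual tooling), a citation count can be fetched from the Semantic Scholar Graph API like this:

```python
# Sketch only: automated citation-count lookup via the public
# Semantic Scholar Graph API. Helper name and example ID are
# illustrative, not the repository's actual script.
import requests

def citation_count(paper_id: str) -> int:
    """Return the current citation count for a paper.

    `paper_id` may be a Semantic Scholar ID or a prefixed external
    ID such as "arXiv:2010.11929" (ViT, one of the papers below).
    """
    url = f"https://api.semanticscholar.org/graph/v1/paper/{paper_id}"
    resp = requests.get(url, params={"fields": "title,citationCount"}, timeout=10)
    resp.raise_for_status()
    return resp.json()["citationCount"]

if __name__ == "__main__":
    print(citation_count("arXiv:2010.11929"))
```

Keying lookups on stable IDs like arXiv IDs or DOIs is what lets the counts refresh without manual edits.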
Computer Vision - CNN
| Recorded | Year | Name | Description | Citations |
|---|---|---|---|---|
| ✅ | 2012 | AlexNet (https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf) | Foundational work of the deep learning boom | citation (https://www.semanticscholar.org/paper/ImageNet-classification-with-deep-convolutional-Krizhevsky-Sutskever/abd1c342495432171beb7ca8fd9551ef13cbd0ff) |
| | 2014 | VGG (https://arxiv.org/pdf/1409.1556.pdf) | Deeper networks using 3x3 convolutions | citation (https://www.semanticscholar.org/paper/Very-Deep-Convolutional-Networks-for-Large-Scale-Simonyan-Zisserman/eb42cf88027de515750f230b23b1a057dc782108) |
| | 2014 | GoogleNet (https://arxiv.org/pdf/1409.4842.pdf) | Deeper networks using parallel architectures | citation (https://www.semanticscholar.org/paper/Going-deeper-with-convolutions-Szegedy-Liu/e15cf50aa89fee8535703b9f9512fca5bfc43327) |
| ✅ | 2015 | ResNet (https://arxiv.org/pdf/1512.03385.pdf) | Residual connections essential for deep networks (see the sketch below this table) | citation (https://www.semanticscholar.org/paper/Deep-Residual-Learning-for-Image-Recognition-He-Zhang/2c03df8b48bf3fa39054345bafabfeff15bfd11d) |
| | 2017 | MobileNet (https://arxiv.org/pdf/1704.04861.pdf) | Small CNN suitable for mobile devices | citation (https://www.semanticscholar.org/paper/MobileNets%3A-Efficient-Convolutional-Neural-Networks-Howard-Zhu/3647d6d0f151dc05626449ee09cc7bce55be497e) |
| | 2019 | EfficientNet (https://arxiv.org/pdf/1905.11946.pdf) | CNN obtained through architecture search | citation (https://www.semanticscholar.org/paper/EfficientNet%3A-Rethinking-Model-Scaling-for-Neural-Tan-Le/4f2eda8077dc7a69bb2b4e0a1a086cf054adb3f9) |
| | 2021 | Non-deep networks (https://arxiv.org/pdf/2110.07641.pdf) | Achieving SOTA on ImageNet with shallow networks | citation (https://www.semanticscholar.org/paper/Non-deep-Networks-Goyal-Bochkovskiy/0d7f6086772079bc3e243b7b375a9ca1a517ba8b) |
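To make the ResNet row concrete, here is a minimal residual-block sketch (assuming PyTorch; it illustrates the shortcut idea, not the paper's exact block): the block learns a residual F(x) and adds the input back, so gradients can flow through the identity path even in very deep networks.

```python
# Minimal residual block sketch (PyTorch assumed; illustrative,
# not the paper's exact architecture).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.bn2(self.conv2(torch.relu(self.bn1(self.conv1(x)))))
        return torch.relu(out + x)  # identity shortcut: output = F(x) + x

x = torch.randn(1, 64, 32, 32)
print(ResidualBlock(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```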
Computer Vision - Transformer
| Recorded | Year | Name | Description | Citations |
|---|---|---|---|---|
| ✅ | 2020 | ViT (https://arxiv.org/pdf/2010.11929.pdf) | Transformer enters CV (see the sketch below this table) | citation (https://www.semanticscholar.org/paper/An-Image-is-Worth-16x16-Words%3A-Transformers-for-at-Dosovitskiy-Beyer/7b15fa1b8d413fbe14ef7a97f651f47f5aff3903) |
| ✅ | 2021 | Swin Transformer (https://arxiv.org/pdf/2103.14030.pdf) | Hierarchical Vision Transformer | citation (https://www.semanticscholar.org/paper/Swin-Transformer%3A-Hierarchical-Vision-Transformer-Liu-Lin/c8b25fab5608c3e033d34b4483ec47e68ba109b7) |
| | 2021 | MLP-Mixer (https://arxiv.org/pdf/2105.01601.pdf) | Replacing self-attention with MLPs | citation (https://www.semanticscholar.org/paper/MLP-Mixer%3A-An-all-MLP-Architecture-for-Vision-Tolstikhin-Houlsby/2def61f556f9a5576ace08911496b7c7e4f970a4) |
| ✅ | 2021 | MAE (https://arxiv.org/pdf/2111.06377.pdf) | BERT version for CV | citation (https://www.semanticscholar.org/paper/Masked-Autoencoders-Are-Scalable-Vision-Learners-He-Chen/c1962a8cf364595ed2838a097e9aa7cd159d3118) |
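As a quick illustration of the ViT row above (a sketch assuming PyTorch, with ViT-Base-style sizes of 16x16 patches and 768-dim tokens), the core trick is to cut an image into patches and linearly project each one into a token, so a standard Transformer encoder can consume the image like a sequence of words:

```python
# Sketch of ViT's patch embedding (PyTorch assumed; sizes follow
# common ViT-Base defaults, chosen here for illustration).
import torch
import torch.nn as nn

# A 16x16 conv with stride 16 applies one shared linear projection
# per non-overlapping patch.
patchify = nn.Conv2d(3, 768, kernel_size=16, stride=16)

img = torch.randn(1, 3, 224, 224)                  # one 224x224 RGB image
tokens = patchify(img).flatten(2).transpose(1, 2)  # (1, 196, 768): 14x14 patch tokens
print(tokens.shape)
```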
Generative Models
| Recorded | Year | Name | Description | Citations |
|---|---|---|---|---|
| ✅ | 2014 | GAN (https://papers.nips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf) | Pioneering work in generative models | citation (https://www.semanticscholar.org/pap |
Similar Articles
@QingQ77: 'Dive into Deep Learning' is an excellent introductory book, but its updates struggle to keep pace with the field's development. Since the Transformer, content like CLIP, Diffusion, and vLLM has proliferated. Online resources are abundant but highly fragmented: today you study Attention, tomorrow LoRA, the day after...
This project is a systematic deep learning notes repository covering PyTorch, Transformers, generative models, and more. It aims to address the fragmentation of learning materials and provides code implementations along with practical guides.
@VincentLogic: This video is essentially a 'must-watch' checklist for AI engineers! It clearly explains the 10 core papers that have shaped today's AI industry, ranging from the foundational Transformer architecture to LoRA fine-tuning, RAG, Agents, and even the latest MCP protocol. If you want to dive deeper into how…
This article recommends a video that systematically explains the 10 core papers shaping today's AI industry, covering Transformer, LoRA, RAG, Agents, and the MCP protocol, aiming to help engineers clarify the technological lineage.
@vista8: Last night I casually tested Knowly, developed by the @Ethan_Yang_AI team, trying it on YouTube videos and arXiv papers; the results were stunning, apart from a rather limited free quota and slightly slow vector processing. In both product interaction and interpretation quality, it's no less impressive than NotebookLM. Its Chrome extension has only a handful of users yet has already been selected by Google as a featured pick, which speaks to its strength. Official site in comments https://t.co/62NkT3pO4G
Introduces Knowly, an AI tool that interprets YouTube videos and arXiv papers with impressive results; its interaction and interpretation quality rival NotebookLM. It comes with a Chrome extension already featured by Google. Drawbacks: a limited free quota and slightly slow vector processing.
@wsl8297: When learning AI, the scariest part is getting stuck at "understanding the theory" and then freezing when it's time to write code: not knowing where to start and unable to find decent practice projects. I unearthed a practical treasure trove on GitHub: AI-Project-Gallery. It collects 30+ high-quality AI projects, covering classic topics like house price prediction and disease classification as well as hot applications like a Gemini chatbot and a document generator...
This post shares a curated GitHub repository containing over 30 practical AI projects, covering domains from regression to generative AI, with many end-to-end examples, suitable for learners and developers.
@nuannuan_share: If I wanted to land a $200K AI engineer job in 90 days, I wouldn't go back to school. I'd master these 10 GitHub repositories. 1. awesome-llm-apps — A production-grade AI guide covering RAG, agents, and multimodal apps with full code. 106K+ stars. Repo …
A Chinese social media post recommends 10 GitHub repositories, claiming that mastering them can help land a $200K AI engineer job within 90 days. The repos cover mainstream AI development frameworks and tools including LangChain, LangGraph, CrewAI, Ollama, and Qdrant.