jshn9515/deep-learning-notes

Source: https://github.com/jshn9515/deep-learning-notes

Deep Learning Notes

English | 简体中文

[dnnl-title banner image]

For a long time, I struggled with how to learn deep learning effectively.

Dive into Deep Learning is an excellent introductory book, but its update pace has gradually fallen behind the speed of progress in this field. Since the rise of Transformers, topics like CLIP, Diffusion, and vLLM have become increasingly important. Although there is no shortage of online material, most of it is scattered. One day you study Attention, the next day LoRA, and the day after that diffusion models. In the end, what often remains are only fragments, and it is hard to build a truly coherent understanding.

So I decided to systematically organize what I have learned. From the fundamentals of PyTorch, to Attention and Transformers, and then to GANs, CLIP, Stable Diffusion, and SAM3, I try to explain the core ideas, mathematical derivations, code implementations, and common pitfalls of each topic as clearly as possible. This repository is the public version of those notes. If you are also learning deep learning on your own, I hope it can be helpful to you.

📌 About These Notes

This project is primarily maintained and published in Quarto Markdown and built as a static website. Quarto Markdown (`.qmd`) is a plain-text format that extends Markdown with a YAML header and executable code cells, which makes it well suited for version control and continuous updates.
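As a concrete sketch, a minimal `.qmd` page combines a YAML header, prose, and executable code cells (this is an illustrative example, not a file from this repository):

````markdown
---
title: "Attention Basics"
format: html
---

## Scaled dot-product attention

Prose, math, and executable cells live side by side:

```{python}
import torch
x = torch.randn(2, 3)
print(x.shape)
```
````

When rendered with `quarto render`, the Python cell is executed and its output is embedded in the generated HTML.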

The content mainly includes:

  • PyTorch fundamentals and engineering practice
  • Attention mechanisms and Transformer-based models
  • Generative models, such as GANs, VAEs, and diffusion models
  • Multimodal models, such as CLIP
  • The Hugging Face ecosystem and its practical use
  • Practical notes covering the full workflow from data processing to training, inference, and deployment
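To give a flavor of the level of detail in the notes, scaled dot-product attention, the core operation of the Transformer, can be sketched in a few lines (a framework-agnostic NumPy illustration; not code taken from the repository):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)  # (seq_q, seq_k)
    scores -= scores.max(axis=-1, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ V                              # (seq_q, d_v)

# Toy shapes: 2 queries, 3 keys/values, d_k = d_v = 4
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
```

The notes themselves work through the batched, multi-head PyTorch version of this idea, along with masking and the surrounding Transformer blocks.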

To make the material easier to use, I also periodically prepare corresponding Jupyter Notebook versions:

  • Monthly Releases: relatively stable, packaged Notebook versions
  • GitHub Actions Artifacts: the latest build outputs

If you want a stable version, please check the Releases page. If you want the latest version, please check the Artifacts in GitHub Actions.

If you prefer generating the Notebook files from the source yourself, you can also install Quarto locally and use the quarto convert command to convert .qmd files into Jupyter Notebooks; by default the resulting .ipynb is written next to the source, and the --output option lets you choose a different path. For example:

quarto convert path/to/file.qmd

🔧 Environment

All code in this repository has been tested in the following environment:

  • Python 3.14
  • PyTorch 2.11

See requirements.txt for the full list of dependencies.

Before running the related content, please first enter the dnnl directory and install the dnnl library according to the instructions in dnnl/README.md. This library contains some custom implementations and utility functions used throughout the notes, and many examples will not run properly without it.

This project uses Transformers v5. If you are following other repositories or tutorials based on v4, there may be significant API differences (such as tokenizers and quantization configurations). Please refer to the official migration guide for adjustments.

🤝 Contributions

If you find an explanation unclear, notice a problem in the code, or have topics you would like me to add, feel free to contribute through Issues or Pull Requests.

Possible contributions include, but are not limited to:

  • Pointing out errors or inaccuracies in the notes
  • Adding clearer explanations, derivations, or code comments
  • Suggesting improvements to structure, wording, or formatting
  • Recommending topics or practical cases for future coverage

Since this is a project I am building and refining while learning, there will inevitably be places where my understanding is incomplete or my explanations are not precise enough. I read all helpful feedback carefully and try to improve the notes whenever possible.

If you would like to make a larger change, it is recommended to open an Issue first with a brief description so that we can discuss it in advance.

🙏 Acknowledgements

While organizing these notes, I have benefited from many excellent resources. In particular, Dive into Deep Learning by Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola, as well as Professor Hung-yi Lee’s deep learning lecture series, have helped me greatly in understanding many core concepts in deep learning.

This project website is built with Quarto.

📄 License

  • The notes in this repository are licensed under CC BY-NC 4.0.
  • The dnnl library is licensed under MIT.
