@PandaTalk8: 1/ Recently read a book that is perfect for systematically learning LLM basics: 《Foundations of Large Language Models》 by Tong Xiao and Jingbo Zhu, from China's Northeastern University NLP Lab and NiuTrans…
Summary
A recommendation of a book for systematically learning the basics of large language models: Foundations of Large Language Models, by Tong Xiao and Jingbo Zhu of the Northeastern University NLP Lab and NiuTrans Research.
Cached at: 06/12/26, 12:58 PM
1/ I recently came across a book that is a great systematic introduction to LLM foundations:
Foundations of Large Language Models
The authors are Tong Xiao and Jingbo Zhu, from the NLP Lab at Northeastern University (China) and NiuTrans Research.
2/ This is not a trendy “model gossip” book. It is a foundational read that helps you understand the underlying logic of large models:
Pre-training, generative models, prompting, alignment, and inference.
3/ Both authors have long been deeply involved in NLP, machine translation, and language modeling.
Tong Xiao is a professor of computer science at Northeastern University, director of the NLP Lab, and co-founder of NiuTrans; Jingbo Zhu is a professor and PhD supervisor at Northeastern University.
4/ The book’s starting point is clear:
Today, LLMs look like general AI interfaces, but their roots are in NLP.
From language modeling to instruction fine-tuning, RLHF, and inference acceleration, many new capabilities have clear technical lineages.
5/ What you gain from reading this book isn’t “fancier prompts,” but foundational judgment:
Why does pre-training work? Why has decoder-only become mainstream? Why can prompts change model behavior? Why is alignment so important?
6/ This book is suitable for three groups of people:
AI product managers: to build a technical judgment framework.
Developers/engineers: to understand capability boundaries and system bottlenecks.
Researchers/learners: to systematically organize LLM fundamental concepts and technical paths.
7/ Core topic one: Pre-training.
It covers unsupervised, supervised, and self-supervised pre-training, explaining how models learn general capabilities from large amounts of text and are then adapted to different tasks via fine-tuning or prompting.
8/ Core topic two: Generative models.
This covers the decoder-only Transformer, the training pipeline, model scaling, distributed training, scaling laws, and long-context modeling.
If you want to understand “why bigger models are stronger,” this chapter is key.
9/ Core topic three: Prompting.
Not just prompt tricks, but a systematic introduction to in-context learning, CoT, problem decomposition, self-improvement, multi-answer ensembling, RAG, tool calling, and prompt optimization.
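To make one item on that list concrete, here is a minimal sketch of multi-answer ensembling as a majority vote over several sampled answers. It is an illustration under stated assumptions, not code from the book: the function name `majority_vote` and the sample answers are hypothetical, and the step that extracts a final answer from each model output is assumed to have already happened.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer and its share of the votes."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Normalize lightly so "42" and " 42 " count as the same answer.
    normalized = [a.strip().lower() for a in answers]
    winner, count = Counter(normalized).most_common(1)[0]
    return winner, count / len(normalized)

# Hypothetical final answers extracted from five sampled reasoning chains.
sampled = ["42", "42", "41", " 42 ", "40"]
answer, agreement = majority_vote(sampled)
print(answer, agreement)
```

The agreement ratio is a rough confidence signal: a low value suggests the samples disagree and the answer deserves a second look.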
10/ Core topic four: Alignment.
A pre-trained model is just a “continuation generator” that predicts the next token; turning it into a useful, controllable assistant aligned with human preferences takes methods such as instruction fine-tuning, reward modeling, RLHF, and DPO.
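As a rough illustration of what DPO optimizes, the sketch below computes the standard DPO loss for a single preference pair from sequence log-probabilities. It is not code from the book: the function name, the example log-probabilities, and the default `beta=0.1` are assumptions chosen for illustration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair, given summed sequence log-probs."""
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# Hypothetical log-probs: the policy has shifted toward the chosen response.
print(dpo_loss(-4.0, -8.0, -5.0, -7.0))
```

When the policy equals the reference model, the loss is log 2; it falls as the policy raises the chosen response relative to the rejected one.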
11/ Core topic five: Inference.
In real deployment, inference efficiency, decoding strategies, KV cache, batching, parallelization, and inference-time scaling are crucial.
Whether a deployed model is fast, stable, and cheap depends largely on this layer.
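The KV cache idea can be sketched in a few lines. This is a toy, single-head illustration and not code from the book: the class `KVCache`, the helper `attend`, and the token vectors are hypothetical, and each token vector stands in for its own query, key, and value so that no projection matrices are needed.

```python
import math

def attend(query, keys, values):
    """Scaled dot-product attention for one query over the given positions."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for a stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w / total * v[i] for w, v in zip(weights, values))
            for i in range(dim)]

class KVCache:
    """Stores keys and values of past tokens so they are computed only once."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, query, key, value):
        # Append the new token's key/value, then attend over everything so far.
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)

# Toy token vectors standing in for projected q/k/v (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

cache = KVCache()
cached_out = [cache.step(t, t, t) for t in tokens]

# Without a cache, every step rebuilds keys/values for the whole prefix.
recomputed_out = [attend(tokens[i], tokens[: i + 1], tokens[: i + 1])
                  for i in range(len(tokens))]
```

Both paths produce the same outputs; the cached path simply avoids recomputing keys and values for earlier tokens at every decoding step, trading memory for compute.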
12/ I recommend this book because it breaks LLMs down into five stable modules:
Pre-training → Generative models → Prompting → Alignment → Inference
After reading it, when you encounter new models, new papers, or new products, it’ll be much easier to judge what is truly novel.