@PandaTalk8: 1/ Recently read a book that is perfect for systematically learning LLM basics: 《Foundations of Large Language Models》 by Tong Xiao and Jingbo Zhu, from China's Northeastern University NLP Lab and NiuTrans…


Summary

A recommendation of 《Foundations of Large Language Models》, a book for systematically learning the basics of large language models, written by Tong Xiao and Jingbo Zhu from the Northeastern University NLP Lab and NiuTrans Research.

1/ I recently read a book that is perfect for systematically learning LLM basics: 《Foundations of Large Language Models》. The authors are Tong Xiao and Jingbo Zhu, from China's Northeastern University NLP Lab and NiuTrans Research. https://t.co/fWYjicwpa6

Cached at: 06/12/26, 12:58 PM

1/ I recently came across a book that is a great systematic introduction to LLM foundations:

Foundations of Large Language Models

The authors are Tong Xiao and Jingbo Zhu, from the NLP Lab at Northeastern University (China) and NiuTrans Research.

2/ This is not a trendy “model gossip book.” It’s a foundational read that helps you understand the underlying logic of large models:

Pre-training, generative models, prompting, alignment, and inference.

3/ Both authors have long been deeply involved in NLP, machine translation, and language modeling.

Tong Xiao is a professor of computer science at Northeastern University, director of the NLP Lab, and co-founder of NiuTrans; Jingbo Zhu is a professor and PhD supervisor at Northeastern University.

4/ The book’s background is clear:

Today, LLMs look like general AI interfaces, but their roots are in NLP.

From language modeling to instruction fine-tuning, RLHF, and inference acceleration, many new capabilities have clear technical lineages.

5/ What you gain from reading this book isn’t “fancier prompts,” but foundational judgment:

Why does pre-training work? Why has decoder-only become mainstream? Why can prompts change model behavior? Why is alignment so important?

6/ This book is suitable for three groups of people:

AI product managers: to build a framework for technical judgment.
Developers and engineers: to understand capability boundaries and system bottlenecks.
Researchers and learners: to systematically organize fundamental LLM concepts and technical paths.

7/ Core topic one: Pre-training.

It covers unsupervised, supervised, and self-supervised pre-training, explaining how models learn general capabilities from vast text data, and then adapt to different tasks via fine-tuning or prompting.
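To make the self-supervised part concrete, here is a minimal sketch of my own (not code from the book): the training labels come from the text itself, by pairing each prefix with the token that follows it. Whitespace tokenization and the function name `next_token_pairs` are simplifying assumptions; real models use subword tokenizers.

```python
def next_token_pairs(text: str) -> list[tuple[list[str], str]]:
    """Turn raw text into (context, next-token) training examples.

    Assumption: whitespace splitting stands in for a real subword tokenizer.
    """
    tokens = text.split()
    # Each prefix of the sequence is an input; the token that follows is its label.
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


pairs = next_token_pairs("the cat sat on the mat")
for context, target in pairs:
    print(context, "->", target)
```

No human annotation is involved: one sentence of six tokens already yields five training examples.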

8/ Core topic two: Generative models.

This covers the decoder-only Transformer, the training pipeline, model scaling, distributed training, scaling laws, and long-context modeling.

If you want to understand “why bigger models are stronger,” this chapter is key.
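As a rough illustration of what a scaling law says, here is a sketch that assumes loss follows a simple power law in parameter count. The constants `n_c`, `alpha`, and `irreducible` are placeholder values I picked for illustration, not figures from the book or from any fitted model.

```python
def power_law_loss(n_params: float, n_c: float = 1e13, alpha: float = 0.08,
                   irreducible: float = 0.0) -> float:
    """Predicted loss under an assumed power law: irreducible + (n_c / n_params) ** alpha.

    All default constants are hypothetical and chosen only to show the shape.
    """
    return irreducible + (n_c / n_params) ** alpha


sizes = [1e8, 1e9, 1e10, 1e11]
losses = [power_law_loss(n) for n in sizes]
for n, loss in zip(sizes, losses):
    print(f"{n:.0e} params -> predicted loss {loss:.3f}")
```

Under this assumed form, every 10x increase in parameters multiplies the loss by the same factor, so bigger models keep improving, but each step buys a smaller absolute gain.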

9/ Core topic three: Prompting.

Not just prompt tricks, but a systematic introduction to in-context learning, chain-of-thought (CoT) reasoning, problem decomposition, self-improvement, multi-answer ensembling, RAG, tool calling, and prompt optimization.
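Two of these ideas fit in a few lines. The sketch below is my own illustration, not the book's code: `build_few_shot_prompt` shows in-context learning as plain string assembly, and `majority_vote` shows the simplest form of multi-answer ensembling. The sampled answers are hard-coded stand-ins for what a model might return; no model is called.

```python
from collections import Counter


def build_few_shot_prompt(examples: list[tuple[str, str]], question: str) -> str:
    """In-context learning: show worked examples, then ask the new question."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\n\nQ: {question}\nA:"


def majority_vote(answers: list[str]) -> str:
    """Multi-answer ensembling: return the most frequent final answer."""
    return Counter(answers).most_common(1)[0][0]


prompt = build_few_shot_prompt([("2 + 2?", "4"), ("3 + 5?", "8")], "6 + 7?")
# Hypothetical final answers from several sampled completions of the same prompt.
sampled_answers = ["13", "13", "12", "13", "14"]
final_answer = majority_vote(sampled_answers)
print(prompt)
print("voted answer:", final_answer)
```

The model's weights never change here; only the input text and the way outputs are aggregated do.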

10/ Core topic four: Alignment.

A pre-trained model is just a “continuation generator”; turning it into a useful, controllable assistant aligned with human preferences requires methods like instruction fine-tuning, reward modeling, RLHF, and DPO.
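For a feel of what preference optimization computes, here is a sketch of the DPO loss for a single preference pair, written from the published DPO objective rather than from the book. It assumes you already have sequence log-probabilities from the policy and from a frozen reference model; the numbers in the usage lines are made up.

```python
import math


def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float, beta: float = 0.1) -> float:
    """DPO loss for one (chosen, rejected) pair of sequence log-probabilities.

    loss = -log(sigmoid(beta * ((policy_chosen - ref_chosen)
                                - (policy_rejected - ref_rejected))))
    """
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # -log(sigmoid(m)) equals softplus(-m); this form avoids overflow in exp().
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Made-up log-probabilities: the policy favors the chosen answer more than the reference does.
print(dpo_loss(-12.0, -15.0, -13.0, -13.0))
# The policy favors the rejected answer instead, so the loss is higher.
print(dpo_loss(-15.0, -12.0, -13.0, -13.0))
```

The loss falls as the policy raises the chosen answer's probability relative to the reference model, and `beta` controls how strongly deviations from the reference are weighted.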

11/ Core topic five: Inference.

In real deployment, inference efficiency, decoding strategies, KV cache, batching, parallelization, and inference-time scaling are crucial.

Whether a deployed model is fast, stable, and cheap depends largely on this layer.
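The KV cache is the easiest of these to show in miniature. The sketch below is a toy of my own, under heavy assumptions: a single attention head, hand-written query/key/value vectors instead of learned projections, and a hypothetical `KVCache` class. It only demonstrates that appending each new token's key and value gives the same attention output as rebuilding everything at every step.

```python
import math


def attend(query: list[float], keys: list[list[float]], values: list[list[float]]) -> list[float]:
    """Scaled dot-product attention for one query over all given positions."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]


class KVCache:
    """Stores the keys and values of tokens that have already been processed."""

    def __init__(self) -> None:
        self.keys: list[list[float]] = []
        self.values: list[list[float]] = []

    def step(self, query: list[float], key: list[float], value: list[float]) -> list[float]:
        # Append only the newest token's key/value; earlier ones are reused as-is.
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)


# Toy per-token (query, key, value) vectors; a real model would compute these
# from hidden states with learned projection matrices.
tokens = [
    ([1.0, 0.0], [1.0, 0.0], [1.0, 2.0]),
    ([0.0, 1.0], [0.0, 1.0], [3.0, 4.0]),
    ([1.0, 1.0], [1.0, 1.0], [5.0, 6.0]),
]

cache = KVCache()
cached_outputs = [cache.step(q, k, v) for q, k, v in tokens]

# Reference: rebuild the full key/value lists from scratch at every step.
recomputed_outputs = [
    attend(tokens[t][0], [k for _, k, _ in tokens[: t + 1]], [v for _, _, v in tokens[: t + 1]])
    for t in range(len(tokens))
]
print(cached_outputs == recomputed_outputs)
```

In a real decoder the saving comes from not re-running the key and value projections for earlier tokens at each generation step, at the cost of memory that grows with sequence length.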

12/ I recommend this book because it breaks LLMs down into five stable modules:

Pre-training → Generative models → Prompting → Alignment → Inference

After reading it, when you encounter new models, new papers, or new products, it’ll be much easier to judge what is truly novel.
