handbook

#handbook

An open handbook on LLM inference at scale (GPU internals, KV cache, batching, vLLM/SGLang/TensorRT-LLM) [P]

Reddit r/MachineLearning ↗ · 3d ago

An open, in-progress handbook explaining LLM inference internals including GPU memory hierarchy, KV cache, batching, and popular inference engines like vLLM and TensorRT-LLM.

#handbook

@777BHAVYA: If you want to study LLMs end to end, covering each component of an LLM from Vaswani until now, including quantization st…

X AI KOLs Timeline ↗ · 2026-05-25 Cached

A tweet recommends the Language AI Handbook, a free online resource that covers LLM components from classical NLP to modern transformers, quantization, RL, and safety.
