Training an LLM from scratch on 8GB of VRAM. I'm happy

Reddit r/LocalLLaMA 2026/05/29 20:16 Tools

training 8gb-vram tiny-model open-source from-scratch community-project

Summary

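The author built a repository for training a tiny language model (25M parameters) from scratch on 8GB of VRAM. It supports MTP, and the author notes the limitations of mHC and BitNet at this scale.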

I made a post yesterday: [https://www.reddit.com/r/LocalLLaMA/comments/1tqjuzg/why_is_there_no_community_project_for_training/](https://www.reddit.com/r/LocalLLaMA/comments/1tqjuzg/why_is_there_no_community_project_for_training/)

Today I wrote a program: [https://github.com/epoyraz/train-a-model-from-scratch](https://github.com/epoyraz/train-a-model-from-scratch)

Highlights:

- Trained TinyStories from scratch on 8GB of VRAM. Yay
- mHC is bad (the model is too small)
- BitNet is too slow (no memory gain during training)
- TurboQuant (not needed)
- MTP works. Yay yay yay (but it makes training slower)

Hmm... this is not an LLM, it's a small 25M model: [https://huggingface.co/epoyraz/tinystories-25m](https://huggingface.co/epoyraz/tinystories-25m)
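As a rough sanity check on why a 25M model fits in 8GB, here is a back-of-the-envelope estimate of the memory taken by weights, gradients, and optimizer state. This is only a sketch: it assumes fp32 weights and gradients plus an Adam-style optimizer with two fp32 moment buffers per parameter, none of which is confirmed by the post or taken from the repository. The function names are made up for illustration, and activation memory is left out.

```python
# Back-of-the-envelope training-state memory for a small model.
# Assumptions (NOT from the post or the repo): fp32 weights, fp32 gradients,
# and an Adam-style optimizer holding two fp32 moment buffers per parameter.
# Activation memory is deliberately excluded.

BYTES_FP32 = 4


def training_state_bytes(n_params: int) -> int:
    """Bytes needed for weights + gradients + two Adam moment buffers."""
    weights = n_params * BYTES_FP32
    grads = n_params * BYTES_FP32
    adam_moments = 2 * n_params * BYTES_FP32
    return weights + grads + adam_moments


def to_mib(n_bytes: int) -> float:
    """Convert bytes to mebibytes."""
    return n_bytes / (1024 ** 2)


n_params = 25_000_000  # "25M" from the post, taken as a round number
state = training_state_bytes(n_params)
print(f"{to_mib(state):.0f} MiB for weights, gradients, and optimizer state")
```

Under these assumptions the fixed training state is a small fraction of 8GB, so most of the budget would go to activations, which grow with batch size and sequence length. The same arithmetic is in line with the BitNet remark above: BitNet-style training generally keeps full-precision latent weights and quantizes them in the forward pass, so the per-parameter training state does not shrink.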
