Want to build a custom model
Summary
A user discusses building a small autocomplete model (25M parameters) as a learning project, mentions hardware constraints (32GB VRAM), data requirements (~100M tokens), and seeks advice on datasets and data formatting for autocomplete-style training.
Similar Articles
Are small local models for automation a thing?
A Reddit user discusses the potential of small local language models (1B-4B parameters) for automation and scripting, and asks for resources focused on this use case.
What if i really wanna train an AI from scratch?
A personal reflection on the challenges and allure of training an AI model from scratch, highlighting the difficulties with data, hardware, and scaling, while noting that surprisingly good small models can be trained on modest hardware.
Me train LLM on 8GB from Scratch. Me happy
The author built a repository for training a tiny language model (25M parameters) from scratch on 8GB of VRAM. It supports MTP, and the author notes limitations of mHC and BitNet.
@paulabartabajo_: Advice for AI engineers: A small Visual Language Model fine-tuned on your custom dataset is as accurate as GPT-5…
A tweet claims that a small visual language model fine-tuned on custom data can match GPT-5 accuracy while costing 50× less, citing Liquid AI’s 1.6B model running locally with llama.cpp.
@harshbhatt7585: https://x.com/harshbhatt7585/status/2063593933314113587
The author shares learnings from training a 160M parameter LLM from scratch, experimenting with architectures like multi-token prediction and hierarchical reasoning models. They emphasize the importance of fast iteration, simplifying ideas, and understanding why architectures work.