@cjzafir: VLMs (Vertical Language Models) are beating top LLMs. These small 7B to 15B niche-focused models are beating SoTA model…

Summary

The author reports that small vertical language models (roughly 6B to 15B parameters) can outperform top LLMs on niche benchmarks. The approach is low-cost fine-tuning of open-source base models, with Codex orchestrating data generation; the SFT dataset reportedly cost $300 to build.

VLMs (Vertical Language Models) are beating top LLMs. These small 7B to 15B niche-focused models are beating SoTA models on their niche benchmarks. I post-trained a 6B dense model in 15 days and beat Sonnet 4.6 and Gemini 3 Flash.

I use Codex 5.5 (Extra High) to plan the SFT dataset scope, then I use the DeepSeek v4 Pro & Kimi 2.6 APIs to generate handwritten examples. (No synthetic, templated datasets.) Codex runs each batch through quality gates and filters out all the weak data.

I was able to build a 350M-parameter dataset for just $300 using Codex as the orchestrator & DS + Kimi as the executors. I can compete with giant data labs on my own, beat their VLMs, and not break the bank. This is only possible because open-source models are now neck-and-neck with SoTA models.

If I had to start a career right now, I'd start an agency that fine-tunes SLMs (small language models) for enterprises. I'd charge them a one-time fee of $10k to $20k. Use Qwen 3.5 or Gemma 4 as the base model, use Codex as the brain and DeepSeek v4 + Kimi as the muscle, and post-train a strong SLM for under $1000. This might feel far-fetched, but in 6 months you'll see agencies like these.

Not everything requires an LLM. If properly trained, SLMs can achieve vertical intelligence at 10x lower cost, with no privacy issues and full control over the model.

I'll be sharing technical findings here on X on the go. If you enjoy nerdy fine-tuning stuff, stay tuned.
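The post does not show code, so the following is only a rough sketch of what an "executors generate, orchestrator filters through quality gates" loop might look like. Every name here (`Example`, `run_batch`, the two gates, the stub executors) is hypothetical, and the stubs stand in for real model API calls, which are not reproduced. The gates are deliberately simple heuristics, not the author's actual criteria.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Callable, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Example:
    prompt: str
    response: str
    source: str  # name of the executor that produced the response


# A gate returns True when an example is good enough to keep.
Gate = Callable[[Example], bool]
# An executor maps a topic to a generated response (hypothetical stand-in for an API call).
Executor = Callable[[str], str]


def min_length_gate(min_chars: int) -> Gate:
    """Reject responses that are too short to be useful training data."""
    def gate(ex: Example) -> bool:
        return len(ex.response.strip()) >= min_chars
    return gate


def no_refusal_gate(ex: Example) -> bool:
    """Reject responses that look like refusals (assumed marker list)."""
    lowered = ex.response.lower()
    return not any(marker in lowered for marker in ("i cannot", "as an ai"))


def dedupe(examples: Iterable[Example]) -> Iterator[Example]:
    """Drop examples whose prompt repeats after case and whitespace normalization."""
    seen = set()
    for ex in examples:
        key = " ".join(ex.prompt.lower().split())
        if key not in seen:
            seen.add(key)
            yield ex


def run_batch(
    topics: List[str],
    executors: List[Tuple[str, Executor]],
    gates: List[Gate],
) -> List[Example]:
    """Assign topics to executors round-robin, then keep only examples passing every gate."""
    raw = [
        Example(topic, generate(topic), name)
        for topic, (name, generate) in zip(topics, cycle(executors))
    ]
    return [ex for ex in dedupe(raw) if all(gate(ex) for gate in gates)]


# Stub executors so the sketch runs offline; real ones would call a model API.
def stub_executor_a(topic: str) -> str:
    return f"Worked answer for {topic}: state the rule, apply it, then verify the result."


def stub_executor_b(topic: str) -> str:
    return "I cannot help with that."


topics = ["contract clause review", "Contract  Clause review", "invoice matching", "claims triage"]
kept = run_batch(
    topics,
    [("executor_a", stub_executor_a), ("executor_b", stub_executor_b)],
    [min_length_gate(40), no_refusal_gate],
)
print(len(kept), "of", len(topics), "examples kept")
```

With these stubs, the duplicate prompt and the refusal are filtered out, leaving two of the four examples. In a real pipeline the gates would more likely be model-graded checks than string heuristics, but the shape stays the same: generation and filtering are separate steps, and each batch is filtered before it joins the dataset.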