ZeroGPU launches specialized small language models (SLMs) for ad tech tasks, offering lower costs and faster performance than large language models. The SLMs run on CPUs, and early adopter Dappier has already cut its expenses by 50%.
The author introduces a Fetch API for RAG and web ingestion that returns page labels (dead link, content category, page structure) to help filter low-value pages before indexing. They seek feedback on what additional fields would be useful.
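As a rough illustration of how such labels might be used to filter pages before indexing, here is a minimal Python sketch. The summary does not give the API's actual response schema, so the `PageLabels` record, its field names (`dead_link`, `content_category`, `page_structure`), and the example category and structure values are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List


# Hypothetical label record. The field names mirror the labels mentioned
# in the summary, but the real API's schema is not known here.
@dataclass
class PageLabels:
    url: str
    dead_link: bool
    content_category: str
    page_structure: str


# Assumed examples of low-value labels; a real pipeline would tune these.
LOW_VALUE_CATEGORIES = {"login", "error", "spam"}
LOW_VALUE_STRUCTURES = {"empty", "listing"}


def should_index(page: PageLabels) -> bool:
    """Return True if the page passes all label-based filters."""
    if page.dead_link:
        return False
    if page.content_category in LOW_VALUE_CATEGORIES:
        return False
    return page.page_structure not in LOW_VALUE_STRUCTURES


def filter_for_indexing(pages: Iterable[PageLabels]) -> List[str]:
    """Keep only the URLs of pages worth sending to the indexer."""
    return [p.url for p in pages if should_index(p)]


pages = [
    PageLabels("https://example.com/guide", False, "documentation", "article"),
    PageLabels("https://example.com/gone", True, "documentation", "article"),
    PageLabels("https://example.com/signin", False, "login", "form"),
    PageLabels("https://example.com/tags", False, "documentation", "listing"),
]
kept = filter_for_indexing(pages)
print(kept)
```

Applying the filter before embedding or chunking keeps dead links and thin pages out of the index, which is the use case the summary describes.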