Has anyone here used SLMs inside agent workflows?

Reddit r/AI_Agents 05/16/26, 06:08 PM News

Summary

A user asks the community about using small/local language models within agent workflows for specific tasks like routing, classification, and extraction, and shares thoughts on whether larger models are always necessary.

I’m curious if anyone here is actually using small/local language models as part of agent systems. Not necessarily as the main “brain” of the agent, but for specific parts of the workflow, like routing, classification, extraction, summarization, tool selection, validation, memory cleanup, or simple decision steps.

I keep thinking that a lot of agent flows probably don’t need a large model for every single step. Some parts feel like they could be handled by a smaller fine-tuned model, especially when the task is narrow and repetitive.

Has anyone tried this in production or in a serious project? What parts of the agent pipeline worked well with an SLM, and where did you still need a larger model? I’d love to hear real examples, even small ones.
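To make the question concrete, here is a rough sketch of the kind of split I mean: narrow steps go to a small model first, and anything open-ended (or anything the small model is unsure about) falls back to a larger one. Everything in it is made up for illustration: `StepRouter`, the task names, the 0.8 threshold, and the two stand-in model functions are assumptions, not a real library or anyone's production setup.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical task labels, borrowed from the steps listed in the post.
NARROW_TASKS = {"routing", "classification", "extraction", "validation"}

# A "model" here is any callable returning (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


@dataclass
class StepRouter:
    small: Model
    large: Model
    min_confidence: float = 0.8  # assumed threshold; would need tuning per task

    def run(self, task: str, prompt: str) -> Tuple[str, str]:
        """Return (answer, name of the model that produced it)."""
        if task in NARROW_TASKS:
            answer, confidence = self.small(prompt)
            if confidence >= self.min_confidence:
                return answer, "small"
        # Open-ended task, or the small model was unsure: escalate.
        answer, _ = self.large(prompt)
        return answer, "large"


def fake_small(prompt: str) -> Tuple[str, float]:
    # Stand-in for a fine-tuned SLM: confident only on short inputs.
    if len(prompt) < 40:
        return "billing", 0.95
    return "unknown", 0.3


def fake_large(prompt: str) -> Tuple[str, float]:
    # Stand-in for a larger hosted model.
    return "large-model answer", 1.0


router = StepRouter(small=fake_small, large=fake_large)
print(router.run("classification", "refund my invoice"))
print(router.run("planning", "draft a migration plan"))
```

In a real setup the confidence value would have to come from somewhere (a classifier head, logprobs, or a separate validation step), which is part of what I’m asking about.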

Similar Articles

The power of structured workflows and small local models

Reddit r/LocalLLaMA

The author details their experience building a custom agent loop using a small local model (Qwen3.5 9B) with structured workflows and a map-reduce pattern to manage context limits, replacing Claude Code for most tasks.

Are super tiny LLMs any good?

Reddit r/singularity

Explores whether very small language models can handle casual conversations adequately, and what training factors differentiate the better ones.

When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs

arXiv cs.AI

This paper studies when end-to-end reinforcement learning training improves multi-agent LLM workflows, comparing shared-policy and isolated-policy training across different workflows, tasks, and model scales, revealing conditional tradeoffs.

The reason small-model agent stacks aren't the default has nothing to do with whether they work

Reddit r/LocalLLaMA

Small language models can match or outperform large frontier models on agentic tasks at a fraction of the cost, yet adoption lags because frontier labs have no incentive to promote them. A key concern is that small models often produce correct answers through flawed reasoning, which can be mitigated with retrieval and a verification layer.

@j_golebiowski: The next agent stack: a frontier LLM as orchestrator, fine-tuned SLMs as skills. For PII redaction, the orchestrator ne…

X AI KOLs Timeline

Describes an agent stack design where a frontier LLM orchestrates fine-tuned small language models for PII redaction, ensuring privacy by keeping raw text local.