@MaximeRivest: https://x.com/MaximeRivest/status/2055293570119065875
Summary
MaximeRivest explains DSPy's five core components—Optimizers, Signatures, LMs, Modules, and Adapters—and argues that effective AI engineering requires mastering these elements, highlighting the often-overlooked role of rendering structured outputs.
View Cached Full Text
Cached at: 05/15/26, 03:05 PM
A Simple Explanation of What DSPy Can Teach You About AI Engineering
Exactly one year ago, I tried DSPy for the first time. It felt magical. It took me a whole year of wanting to look into it before I finally sat down one morning and actually ran the example snippets in the Getting Started docs. They felt too short and magical to be “enough”—but they are enough.
Anyway, today this post is not so much about why DSPy is so magical, but rather about what DSPy is doing a bit differently that makes it so important for the future of integrating AI into our society.
Why listen to me? In the last year, while I was working for a big academic publisher, I used DSPy to build a pipeline that runs on virtually all scientific publications in the world—roughly 100 million times per week—fully releasing data analysts from the tedious task of creating custom scientific classifications. That would have cost $400K per week with ChatGPT. With vLLM, Llama 8B, Qwen embeddings, and DSPy, it cost just $50. I also built a pipeline to parse millions of scanned PDFs at human-level quality while being 10× faster. I have since moved on and am now working full-time in open-source AI engineering. I’ve made several DSPy community libraries and am now a contributor to DSPy. Just this morning I pushed my first PR to DSPy, where we’re taking the first step toward formalizing DSPy’s contract between its five key components. Those five components are what I want to teach you about.
Optimizers, Signatures, LMs, Modules, and Adapters
I’ve stated them with their DSPy names and in the order people tend to encounter them.
-
Optimizers: Automatically change your prompts and/or model weights to improve performance on an eval.
-
Signatures: A high-level way to specify input and output names and types so the details can be left to automatic optimization.
-
LM: The connection between DSPy and the outside world—that’s where tokens are generated.
-
Modules: Where programming, inference strategies, and several LLM calls can be put together into a compute graph, working together as one system (a compound AI system).
-
Adapters: Where task-independent, type- and structure-related inference strategies live. These render the task inputs and the optimized instructions into text prompts and request parameters.
Any effective AI programming will need these components. Many AI frameworks have several of them; few (if any, other than DSPy) have all of them. My favorite—and the one that is most underappreciated—is the adapters.
Let’s rename them in more general terms. The work of an AI engineer will be about:
-
Evals: Evaluating and improving
-
Interface: Defining your task, its inputs and outputs at the highest level
-
Inference: Making your pipeline run on different providers and models
-
Call Graph: Considering how you decompose the task (if you do), what you do with AI, what you do in code or traditional ML, whether you’re using reasoning, whether you’re using tools
-
Rendering: How you render, format, and parse the domain-specific prompt and input/output types into the actual complete request
Rendering
That is probably the least obvious part to most readers, so let’s start here.
Rendering is about how you render your instructions and inputs to the model and how you instruct the model to render its output. The two often go together. If you tell the model to use XML tags, you’ll use XML tags in your prompt. The same goes for JSON and custom delimiters.
When you decide to ask for structured output using XML tags, you are using an inference strategy. That inference strategy is independent of your task—it’s about how you will render your prompt to show to the model and how you ask it to render its output so you can parse it.
To get structured output, XML is just one of many options. Alternatives include: JSON, native structured outputs, custom delimiters, BAML, CSV, and many more.
Structured output is only one axis of rendering. How you render reasoning, images, tool calls, videos, PDFs, and citations—these are all rendering-related, task-independent inference strategies you need to make. You can keep it simple and just use whatever is “native” from the provider, but that is rarely the best option. It’s just delegating the decision to them.
For example, JSON tool calling is the default now, but there are many other (often superior) ways of rendering a request for tool usage. You could parse and run all Markdown code cells that start with #!run. You could parse and run text inside <toolcall></toolcall> delimiters, etc.
For PDFs, you could extract the text with traditional OCR and provide an image of the document. You could provide just the text, just the image, or the binary (probably with low success), etc.
For images, if it’s like a logo, you could turn it into SVG and provide just the SVG. You could do two steps: a model that describes it, then a model that receives just the description. You could lower the resolution or tile multiple images together into one, etc.
For reasoning, you could use <thinking></thinking> at the top of the document. You could require the model to have a #REASONING: comment before any lines of code. You could have thinking tags throughout the outputs, etc.
This is simple. It’s done for you if you’re not doing it yourself. The three biggest recent advancements from the big AI providers were all related to rendering: reasoning, structured outputs, and tool calls.
Call Graph
Decomposing a task into many sub-calls to the LLM and delegating each to the appropriate model is one of the most effective ways to change the cost, performance, and latency profile of your AI pipeline.
You can call the same model many times. You can use specialized models (guards). You can call the best models and combine their responses. You can have a task done in many different languages and programs and take the majority response. You can have “specialized” model personas, each focusing on different elements. You can mix AI calls with code and traditional programming.
This is all done inside a module, and you should have an end-to-end way of calling it that is independent of your decomposition. These are compound AI systems—and they are powerful.
Inference
You will need to shop around and evolve. Open-source and commercial models are released pretty much daily now. You need all of your work on prompts, rendering, and call graphs to be easily plug-and-play with any provider and model.
The most effective way to do that is to target one specific universal format for your AI request, then map that format once to all the providers and models you want to try, and map their responses back into a universal format that your pipeline can parse, evaluate, render, etc.
Interface
To be useful and impactful, your AI program needs to interface with the world. It needs to be called by an app. It needs to run daily on some data stream, etc. That interface needs to be stable—because it is your true task.
You have to keep that separate and abstracted away from all the hacking, fiddling, optimizing, decomposing, and rendering you’re doing underneath to reach a satisfactory cost, performance, and accuracy profile. Define your system’s signature once, then fiddle inside it.
Evals
None of the above means anything if you’re not trying to improve your performance. You need to evaluate your work.
Don’t build big, beautiful evals too early, though. On many tasks, a single obvious example won’t even work. Once you’re making your program go from zero to a few working examples, just evaluate by hand: interact, look at your data and traces. Is there a bug in your rendering? In your request to the language models? In your parsing? Etc.
After that, make a small dataset—that’s enough to run automatic prompt optimization. Then run it in production, collect your inputs and outputs so you have a real data distribution, and maybe you’ll even have enough for fine-tuning!
Conclusion
AI engineering has five important components. For any given task, a subset of these will be more important to focus on, but they are all always there—you might just be delegating the decisions to others and to circumstances.
DSPy lets me geek out on any one of them without worrying too much about the others, and it lets all of us share best practices and general solutions to those problems.
Similar Articles
@DSPyOSS: indeed it's all just signatures (specs), modules ("harnesses", "inference scaling"), and optimizers (learning algorithm…
A post reflecting on the DSPy framework's architecture built around signatures, modules, and optimizers, and noting its continued growth since 2022.
@MaximeRivest: At first glance: > Structural Equation Modeling (SEM/Path Analysis) > Neural Ordinary Differential Equations (Neural OD…
The author compares Structural Equation Modeling, Neural ODEs, and AI Programs like DSPy as declarative frameworks for defining and optimizing computational graphs, arguing that structured flows are essential for trustworthy AI agents.
@MaximeRivest: Compound AI System for Images are way under appreciated. We need gepa, dspy, autoresearch style optimization to go from…
Maxime Rivest argues that compound AI systems for images are undervalued and suggests leveraging optimization frameworks like DSPy and GEPA to automate pipeline creation involving SAM and classifiers.
A developer shares insights on how to maximize AI agent capabilities, arguing that simpler setups and understanding core principles are more effective than complex harnesses and libraries.
A developer shares insights on how to maximize AI agent capabilities, arguing that simpler setups and understanding core principles are more effective than complex harnesses and libraries.
@dosco: use perplexity, parallel, google, x search whatever and build this in 5 minutes using DSPy+RLM (ax-agent) http://axllm.…
Ax is an open-source TypeScript library that implements DSPy-style typed signatures and agent frameworks for building reliable AI applications with minimal prompting. It supports multiple LLM providers and includes features like agents, flows, RAG, and self-improving pipelines.