Discusses the unsolved pain points in shipping AI agents to production and explores the idea of an agent marketplace where discrete units of work are sold, with standardized I/O and shared evaluations.
What's your biggest unsolved pain in shipping agents to production?

A few engineer friends and I have been kicking around the idea of an agent marketplace. Basically a place where users (and eventually other agents) can buy discrete units of work from specialized agents. Before we sink real time into building it, we want to make sure the problems we think it solves are actually problems people have. Here's what's been bugging us, plus stuff we keep hearing from others.

First, composing agents across different vendors or frameworks is a mess. Schemas don't line up, errors mean different things in different systems, and there's no shared idea of what it even means for a sub-task to have succeeded.

Second, discovery is rough. If I want an agent that's genuinely good at, say, parsing messy invoices or doing a legal redline, my options are reading blog posts or DMing founders. There's nothing like npm or RapidAPI for agentic work. Plenty exists for tools, nothing for the work itself.

Third, the pricing model feels off. Per-token billing has nothing to do with what the buyer actually cares about. "Review this contract" is a unit of work. "3.2 million tokens" isn't.

Fourth, there's no good way to tell if Agent A is actually better than Agent B at a given task without paying to find out. Every vendor claims they're great. No shared evals.

Our hypothesis is that a marketplace where work is sold as actual units (per task, per outcome, per SLA), with shared eval harnesses and standardized I/O, would chip away at all four.

A few questions we'd love thoughts on:

Which of those four hits closest to home, and which feels overblown?

Anything we're missing? We have a feeling orchestration and state handoff between agents is bigger than we're giving it credit for, but we're not sure.

If you've tried building on top of someone else's agent and given up, what was the moment you decided to do it yourself instead?

Happy to go deeper in the thread or in DMs.
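To make "standardized I/O" and "work sold as units" a bit more concrete, here is a rough sketch of what a shared task envelope might look like. None of this exists yet: the names `TaskRequest`, `TaskResult`, `Status`, `should_retry`, and `billable` are hypothetical, and the rule of charging only for succeeded tasks is just one assumption about how per-task pricing could work.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class Status(str, Enum):
    """Hypothetical shared vocabulary for how a sub-task ended."""
    SUCCEEDED = "succeeded"
    FAILED_RETRYABLE = "failed_retryable"   # e.g. timeout or rate limit
    FAILED_PERMANENT = "failed_permanent"   # e.g. input the agent cannot handle


@dataclass
class TaskRequest:
    task_type: str              # unit of work, e.g. "contract_review"
    payload: Dict[str, Any]     # task-specific input
    max_price_usd: float        # per-task price cap instead of a token budget


@dataclass
class TaskResult:
    status: Status
    output: Dict[str, Any] = field(default_factory=dict)
    price_usd: float = 0.0      # price the agent quotes for this task
    error: Optional[str] = None


def should_retry(result: TaskResult) -> bool:
    """Callers retry only on failures every vendor agrees are transient."""
    return result.status is Status.FAILED_RETRYABLE


def billable(request: TaskRequest, result: TaskResult) -> float:
    """Assumed pricing rule: pay only for succeeded tasks, never above the cap."""
    if result.status is not Status.SUCCEEDED:
        return 0.0
    return min(result.price_usd, request.max_price_usd)
```

With an envelope like this, a caller could chain agents from different vendors and branch on `status` alone, without learning each vendor's error conventions.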
This article discusses the emerging trend of AI agents acting as software buyers, highlighting the lack of standardized protocols for agent-friendly product evaluation, pricing discovery, and autonomous checkout. It points to three existing approaches (llms.txt, MCP servers, and agent checkout protocols) and questions whether companies are prepared for non-human traffic.
A year-long reflection on the hard parts of shipping AI agents for real service businesses, highlighting that infrastructure and edge cases matter more than the AI layer.
A discussion about deploying multi-agent AI systems in production, where different agents handle planning, execution, communication, and project management, asking about real-world experiences and bottlenecks.
The team behind Boids shares their realization that building a useful agent marketplace requires helping creators turn their expertise into reusable agents, not just curating existing ones.
Cosmic Agent Marketplace offers a marketplace of AI agents for team productivity, integrated directly into the Cosmic content management system.