@ThisisHan1_: Recently built a development pipeline and wanted to share the thinking behind it. I was inspired by loop/goal engineering and auto-goal (letting the agent write its own goals and spawn subtasks). But what really clicked for me was...


Summary

This development pipeline first creates a rough prototype to elicit user feedback, turning every 'that's not right' reaction into a checkable rule. An AI agent then builds against those rules one at a time, and a separate agent that took no part in the build verifies each item, so problems surface early and the builder cannot simply declare itself done.


Cached at: 06/17/26, 01:57 PM

Recently, I built a development pipeline and wanted to share the thinking behind it. I was inspired by the loop/goal engineering and auto-goal lineage (letting the agent write its own goals and spawn subtasks). But what really made it click was something about myself: my taste is the “I know it when I see it” type, not the “I can describe it” type. Asked cold what I want, I can’t articulate it, but show me an ugly prototype and I instantly know what’s wrong.

So this pipeline works in reverse: first throw out an ugly prototype to provoke my reaction → write down each “no, I want…” as a checkable rule → turn those rules into acceptance criteria for the agent to build against one by one → and have each item independently verified by another agent that wasn’t involved in building it, so the builder can’t just declare itself done.

The alternative: you say “build me X”, the agent runs all the way to the end, and gives you something polished but not what you wanted — and you only find out at the most expensive moment (the end). This pipeline moves that “finding out it’s wrong” moment to the very beginning, when it’s cheapest, and never lets the agent cheat or deceive itself.

Both inspirations are worth reading: Lance Martin on how long-running agents slack off, get narcissistic, and drift off course; Addy Osmani on how to extract recurring corrections from a session into reusable rules.

http://github.com/H1an1/idea-to-shipped…

H1an1/idea-to-shipped

Source: https://github.com/H1an1/idea-to-shipped

idea-to-shipped

A taste-driven build pipeline for Claude Code — a package of six skills that take a half-formed idea all the way to a self-running build, without asking you to write a spec, and without losing the parts of “what you want” that you can only recognize once you see them. It’s built on one fact about how people actually work: your taste shows up when you react to something concrete, not when you’re asked to describe it in advance. So the pipeline never makes you specify everything up front. It makes you react, and it writes down what your reactions reveal.

Formerly roadmap-to-loop — that skill is now just one stage of the larger pipeline.

The pipeline

idea
↓
1. idea-to-prototype → an ugly, fast mock-up to react to. You say "no, not that"; it learns.
↓
2. prototype-to-roadmap → once the picture's firm, it slices it into a buildable, ordered plan.
↓
3. roadmap-to-loop → turns the plan into LOOP.md: what "done" means, and who checks it.
↓
4. loop-to-goal → the agent writes its own goals and builds, unattended, checking itself against your criteria, stopping only when it needs your eye.
↓
shipped

taste-to-rubric → runs underneath all of it. Every time you react to something (the first mock-up, a mid-build check, final sign-off), it captures the preference into RUBRIC.md so nobody has to ask you twice.

idea-to-shipped → the front door: it routes you to the right stage by what you already have, and protects the principles the pipeline is built on.

RUBRIC.md — your taste, written down

Each project grows one file, RUBRIC.md. Every “no, not that” that turns out to be a standing preference (not a one-off tweak) gets distilled into one checkable line, tagged revealed (you only knew it on sight) or known. Over a project it becomes the rulebook the agent checks itself against; rules graduate from “only the owner can judge” to “a verifier subagent can judge” over time. Over many projects, those rulebooks can be compared to distill the constants of your taste — but that’s later, once a few exist.
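
To make the idea of a rule concrete, here is a minimal Python sketch of what one RUBRIC.md entry might carry. Only the tags (revealed / known) and the two judging levels (owner / verifier subagent) come from the description above; the class names, the `capture` helper, and the rendered line format are hypothetical and are not the repository's actual file format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Origin(Enum):
    REVEALED = "revealed"  # the owner only knew it on sight
    KNOWN = "known"        # the owner could have stated it up front


class Judge(Enum):
    OWNER = "owner"        # only the owner can judge this rule
    VERIFIER = "verifier"  # a verifier subagent can judge this rule


@dataclass
class Rule:
    text: str
    origin: Origin
    judge: Judge = Judge.OWNER  # assumption: new rules start owner-judged

    def graduate(self) -> None:
        """Hand the rule over to a verifier subagent."""
        self.judge = Judge.VERIFIER

    def to_line(self) -> str:
        """Render the rule as one line (this line format is invented)."""
        return f"- [{self.origin.value}] [{self.judge.value}] {self.text}"


def capture(reaction: str, standing: bool, origin: Origin) -> Optional[Rule]:
    """Keep a reaction only if it is a standing preference, not a one-off tweak."""
    if not standing:
        return None
    return Rule(text=reaction.strip(), origin=origin)


rule = capture("Buttons never use all caps", standing=True, origin=Origin.REVEALED)
if rule is not None:
    print(rule.to_line())
```

The point of the sketch is the filter in `capture`: a one-off tweak never becomes a rule, and a rule that does get written starts out as something only the owner can judge.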

How to use it

You don’t pick a stage. Just say what you want — “I want something that…” — and the front door (idea-to-shipped) figures out where you are and starts you at the right place. A vague idea starts at the mock-up; if you already have a firm concept or a plan, it jumps in further down. Your job, start to end, is small and only yours: react (at the first mock-up) and judge (at the gates the agent stops you for). Everything in between is the machine’s.
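
The routing described above can be pictured as a small decision function. This is only an illustrative sketch: the real front door is a Claude Code skill driven by its description, not Python, and the three boolean inputs are hypothetical stand-ins for "what you already have". Only the stage names are taken from the pipeline.

```python
def route(has_firm_concept: bool = False,
          has_roadmap: bool = False,
          has_loop_md: bool = False) -> str:
    """Return the stage to start at, given what already exists."""
    # Check the furthest-along artifact first, so the user jumps in
    # as far down the pipeline as their existing material allows.
    if has_loop_md:
        return "loop-to-goal"
    if has_roadmap:
        return "roadmap-to-loop"
    if has_firm_concept:
        return "prototype-to-roadmap"
    # A vague idea starts at the mock-up.
    return "idea-to-prototype"


print(route())                  # vague idea
print(route(has_roadmap=True))  # plan already written
```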

Install

These are Claude Code (https://claude.com/claude-code) skills. Copy the ones you want into your personal skills directory:

git clone https://github.com/H1an1/idea-to-shipped.git
mkdir -p ~/.claude/skills
cp -R idea-to-shipped/skills/* ~/.claude/skills/

Each skill is a self-contained skills/<skill>/SKILL.md; its description controls when Claude reaches for it. You can also drop them into a project’s .claude/skills/ instead, to scope them to one repo.

The skills

| Skill | What it does |
| --- | --- |
| idea-to-shipped | the front door: routes you to the right stage |
| idea-to-prototype | vague idea → cheapest rough prototype that provokes your taste |
| prototype-to-roadmap | firm concept → sliced, ordered roadmap with your rules loaded in |
| roadmap-to-loop | roadmap → LOOP.md with acceptance criteria tagged machine / verifier / you |
| loop-to-goal | LOOP.md → unattended build that writes its own goals and verifies itself |
| taste-to-rubric | any reaction, anywhere → a written rule in RUBRIC.md |
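
As a rough illustration of the machine / verifier / you tagging and of the rule that a builder cannot sign off its own work, the following Python sketch models acceptance criteria and a sign-off check. The tag names come from the table above; the `Criterion` class, the agent ids, and both functions are assumptions made for the example and do not reflect how LOOP.md is actually parsed or enforced.

```python
from dataclasses import dataclass
from typing import List, Optional

CHECKERS = {"machine", "verifier", "you"}  # tags named in the skills table


@dataclass
class Criterion:
    text: str
    checker: str                     # one of CHECKERS
    passed_by: Optional[str] = None  # id of whoever signed it off


def sign_off(criterion: Criterion, signer: str, builder: str) -> None:
    """Record a sign-off, refusing one from the agent that built the item."""
    if criterion.checker not in CHECKERS:
        raise ValueError(f"unknown checker tag: {criterion.checker}")
    if signer == builder:
        raise PermissionError("the builder cannot declare its own work done")
    criterion.passed_by = signer


def needs_owner(criteria: List[Criterion]) -> List[Criterion]:
    """Criteria that should pause the loop until the owner has looked."""
    return [c for c in criteria if c.checker == "you" and c.passed_by is None]


criteria = [
    Criterion("Unit tests pass", checker="machine"),
    Criterion("Empty state shows a hint", checker="verifier"),
    Criterion("The landing page feels calm", checker="you"),
]
sign_off(criteria[1], signer="verifier-1", builder="builder-1")
print([c.text for c in needs_owner(criteria)])
```

Keeping the signer's identity on each criterion is one simple way to make independent verification checkable after the fact; other designs, such as separate logs per agent, would work as well.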

What’s intentionally not built yet

  • The cross-project master rubric (needs several projects’ RUBRIC.md to compare).
  • Auto-loading rules back in / scoring — its shape will reveal itself once real rules pile up.

These come with mileage, not design. Run the pipeline on real things first; the next pieces will show you their shape.
