@jholtdigital: A friend encouraged me recently if I want to know how well something works for my use case, I should try it for a month…

X AI KOLs Following 06/20/26, 02:36 PM Tools

ai-tool long-horizon-tasks factory-ai flutter code-generation agent-orchestration design-system

Summary

A user shares their experience running a FactoryAI Mission to convert a design system from HTML/CSS to Flutter widgets with E2E testing in Patrol. The tool uses an orchestrator, workers, and validators, backed by multiple AI models, to plan and execute long-horizon work. This run took 79 hours of active time and spawned 229+ agents.

Original Article

Cached at: 06/23/26, 09:54 PM

A friend recently told me that if I want to know how well something works for my use case, I should try it for a month. So I gave this a shot with another AI tool I’ve been wanting to explore.

I ran my first @FactoryAI Mission, a very cool idea from the Factory/Droid team. They built a structure for this kind of long-horizon task: an orchestrator plans, asks clarifying questions, creates milestones, plans around your codebase constraints, and executes the long-horizon work. It’s a much more structured system than the /goal commands I’ve used with other apps.
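To make that structure concrete, here is a minimal sketch of a milestone loop in which each feature goes to a worker and then to a validator, with a retry on rejection. This is not Factory's implementation or API. `Feature`, `Milestone`, `run_mission`, and `max_attempts` are hypothetical names, the worker and validator are plain stub functions, and the planning and clarifying-question steps are left out (the plan is passed in ready-made).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Feature:
    name: str
    done: bool = False


@dataclass
class Milestone:
    name: str
    features: List[Feature] = field(default_factory=list)


def run_mission(
    milestones: List[Milestone],
    worker: Callable[[Feature], str],
    validator: Callable[[Feature, str], bool],
    max_attempts: int = 3,  # hypothetical retry limit, not from the post
) -> List[Tuple[str, str, int]]:
    """Run every feature through a worker, then a validator.

    Returns (milestone, feature, attempts_used) for each accepted feature.
    """
    log = []
    for milestone in milestones:
        for feature in milestone.features:
            for attempt in range(1, max_attempts + 1):
                result = worker(feature)
                if validator(feature, result):
                    feature.done = True
                    log.append((milestone.name, feature.name, attempt))
                    break  # accepted; move on to the next feature
    return log


# Stub worker and validator: the validator rejects the very first result.
calls = {"n": 0}


def stub_worker(feature: Feature) -> str:
    calls["n"] += 1
    return f"{feature.name}-v{calls['n']}"


def stub_validator(feature: Feature, result: str) -> bool:
    return not result.endswith("-v1")


plan = [Milestone("theme", [Feature("colors"), Feature("buttons")])]
demo_log = run_mission(plan, stub_worker, stub_validator)
print(demo_log)
```

In this sketch a rejected result is simply retried by the same worker. A real system would presumably feed the validator's feedback back into the next attempt.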

Some stats from this run:

- Active time: 79h 2m
- Avg per milestone: 5.3h
- Avg per feature: 30.2m
- 229+ agents spawned, not counting the subagents they spawned (subagents for reviewing the code in areas like simplicity and architectural fit)
- 158 tasks
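As a rough sanity check on these numbers, the averages can be turned back into approximate counts. This assumes each average is simply total active time divided by the number of milestones or features, which the post does not state, so treat the results as estimates only.

```python
# Figures quoted in the stats above.
active_minutes = 79 * 60 + 2        # "Active time: 79h 2m"
avg_milestone_minutes = 5.3 * 60    # "Avg per milestone: 5.3h"
avg_feature_minutes = 30.2          # "Avg per feature: 30.2m"

# Assumption: average = total active time / count.
implied_milestones = active_minutes / avg_milestone_minutes
implied_features = active_minutes / avg_feature_minutes

print(f"implied milestones: {implied_milestones:.1f}")
print(f"implied features:   {implied_features:.1f}")
```

Under that assumption the run works out to roughly 15 milestones and about 157 features, which is close to the 158 tasks reported. Whether "feature" and "task" mean the same thing is not clear from the post.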

This specific task was taking a design system I had prototyped and fine-tuned in HTML/CSS and converting it to Flutter widgets with E2E testing in Patrol. It replaced the existing basic Material Design theme and widgets. The agent first had to capture a number of screenshots of the HTML prototype before planning the implementation. The orchestrator's initial plan was created by Fable 5, but after plan creation the orchestrator was either Opus 4.8 or, more often, GPT 5.5 xhigh. Workers were Composer 2.5, some GLM 5.2, and some Kimi K2.6/K2.7. Validators were almost exclusively GPT 5.5.

I’ve never seen a system like this. It’s very impressive. I need to spend the next couple of weeks doing a lot of refinements. The system is complete and working well, but there are a number of UI mistakes where the result does not match the prototype. I think this would be the case no matter what, though. Agents can get you about 80% of the way there, and the last 20% of refinements is where a big chunk of the work is. Super impressed! Maybe I can do a longer write-up if people are interested in more details.
