@jholtdigital: A friend encouraged me recently if I want to know how well something works for my use case, I should try it for a month…


Summary

A user describes running a FactoryAI Mission to convert a design system from HTML/CSS to Flutter widgets with E2E testing in Patrol. The tool uses an orchestrator, workers, and validators running on multiple AI models to plan and execute a long-horizon task. This run logged 79h 2m of active time and spawned 229+ agents.


Cached at: 06/23/26, 09:54 PM

A friend recently told me that if I want to know how well something works for my use case, I should try it for a month. So I gave this a shot with another AI tool I’ve been wanting to explore.

I ran my first @FactoryAI Mission, a very cool idea from the Factory/Droid team. They built a structure for this kind of long-horizon task: an orchestrator asks clarifying questions, creates milestones, plans around your codebase constraints, and executes the work. It’s a much more structured system than the /goal commands I’ve used with other apps.

Some stats from this run:

  • Active time: 79h 2m
  • Avg per milestone: 5.3h
  • Avg per feature: 30.2m
  • 229+ agents spawned, not counting the subagents they spawned in turn (for reviewing the code in areas like simplicity and architectural fit)
  • 158 tasks
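
As a rough sanity check on these numbers, the averages can be divided into the active time to see how many milestones and features they imply. This is only a sketch: it assumes both averages were computed against the 79h 2m active time, which the stats do not state, and the averages are themselves rounded.

```python
# Back-of-the-envelope check using only the figures quoted above.
# Assumption: both averages are relative to the total active time.
ACTIVE_MINUTES = 79 * 60 + 2        # "Active time: 79h 2m"
AVG_MILESTONE_MINUTES = 5.3 * 60    # "Avg per milestone: 5.3h"
AVG_FEATURE_MINUTES = 30.2          # "Avg per feature: 30.2m"


def implied_count(total_minutes: float, avg_minutes: float) -> float:
    """Number of items implied by a total duration and a per-item average."""
    return total_minutes / avg_minutes


implied_milestones = implied_count(ACTIVE_MINUTES, AVG_MILESTONE_MINUTES)
implied_features = implied_count(ACTIVE_MINUTES, AVG_FEATURE_MINUTES)

print(f"implied milestones: {implied_milestones:.1f}")
print(f"implied features:   {implied_features:.1f}")
```

Under that assumption the run works out to roughly 15 milestones and roughly 157 features. That is close to the 158 tasks reported, though the stats do not say whether tasks and features are counted the same way.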

This specific task was taking a design system I had prototyped and fine-tuned in HTML/CSS and converting it to Flutter widgets with E2E testing in Patrol. It replaced the existing basic Material Design theme and widgets. The agent first had to capture a number of screenshots of the HTML prototype before planning the implementation.

The initial plan was created by the orchestrator running Fable 5, but after the plan was created the orchestrator was either Opus 4.8 or, more often, GPT 5.5 xhigh. Workers were Composer 2.5, some GLM 5.2, and some Kimi K2.6/K2.7. Validators were almost exclusively GPT 5.5.
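
To make the orchestrator / worker / validator split concrete, here is a minimal sketch of how such a loop could be wired up. It is not Factory's implementation or API: every name below (`Feature`, `Milestone`, `run_mission`, the stub worker and validator) is hypothetical, and the retry-on-rejection behavior is an assumption for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Feature:
    name: str
    done: bool = False
    attempts: int = 0


@dataclass
class Milestone:
    name: str
    features: list[Feature] = field(default_factory=list)


# Hypothetical role signatures: a worker proposes a change for a feature,
# a validator accepts or rejects that change.
Worker = Callable[[Feature], str]
Validator = Callable[[Feature, str], bool]


def run_mission(milestones, worker: Worker, validator: Validator, max_attempts=3):
    """Orchestrator loop: hand each feature to a worker, gate it on a validator."""
    log = []
    for milestone in milestones:
        for feature in milestone.features:
            # Assumed behavior: retry a rejected feature up to max_attempts times.
            while not feature.done and feature.attempts < max_attempts:
                feature.attempts += 1
                change = worker(feature)
                feature.done = validator(feature, change)
                log.append((milestone.name, feature.name, feature.attempts, feature.done))
    return log


def stub_worker(feature: Feature) -> str:
    return f"{feature.name}: attempt {feature.attempts}"


def stub_validator(feature: Feature, change: str) -> bool:
    # Reject the first attempt at "button" to exercise the retry path.
    return not (feature.name == "button" and feature.attempts == 1)


plan = [Milestone("theme", [Feature("colors"), Feature("button")])]
log = run_mission(plan, stub_worker, stub_validator)
for entry in log:
    print(entry)
```

In a real system each role would be backed by a different model, as in the run described above. The stubs here only show the control flow: the worker's output never counts as done until a separate validator accepts it.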

I’ve never seen a system like this. It’s very impressive. I need to spend the next couple of weeks doing a lot of refinements. The system is complete and working well, but there are a number of UI mistakes where the result does not match the prototype. I think this would be the case no matter what, though. Agents can get you about 80% of the way there, and the last 20% of refinements is where a big chunk of the work is. Super impressed! Maybe I can do a longer write-up if people are interested in more details.
