@jholtdigital: A friend encouraged me recently if I want to know how well something works for my use case, I should try it for a month…
Summary
A user shares their experience using FactoryAI to convert a design system from HTML/CSS to Flutter widgets with E2E testing. The tool uses an orchestrator, workers, and validators, each backed by different AI models, to plan and execute long-horizon tasks. This run had about 79 hours of active time and spawned more than 229 agents.
View Cached Full Text
Cached at: 06/23/26, 09:54 PM
A friend recently told me that if I want to know how well something works for my use case, I should try it for a month. So I gave this a shot with another AI tool I’ve been wanting to explore.
I ran my first @FactoryAI Mission, a very cool idea from the Factory/Droid team. They built a structure for this kind of long-horizon task: an orchestrator asks clarifying questions, creates milestones, plans around your codebase constraints, and then executes the work. It’s a much more structured system than the /goal commands I’ve used with other apps.
Some stats from this run:
- Active time: 79h 2m
- Avg per milestone: 5.3h
- Avg per feature: 30.2m
- 229+ agents spawned, not counting the subagents those agents spawned themselves (for reviewing the code in areas like simplicity and architectural fit)
- 158 tasks
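As a rough sanity check on these figures, here is a small sketch that works out the counts they imply. It assumes each average is simply active time divided by a count; the post does not say how the averages were computed, so treat the variable names and the formula as illustrative.

```python
# Figures quoted in the post.
active_minutes = 79 * 60 + 2      # 79h 2m of active time
avg_milestone_hours = 5.3         # average per milestone
avg_feature_minutes = 30.2        # average per feature

# Assumption: average = active time / count, so count = active time / average.
implied_milestones = active_minutes / 60 / avg_milestone_hours
implied_features = active_minutes / avg_feature_minutes

print(f"implied milestones: {implied_milestones:.1f}")  # 14.9
print(f"implied features:   {implied_features:.1f}")    # 157.0
```

Under that assumption the run had roughly 15 milestones and about 157 features, which lines up with the 158 tasks reported.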
This specific task was taking a design system I had prototyped and fine-tuned in HTML/CSS and converting it to Flutter widgets, with E2E testing in Patrol. It replaced the existing basic Material Design theme and widgets. The agent first had to capture a number of screenshots of the HTML prototype before planning the implementation. The orchestrator's initial plan was created by Fable 5, but after the plan was in place the orchestrator was either Opus 4.8 or, more often, GPT 5.5 xhigh. Workers were Composer 2.5, some GLM 5.2, and some Kimi K2.6/K2.7. Validators were almost exclusively GPT 5.5.
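To make the orchestrator, worker, and validator roles concrete, here is a minimal sketch of how such a loop could be wired together. It is not Factory's implementation or API: every name here (`Feature`, `Milestone`, `run_mission`, the toy worker and validator) is hypothetical, and the retry-on-rejection behavior is an assumption rather than something the post describes.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical types for illustration; none of these names come from Factory.
@dataclass
class Feature:
    name: str
    done: bool = False
    attempts: int = 0

@dataclass
class Milestone:
    name: str
    features: list[Feature] = field(default_factory=list)

Worker = Callable[[Feature], str]            # produces output for a feature
Validator = Callable[[Feature, str], bool]   # accepts or rejects that output

def run_mission(milestones: list[Milestone], worker: Worker,
                validator: Validator, max_attempts: int = 3) -> int:
    """Orchestrator loop: send each feature to a worker, then to a validator,
    retrying (an assumed behavior) until accepted or out of attempts.
    Returns the number of worker and validator agents spawned."""
    agents_spawned = 0
    for milestone in milestones:
        for feature in milestone.features:
            while not feature.done and feature.attempts < max_attempts:
                feature.attempts += 1
                output = worker(feature)                    # one worker agent
                agents_spawned += 1
                feature.done = validator(feature, output)   # one validator agent
                agents_spawned += 1
    return agents_spawned

# Toy stand-ins: the validator rejects every first attempt.
def toy_worker(feature: Feature) -> str:
    return f"{feature.name} v{feature.attempts}"

def toy_validator(feature: Feature, output: str) -> bool:
    return feature.attempts >= 2

plan = [
    Milestone("theme", [Feature("colors"), Feature("typography")]),
    Milestone("widgets", [Feature("button")]),
]
spawned = run_mission(plan, toy_worker, toy_validator)
print(spawned)  # 12: three features, two attempts each, two agents per attempt
```

In a real system the worker and validator would be calls to different models, which is where a split like the one described above (cheaper models as workers, a stronger model as validator) would plug in.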
I’ve never seen a system like this. It’s very impressive. I need to spend the next couple of weeks doing a lot of refinements. The system is complete and working well, but a number of UI details do not match the prototype. I think that would be the case no matter what, though. Agents can get you about 80% of the way there, and the last 20% of refinements is where a big chunk of the work is. Super impressed! Maybe I can do a longer write-up if people are interested in more details.