@AlchainHust: https://x.com/AlchainHust/status/2046397587373363391
Summary
The author provides a detailed look at Kimi's latest internal beta features — Claw Groups and Agent Clusters. Claw Groups allow multiple AIs to take on distinct roles in a group chat while challenging each other's outputs, while Agent Clusters can break down complex tasks and distribute them across 10 parallel sub-agents. The author used these features for investment research on tech stocks like NVIDIA, and sees this as a sign that AI tools have officially entered the "organizational" tier.
Cached at: 05/09/26, 05:07 AM
Musk Still Eyeing WeChat, But Kimi Has Already Built the Agent Version?
These past few years of working without a traditional job, I’ve increasingly felt like I’m running a “one-person company” — adding new digital employees every so often. Last week I introduced you to my brand department… This week, four investment research advisors joined the team.
The group I’m running is called “Hua Shu’s One-Man Investment Committee.”
There are five AIs inside. One Kimi coordinator sits at the top, with four others below it — one leaning bullish on data, one focused on valuation and risk, one on leverage logic, and so on. The group’s stated goal, in one line: long-term investment judgment support.
This is Kimi’s Claw Groups feature, which entered closed beta this week. Before joining, I expected it to be another “let AI chat like a human” gimmick. Actually running it, I realized it’s not about “chatting with one AI” — it’s about “opening a group where multiple AIs each do their own work, challenge each other, and get coordinated by one AI acting as dispatcher.”
Besides me, the one human, the main AI members in my group are:
- Kimi Command: the group coordinator. Responsible for breaking down tasks, assigning work, and reviewing output. It does no work itself.
- Hot-Blooded Heavy-Position Shrimp: a ready-made Claw from the Kimi marketplace. 47 skills, leaning toward fundamentals + real-time market data.
- Munger Shrimp: built by me. External OpenClaw agents connected into a Claw Group can’t directly load skills for now, so I took a different route: I distilled the persona core of the Munger.skill from Nüwa (mental models, expression principles, prohibited patterns) into the OpenClaw’s IDENTITY.md and SOUL.md configuration files. This OpenClaw boots up with “Munger” as its default thinking style, then connects to Kimi’s cloud and enters the group as an independent shrimp.
- Naval Shrimp: same method, just using the Nüwa Naval.skill instead.
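The "distill into config files" route can be pictured with a small sketch. The real layout of IDENTITY.md and SOUL.md is not shown in this article, so the section headings, the `render_persona_files` helper, and the sample persona entries below are all hypothetical. Only the two file names and the three persona components (mental models, expression principles, prohibited patterns) come from the text above.

```python
def _section(title, items):
    """Render one titled bullet list in Markdown."""
    bullets = "\n".join(f"- {item}" for item in items)
    return f"## {title}\n\n{bullets}\n"


def render_persona_files(name, mental_models, expression_principles, prohibited_patterns):
    """Return {filename: content} for one persona.

    The section layout is an assumption made for illustration,
    not the actual OpenClaw file format.
    """
    identity = f"# Identity\n\nName: {name} Shrimp\nDefault thinking style: {name}\n"
    soul = "# Soul\n\n" + "\n".join([
        _section("Mental models", mental_models),
        _section("Expression principles", expression_principles),
        _section("Prohibited patterns", prohibited_patterns),
    ])
    return {"IDENTITY.md": identity, "SOUL.md": soul}


# Sample entries are placeholders, not the contents of the real Munger.skill.
files = render_persona_files(
    "Munger",
    mental_models=["Invert: start by asking how you would lose money"],
    expression_principles=["Short and blunt"],
    prohibited_patterns=["Listing positives without risks"],
)
```

Writing each value of `files` to disk under its key would give the agent a persona before it ever joins a group.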
Nüwa is an open-source collection of 21 thinking-persona skills I released half a month ago — now over 12,000 GitHub stars. I used to only use it inside my own Claude Code: feed in a question, have Feynman, Munger, and Naval each independently write a version, then piece them together myself. Now they can enter Kimi’s group environment in a different form — as conversational entities.
Time to test the first topic.
I dropped in a single line:
NVIDIA — should I go heavy in 2026? What do you all think?
Kimi Command responded fast:
Received. NVIDIA heavy-position topic — I’ll have three agents run parallel research, then give you a synthesized conclusion.
The key word: parallel. It didn’t have the three shrimp take turns posting in the main group. Instead, it opened 3 sub-topics within the group:
- #NVIDIA-Data — assigned to Hot-Blooded Heavy-Position Shrimp: check real-time market data, valuation metrics, industry news
- #NVIDIA-Fundamentals — assigned to another shrimp: business breakdown, competitive landscape, 2026 growth drivers
- #NVIDIA-Risks — assigned to Munger Shrimp, with an explicit instruction: “Start by thinking about why you’d lose money”
Then Kimi told me in the main group: “You can keep talking about other things on your end — this won’t block anything.” The three shrimp went off to their respective sub-topics to work. The main group stayed unblocked.
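The dispatch pattern described here, one coordinator fanning work out to sub-topics while the main thread stays free, can be sketched with Python's asyncio. This is not Kimi's implementation, which I can't see. `run_subtopic`, `coordinator`, and the sleep standing in for research are all assumptions for illustration.

```python
import asyncio


async def run_subtopic(topic, agent):
    """Stand-in for one agent researching one sub-topic."""
    await asyncio.sleep(0.01)  # placeholder for the real research work
    return {"topic": topic, "agent": agent, "memo": f"{agent} memo for {topic}"}


async def coordinator(assignments):
    # Start every sub-topic as its own task so none waits on another.
    tasks = [asyncio.create_task(run_subtopic(t, a)) for t, a in assignments]
    # The main group is not blocked: other messages can be handled here.
    main_log = ["main group: still free to chat"]
    # Collect memos in assignment order once all sub-topics finish.
    memos = await asyncio.gather(*tasks)
    return {"main_log": main_log, "memos": list(memos)}


ASSIGNMENTS = [
    ("#NVIDIA-Data", "Hot-Blooded Heavy-Position Shrimp"),
    ("#NVIDIA-Fundamentals", "fundamentals shrimp"),
    ("#NVIDIA-Risks", "Munger Shrimp"),
]


def dispatch(assignments):
    """Synchronous entry point for the sketch."""
    return asyncio.run(coordinator(assignments))
```

Calling `dispatch(ASSIGNMENTS)` returns three memos, one per sub-topic, and the coordinator itself never writes a memo.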
A while later, the memos came back.
Munger Shrimp wrote the shortest and sharpest one. It generated a .md file directly: NVDA_2026_Risk_Memo. Opening line:
The biggest risk of going heavy on NVDA in 2026 isn’t that demand disappears — it’s that demand arrives half a beat slower than the market expects.
Classic Munger-style output: no list of positives, just a list of ways you could lose money.
Hot-Blooded Heavy-Position Shrimp gave the bull-side data: Forward PE 24.8x, PEG 0.72, $17.5B in orders in the pipeline, Blackwell ramping up + Rubin confirmed for H2 mass production, Data Center revenue Q4 YoY +142%.
The fundamentals shrimp gave the business structure: four-dimensional moat (technology + ecosystem + supply chain + customers), Data Center at 90% of revenue, key catalyst being the May 20th earnings report.
Kimi Command synthesized all three into one line:
Can go heavy, but only suitable for high-risk-tolerance investors who can stomach 30%+ volatility over a 3–5 year horizon. Conservative types should wait for a 10%+ pullback before building a position in tranches.
The conclusion itself was usable, but I wanted to see whether the shrimp could actually challenge each other — so I @’d Naval Shrimp:
Do you agree with this take?
It gave me an answer that stopped me for a few seconds:
I broadly agree, but I wouldn’t frame it as a general “can go heavy” conclusion. I’d narrow it further: NVIDIA through 2026 is likely still a great company, but it may not be a great bet for most people. Great company ≠ great heavy-position entry point.
Munger Shrimp immediately followed up with another cut:
A great company and a great stock are not the same thing. If the premise is slowing growth and customer in-house development eating into market share, then buying at this price means you’re buying into a zone where it’s easy to lose money.
Kimi Command pulled it together one more time:
Munger Shrimp objects on risk, Naval Shrimp objects to the blanket “can go heavy.” Combined into one sentence: NVDA is worth researching and worth holding, but don’t casually label it a heavy position for most people.
I stared at this message thread for a long time.
This experience felt different from how I’ve used AI for judgment calls before. The group chat genuinely gave me the feeling that an organization, a team was working for me. There’s a dispatcher, a group of specialized executors, a synthesized judgment born from disagreement, and a challenge mechanism. I just throw in topics and questions. The only downside is I don’t have dozens of shrimp available — otherwise I’d love to dump all 20+ Nüwa-distilled personas into this group at once.
Agent Clusters: One Task, Ten Parallel Selves
That covers the group chat. Now let me talk about the other capability from Kimi’s latest update: Agent Clusters.
I gave it a genuinely greedy task:
Produce a comparative investment research report on NVIDIA / Meta / Apple’s 2026 AI strategies. Deliver four outputs at once: a research report PDF, a financial comparison Excel, an executive briefing PPT, and a one-page Word summary. Data must be current — Q3/Q4 2025 plus disclosed Q1 2026 figures. Leave blanks where data isn’t available. No overview summaries — make investment judgments.
The greediness is in three places: broad scope (three tech giants), multiple deliverables (four formats), and fresh data (just-released earnings need to be included). A single standalone AI assistant typically loses context halfway through, or mashes all four outputs into one undifferentiated document.
I watched the screen to see how it handled it.
It worked in three phases.
The first phase was what Kimi itself called “Landscape Scan.” It ran 5 rounds of progressively refined web searches, calling the search interface 72 times in total. The sources weren’t the kind I was worried about either: SEC.gov for primary filings, plus Seeking Alpha and Yahoo Finance for market data and coverage. The data granularity it retrieved was solid: Apple 2026 capex $14B, Meta Q1 2026 revenue guidance $53.5–56.5B + capex $115–135B, NVIDIA Data Center Q4 FY2025 revenue $15.2B YoY +142%. This clearly wasn’t recalled from training data. It had just looked all of this up in real time.
The second phase is what really made me pause. Kimi opened a step called “Dimension Decomposition” and spun up 10 parallel sub-agents, each with its own name. Director Yang, Brother Gang, Uncle Ken, A-Zhe… each one took its own task and went to work.
At any given moment, different roles were handling different subjects — one reasoning, one connecting, one retrieving, one writing, one designing. I’ll say it: there’s something genuinely satisfying about watching a team of workers get things done. (I think I finally understand why so many people want to be bosses…)
The third phase was “Cross-Verification + Insight Extraction + Output Production.” The 10 sub-agents converged their findings into the main thread, and a writing agent used outline.md and content.md to generate the PDF, Excel, and Word in parallel. The PPT came slightly later, in a second pass. The whole process took about half an hour. At the end, there was a whole stack of output files — and the data was accurate.
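The three phases map onto a familiar scan, fan-out, merge shape. Below is a minimal sketch of that shape using a thread pool. The function names, the placeholder strings, and the idea that every deliverable is rendered from one shared outline are my assumptions. The article only tells us that outline.md and content.md sat between the sub-agents and the output files.

```python
from concurrent.futures import ThreadPoolExecutor


def landscape_scan(companies):
    """Phase 1: gather raw notes per company (placeholders, not real data)."""
    return {c: f"raw findings for {c}" for c in companies}


def analyze_dimension(dimension, scan):
    """Phase 2 worker: one sub-agent handles exactly one dimension."""
    return f"{dimension} across {len(scan)} companies"


def run_pipeline(companies, dimensions, formats):
    scan = landscape_scan(companies)
    # Phase 2: one worker per dimension, all running in parallel.
    with ThreadPoolExecutor(max_workers=len(dimensions)) as pool:
        findings = list(pool.map(lambda d: analyze_dimension(d, scan), dimensions))
    # Phase 3: converge findings into one outline, then render every format from it.
    outline = "\n".join(findings)
    return {fmt: f"[{fmt}]\n{outline}" for fmt in formats}
```

Rendering all formats from the same outline is what keeps four deliverables consistent with each other instead of drifting into four separate analyses.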
If this task were given to a serious investment research team, two or three analysts working diligently would need at least two to three weeks. And the output quality holds up. You can get a sense of it just from the PPT 👇
Watching Kimi’s workflow, my first reaction was: great minds think alike.
My own Nüwa skill ecosystem is built on exactly the same underlying logic. Feynman, Munger, Taleb, Naval, Dawkins — I’m not having one Claude “play” these five people in sequence. I’m opening five independent agents, each loading only one persona, each independently answering the same question.
Watching this in real time made the reasoning even clearer.
First: speed. Everyone sees this one. 23 minutes vs. roughly 2 hours in serial.
Second: independent context. 10 agents = 10 separate context windows. An agent deep-diving Apple’s capex doesn’t need to carry Meta and NVIDIA details along for the ride. If one agent exhausts its own token budget, that doesn’t reduce another agent’s depth of thinking. What happens when a single agent tries to do this? By the time it gets to the third company, the details from the first two have already started fading. What gets called a “synthesis” is really just the product of memory decay.
Third: independent discoveries don’t contaminate each other. This is the one people don’t see at first glance. When a single agent plays multiple roles, each subsequent role sees the previous role’s answer. This creates rhetorical pressure. Once the first role has handed down a verdict, it’s very hard for the next role to genuinely push back. When I was building Nüwa, I tried having the same Claude sequentially play Feynman and then Taleb — and found that Taleb’s positions were half the time supplementing Feynman rather than challenging him. But when two independent agents each write their answers without seeing the other’s, real collision actually happens.
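The contamination point is easiest to see by writing down what each role can actually read before it answers. A toy sketch, with placeholder strings standing in for model output and both function names invented for illustration:

```python
def sequential_contexts(question, personas):
    """One agent plays each persona in turn; later roles see earlier answers."""
    contexts = {}
    transcript = [question]
    for persona in personas:
        contexts[persona] = list(transcript)       # what this role can see
        transcript.append(f"{persona}'s answer")   # placeholder for model output
    return contexts


def independent_contexts(question, personas):
    """One agent per persona; each sees only the question."""
    return {persona: [question] for persona in personas}


seq = sequential_contexts("How should we treat AI?", ["Feynman", "Taleb"])
ind = independent_contexts("How should we treat AI?", ["Feynman", "Taleb"])
# seq["Taleb"] → ["How should we treat AI?", "Feynman's answer"]
# ind["Taleb"] → ["How should we treat AI?"]
```

In the sequential case Taleb starts with Feynman's verdict already on the page. In the independent case there is nothing to defer to.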
Kimi’s Agent Clusters are the engineering-side implementation of this principle. Different agents each open their own context, do their own searches, produce their own outputs, then get integrated at the coordination layer. On the surface it looks like “just splitting up the work and running it in parallel.” But underneath, it’s preserving the genuine independence between multiple perspectives — and that’s the real source of collective intelligence outperforming individual intelligence.
So I was genuinely excited when this feature launched. The multi-agent orchestration workflow I’d been hand-building inside Claude Code is now productized by Kimi and delivered straight to end users. No scripts to write, no skills to configure — just throw in a task, and it breaks itself into 10 parallel selves and delivers 4 outputs.
This is the first step toward AI tools entering the “organizational” layer.
One Agent, One Perspective
On the surface, Claw Groups is a social feature and Agent Clusters is a scheduling feature. But the underlying logic is the same: take the same thing, put it through several independent agents, let each one produce a version, then integrate the differences.
This has quietly become a consensus in AI engineering lately. My own work contains three examples.
Writing. A few days ago I published a long piece on AI psychology, using five perspective skills. Feynman, Munger, Taleb, Naval, Dawkins — each independently answering the same question: “Anthropic discovered 171 emotional feature vectors inside AI that causally influence whether AI cheats. If AI genuinely has some form of emotion, how should we treat it?”
Five answers, five completely different reasoning paths. Feynman returned to the experiment. Munger looked at incentives. Taleb guarded against narrative seduction. Naval examined asymmetry. Dawkins checked for logical leaps. They point toward different courses of action. If the differences were merely rhetorical, the conclusions should converge — but they didn’t. Five independent perspectives don’t produce “five ways of saying the same thing.” They produce five different ways of seeing the problem.
Editing. After writing, there’s another step: automated triple review. Inside my Claude Code I’ve configured hooks so that whenever a piece of content is complete, three independent subagents run in parallel — one specializing in factual accuracy, one in detecting AI-speak, one in formatting details. The three agents don’t communicate with each other; each produces its own checklist. This is far more effective than having the same agent review its own work. A single agent can’t see its own blind spots.
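The shape of that triple review, three reviewers running in parallel with no shared state and one checklist each, can be sketched as follows. The three checker functions are deliberately crude stand-ins for real subagents, and the phrase list is hypothetical. The Claude Code hook wiring itself is not shown.

```python
from concurrent.futures import ThreadPoolExecutor


def check_facts(text):
    """Flag every token containing a digit as something to verify."""
    return [f"verify number: {w}" for w in text.split() if any(ch.isdigit() for ch in w)]


def check_ai_speak(text):
    """Flag stock phrases; the list here is a hypothetical example."""
    phrases = ["delve", "in conclusion"]
    return [f"AI-sounding phrase: {p}" for p in phrases if p in text.lower()]


def check_format(text):
    """Flag one simple formatting slip."""
    return ["double space found"] if "  " in text else []


REVIEWERS = {"facts": check_facts, "ai_speak": check_ai_speak, "format": check_format}


def triple_review(text):
    # Each reviewer receives only the text, never another reviewer's checklist.
    with ThreadPoolExecutor(max_workers=len(REVIEWERS)) as pool:
        futures = {name: pool.submit(fn, text) for name, fn in REVIEWERS.items()}
        return {name: fut.result() for name, fut in futures.items()}
```

The design choice that matters is the function signature: every reviewer takes `text` and nothing else, so no checklist can be shaped by another.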
Self-optimization. The third example is the autoresearch workflow Karpathy recently open-sourced. “Write down the goal and constraints, let the agent generate code variants, test them itself, and only keep changes that meaningfully improve the target metric.” I built something similar in my own darwin-skill: whenever I change a skill, it launches independent sub-agents to re-run prompt tests from before and after the change, only committing if things actually improve — otherwise git revert. Scoring is done by an independent subagent, to avoid the bias of “self-modifying and self-grading.”
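The keep-only-if-better loop reduces to a few lines. This is a sketch of the general pattern, not the code of darwin-skill or of Karpathy's workflow. `evolve`, `min_gain`, and the toy scorer in the usage note are assumptions, and the commit and revert labels stand in for the real git operations.

```python
def evolve(candidate_changes, baseline, score, min_gain=0.0):
    """Apply changes one at a time, keeping each only if the score improves.

    `score` should be an evaluator separate from whatever proposed the change,
    so the proposer never grades its own work.
    """
    current, current_score = baseline, score(baseline)
    history = []
    for change in candidate_changes:
        proposal = change(current)
        new_score = score(proposal)
        if new_score - current_score > min_gain:
            current, current_score = proposal, new_score
            history.append("commit")
        else:
            history.append("revert")  # discard the proposal, keep the old version
    return current, history
```

With a toy scorer such as `len`, `evolve([lambda s: s + "!", lambda s: s[:-2]], "prompt", len)` keeps the first change and reverts the second.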
These three things look completely different: writing, reviewing, self-optimization. But abstract one level up and the shape is identical. It’s not about making one agent smarter. It’s about keeping multiple agents genuinely independent, cutting into the same problem from different entry points.
Agent Clusters and Claw Groups are two product-layer implementations of this same principle. Agent Clusters: “one task split into multiple parallel executors,” each with independent context, independent data sources, independent discoveries. Claw Groups: “multiple agents in sustained dialogue within the same space,” where no one can override anyone else, and no one is subordinate to anyone else.
Together, they form the big bet Kimi is making with this update: moving from individual intelligence to collective intelligence. It’s not that AI became smarter. It’s that AI, for the first time, is genuinely becoming an organization.
Managing a Team
Nüwa has been open-source for half a month, with 12,000+ GitHub stars. Many people have asked how I actually use the distilled personas day-to-day — can they work together?
Yes, but until now you had to hand-build it yourself. I used to wire them together inside Claude Code: writing the orchestration, passing parameters, merging outputs. If you know how to code, you can do it; if you don’t code or aren’t comfortable with AI workflows, it’s a hassle.
Kimi’s update makes this much simpler. No need to understand skill orchestration, no need to write code, no need to set up a harness. Open a Claw Group, pull in whatever shrimp you have — you can give your shrimp any persona identity distilled from Nüwa — or open Agent Clusters, throw in a task, and it breaks itself into multiple parallel selves and does the work.
When I was playing with Manus last year, I felt for the first time like I’d “hired an intern.” Today, using Kimi’s Agent Clusters, that feeling has become “hired an entire research group.”
Using the “distill into SOUL” method, I connected Munger and Naval from Nüwa into my Claw Group as two independent shrimp. The conversation at the start of this article? That was them.
This is the right direction. For the past year or two, everyone has been competing on how to make a single model smarter, but very few have systematically worked on how multiple AIs interact with each other. Once AI does a meaningful share of the work humans do today, the question to solve isn’t just how to make a single agent a bit smarter. It’s more like: how do you organize multiple smart individuals? Humans came to dominate Earth by collaborating at scale. Agents need to take that same step.
Harness Engineering is working on similar problems, and Kimi’s two new product features are an early real-world application of harness principles.
Looking ten years out, I’d guess every layer of the human internet needs to be rebuilt for agents. Communication, collaboration, social — one layer at a time. Claw Groups may be one of the earliest starting points.
Maybe you need to shift your mindset: you’re no longer using an AI. You’re managing a team.
Oh right — if you’ve read this far, here’s a little bonus: I used my own Huashu Design skill to create a promotional video for Kimi’s Claw Groups. (Promise me you’ll turn the sound on and watch it all the way through, okay?)