we replaced single-model code review with a consensus of models. the one rule that made it actually work

Reddit r/AI_Agents Tools

Summary

The article describes replacing single-model code review with a consensus of multiple AI models, where only explicit approvals count, leading to more reliable code reviews at the cost of longer discussions.

context: small team, dogfooding on our own repos, no external users, so grain of salt. honestly the core problem with agent loops is that a single model reviewing its own work just rubber-stamps it. it writes a bug, "reviews" it, goes "looks good", merges. so we stopped trusting one reviewer and made the review a consensus instead. in practice that means every change gets checked by a few independent passes, different models and different angles, and they have to actually agree, not just fail to object. if one of them is unsure that's the signal and the change waits. disagreement is the useful part, not the noise. the one rule that made it click was dumb but load-bearing: a comment never counts as approval, only an explicit approve does. you'd be surprised how much "seems fine, one concern" was quietly acting like a yes before that. what finally sold me was an issue where the reviewers argued it out for ~31 rounds before they converged and it merged a clean two-line fix. annoying to watch. but it didn't merge garbage, which was the whole point. anyone else doing multi-model consensus on the review/merge step, or do you just trust a single reviewer? curious where people draw that line.
Original Article

Similar Articles