Death by consensus

Adding more AI agents to a project makes it worse. Not marginally worse. Up to 38% worse.

Researchers from Stanford University just tested what happens when you put multiple AI agents on a team and let them collaborate. The kind of setup every AI vendor is pitching right now — autonomous agents working together, self-organizing, no fixed roles.

The results are damning.

Across five frontier benchmarks, agent teams consistently failed to match the performance of their single best member. The gap ranged from 8% on easier tasks to 37.6% on the hardest ones.

Even worse?

Telling the team who the expert was didn’t help.

The bottleneck wasn't identifying expertise.

It was leveraging it.

What happened instead? The researchers call it "integrative compromise." Non-expert agents proposed middle-ground positions instead of deferring to the agent who actually knew the answer. The expert's signal got averaged into mush.

And it got worse as teams got bigger. Every new agent diluted the expert further (p < 0.05 across all tasks). In human teams, singling out the expert lifts the group to expert-level performance. AI teams just can't do this.
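Here's a toy illustration of that dilution. It is not the paper's setup: it assumes every agent gives a numeric answer and the team settles on the plain average. The numbers (true value 100, expert says 100, everyone else says 60) are made up for the sketch.

```python
# Toy model (an assumption, not the researchers' method): the team's answer
# is the unweighted mean of every member's answer.

def team_estimate(expert_answer, other_answers):
    """Average the expert's answer with everyone else's, one vote each."""
    answers = [expert_answer] + list(other_answers)
    return sum(answers) / len(answers)

TRUE_VALUE = 100.0   # made-up ground truth
EXPERT = 100.0       # the expert is exactly right
NON_EXPERT = 60.0    # every non-expert is off by the same amount

for team_size in (2, 4, 8):
    estimate = team_estimate(EXPERT, [NON_EXPERT] * (team_size - 1))
    error = abs(TRUE_VALUE - estimate)
    print(f"{team_size} agents -> estimate {estimate:.1f}, error {error:.1f}")

# 2 agents -> estimate 80.0, error 20.0
# 4 agents -> estimate 70.0, error 30.0
# 8 agents -> estimate 65.0, error 35.0
```

The expert never gets any less right. Their share of the final answer just shrinks to 1/n.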

One bright spot? The same consensus-seeking that kills expertise also makes agent teams weirdly robust to sabotage. When an adversarial agent tries to tank performance, it barely moves the needle. The group averages out the bad actor too.

Robust to sabotage.

But incapable of leveraging talent.

Sounds like a dismal place to work.

The implication for anyone building multi-agent systems: stop designing them like democratic teams. Design them like well-run ops, with explicit authority, clear role specification, and a human in the loop where expertise matters most.
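One way to read "explicit authority" in code, as a minimal sketch. Everything here is hypothetical (the Agent class, the decide function, the 0.7 confidence threshold). The point is only that the designated owner's answer is used as-is, and a human gets pulled in when the owner is unsure or nobody owns the domain.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Agent:
    name: str
    domain: str  # the one area this agent has authority over
    # Hypothetical interface: returns (answer, confidence between 0 and 1)
    solve: Callable[[str], Tuple[str, float]]

def decide(domain: str, question: str, agents: List[Agent],
           min_confidence: float = 0.7) -> Tuple[str, str]:
    """Return (decided_by, answer). No voting, no averaging."""
    owner = next((a for a in agents if a.domain == domain), None)
    if owner is None:
        # Nobody has authority here, so escalate instead of polling the group.
        return ("human", "no owner for this domain; needs review")
    answer, confidence = owner.solve(question)
    if confidence < min_confidence:
        # The owner is unsure: hand off to a human, not to the other agents.
        return ("human", f"{owner.name} unsure ({confidence:.2f}); needs review")
    return (owner.name, answer)

# Stand-in agents with canned answers, for illustration only.
agents = [
    Agent("tax_agent", "tax", lambda q: ("Deductible", 0.92)),
    Agent("legal_agent", "legal", lambda q: ("Unclear", 0.40)),
]

print(decide("tax", "Is this deductible?", agents))
print(decide("legal", "Can we ship this?", agents))
print(decide("security", "Is this safe?", agents))
```

Other agents can still contribute context upstream. They just don't get a vote on the final answer.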

I help companies design agent architectures that actually work. If your multi-agent system is underperforming and you can't figure out why, the org design might be the problem. DM me.

Source: "Multi-Agent Teams Hold Experts Back," Pappu et al., Feb 2026
