This may be the most important thing we’ve learned building agent systems, and it came from the Decision Tree Collapse experiment.
We gave an agent a rostering problem — the kind that powers Wayfinder — and gradually turned up the complexity. Five carers: perfect output. Ten: still solid. Twenty: accurate. Thirty: slight wobble, but usable.
Then somewhere between 30 and 50, the agent crossed a line. Not a gradual decline. A cliff. The schedules still looked valid. The formatting was clean. The confidence scores were high. But buried inside the output were double-bookings, impossible travel times, and constraint violations that would only surface when a real human tried to execute the roster.
Beyond 50, the agent confidently produced schedules that broke multiple hard rules while reporting that everything was fine.
This is the pattern we keep seeing across every domain: agents don’t get gradually worse. They work, they work, they work — and then they cross a threshold and start producing plausible-looking nonsense. With confidence.
The most dangerous output isn’t the obviously wrong answer. It’s the 95% correct one. The one that looks right enough that nobody checks. The 5% that’s wrong is invisible unless you build a verification layer specifically designed to catch it.
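To make that concrete, here is a minimal sketch of what such a verification layer can look like for a roster. The schema (a `Visit` with a carer, a client, and start/end times) and the travel-time table are illustrative assumptions, not Wayfinder’s actual data model; the point is that the checks are independent of the agent and mechanical, so the invisible 5% has nowhere to hide.

```python
# Independent verification layer for agent-produced rosters (sketch).
# Checks two hard rules: no double-bookings, and enough time to travel
# between consecutive visits. Data model and travel times are assumptions.
from dataclasses import dataclass

@dataclass
class Visit:
    carer: str
    client: str
    start: int  # minutes since midnight
    end: int

# Assumed travel-time lookup between client locations, in minutes.
TRAVEL = {("A", "B"): 20, ("B", "A"): 20, ("A", "C"): 45, ("C", "A"): 45,
          ("B", "C"): 30, ("C", "B"): 30}

def verify(roster):
    """Return a list of hard-rule violations; empty means the roster passed."""
    violations = []
    by_carer = {}
    for v in roster:
        by_carer.setdefault(v.carer, []).append(v)
    for carer, visits in by_carer.items():
        visits.sort(key=lambda v: v.start)
        for a, b in zip(visits, visits[1:]):
            if b.start < a.end:
                violations.append(f"{carer}: double-booked {a.client}/{b.client}")
            else:
                travel = TRAVEL.get((a.client, b.client), 0)
                if b.start - a.end < travel:
                    violations.append(
                        f"{carer}: impossible travel {a.client}->{b.client} "
                        f"({b.start - a.end} min gap, needs {travel})")
    return violations

roster = [
    Visit("dana", "A", 540, 600),
    Visit("dana", "B", 590, 650),  # overlaps the previous visit
    Visit("dana", "C", 660, 720),  # only a 10 min gap to travel B->C (needs 30)
]
print(verify(roster))  # both violations are caught, however clean the output looked
```

Note that the verifier never asks the agent whether the roster is valid; it recomputes validity from the constraints themselves, which is exactly what the high confidence scores above failed to do.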
Three things we now do on every agent build because of this:
This shaped Wayfinder’s entire architecture. It shapes everything we build now. The question we’re still working on: can an agent learn to detect its own approaching cliff — not after it’s crossed it, but before?
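Short of the agent detecting its own cliff, the experiment above suggests a harness shape for finding it from the outside: scale the problem, verify every output independently, and flag the first size where the verified failure rate jumps. The sketch below simulates that loop; `solve` is a stand-in for a real agent call (here hard-coded to break at 40 carers, purely to demonstrate the harness), and the sizes and failure threshold are assumptions.

```python
# Sketch of a cliff-probing harness: run the agent at increasing problem
# sizes, verify each output independently, and report the first size whose
# failure rate crosses a threshold. `solve` simulates an agent with a
# hidden cliff at 40 carers (an assumption for the demo).
import random

random.seed(0)

def solve(n_carers):
    """Stand-in for an agent call: returns the number of hard-rule
    violations an independent verifier found in its output."""
    # Simulated behaviour: clean below the cliff, broken above it.
    return 0 if n_carers < 40 else random.randint(2, 6)

def find_cliff(sizes, trials=5, max_fail_rate=0.2):
    """Return the first problem size whose verified failure rate
    exceeds max_fail_rate, or None if no cliff was observed."""
    for n in sizes:
        failures = sum(1 for _ in range(trials) if solve(n) > 0)
        if failures / trials > max_fail_rate:
            return n
    return None

print(find_cliff([5, 10, 20, 30, 40, 50]))  # → 40
```

The interesting open question is the one in the paragraph above: this harness finds the cliff after the fact, from outside; detecting it from inside, before the threshold is crossed, is harder.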