When does an AI agent with travelling salesman skills stop reasoning and start hallucinating?
Give an AI agent optimisation skills, such as solving the travelling salesman problem, and point it at a real-world scheduling challenge: aged care rostering. As the number of carers, clients, constraints, and preferences grows, the space of possible rosters expands exponentially. This experiment tests where the boundary lies: at what point does the agent stop making sound decisions and start confabulating plausible-looking but broken schedules?
Progressively scaling complexity until the agent's reasoning breaks down — then mapping the failure patterns.
Agents don’t degrade gracefully — they cross a threshold and start producing confidently wrong output, with no warning from their own confidence scores.
Constraint validation must be a separate verification layer — never trust the same system to both generate and validate.
Breaking combinatorial problems into geographic clusters pushes the collapse point out by 3–4x, which makes decomposition the primary scaling strategy.
The most dangerous failure mode is output that is 95% correct: the broken 5% is invisible without automated checks, because humans assume the rest is fine.
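To make the separate verification layer concrete, here is a minimal sketch of a deterministic checker that runs over a roster after the agent has produced it. Everything in it is an assumption for illustration: the `Visit` record, the `validate_roster` function, and the three hard constraints (qualification, no double-booking, a cap on rostered minutes) are hypothetical and are not taken from the experiment or from Wayfinder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visit:
    # Hypothetical record shape; times are minutes since midnight.
    carer: str
    client: str
    start: int
    end: int


def validate_roster(visits, qualified, max_minutes):
    """Return a list of hard-constraint violations; an empty list means none were found.

    This runs independently of whatever generated the roster, so a
    plausible-looking schedule cannot pass just because the generator says so.
    """
    violations = []
    by_carer = {}
    for v in visits:
        if v.end <= v.start:
            violations.append(f"{v.carer} -> {v.client}: non-positive duration")
        if v.client not in qualified.get(v.carer, set()):
            violations.append(f"{v.carer} is not qualified for {v.client}")
        by_carer.setdefault(v.carer, []).append(v)

    for carer, carer_visits in by_carer.items():
        carer_visits.sort(key=lambda v: v.start)
        # After sorting by start time, a double-booking shows up as a visit
        # that starts before the previous one ends.
        for a, b in zip(carer_visits, carer_visits[1:]):
            if b.start < a.end:
                violations.append(f"{carer} double-booked: {a.client} overlaps {b.client}")
        total = sum(v.end - v.start for v in carer_visits)
        if total > max_minutes:
            violations.append(f"{carer} exceeds {max_minutes} rostered minutes")
    return violations


# Illustrative data only: one roster with a single hidden overlap.
roster = [
    Visit("ana", "c1", 540, 600),
    Visit("ana", "c2", 590, 650),  # starts 10 minutes before the previous visit ends
    Visit("ben", "c3", 540, 600),
]
qualified = {"ana": {"c1", "c2"}, "ben": {"c3"}}
problems = validate_roster(roster, qualified, max_minutes=480)
print(problems)
```

Two of the three visits here are fine, which is the kind of mostly-correct output a human reviewer tends to wave through. A checker like this reports the one overlap regardless of how confident the generating agent was.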
These findings directly shaped Wayfinder’s architecture: geographic clustering, hard constraint verification layers, and an agent that knows when to flag a problem instead of guessing. The collapse point research continues to inform how we build any agent that handles combinatorial problems.
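As a rough illustration of why geographic decomposition helps, the sketch below buckets clients into grid cells and compares the number of possible visit orderings before and after the split. The grid bucketing, the `cluster_by_grid` function, the 5 km cell size, and the coordinates are all assumptions made for this example. The source does not say how clusters are actually formed, and counting orderings is only a crude proxy for the size of the search space.

```python
from collections import defaultdict
from math import factorial


def cluster_by_grid(clients, cell_km=5.0):
    """Group clients into square grid cells by their (x_km, y_km) position.

    A deliberately simple stand-in for geographic clustering; a real system
    might use travel time or service regions instead.
    """
    clusters = defaultdict(list)
    for name, (x, y) in clients.items():
        cell = (int(x // cell_km), int(y // cell_km))
        clusters[cell].append(name)
    return dict(clusters)


# Hypothetical client positions in kilometres on a flat plane.
clients = {
    "c1": (0.5, 1.0),
    "c2": (1.2, 0.3),
    "c3": (7.0, 8.0),
    "c4": (6.1, 9.9),
    "c5": (12.0, 0.4),
    "c6": (13.5, 2.2),
}

clusters = cluster_by_grid(clients)

# Visit orderings if all clients are handled as one problem: n!
whole = factorial(len(clients))
# Visit orderings if each cluster is handled separately: sum of k! per cluster
decomposed = sum(factorial(len(members)) for members in clusters.values())

print(clusters)
print(whole, decomposed)
```

With six clients falling into three cells of two, the single problem has 720 orderings while the decomposed version has 6 in total. The trade-off is that a hard split ignores carers who could usefully cross cluster boundaries, so the result may be less optimal even though each sub-problem stays well inside the range where the agent reasons reliably.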