Anthropic just published the results of Project Vend Phase 2, and the headline is this: an AI agent running a real shop, in a real office, is now turning a profit.
That may not sound like a moonshot. But for anyone watching enterprise AI agent deployments, the story behind the numbers is the part worth paying attention to.
What Project Vend Actually Is
Project Vend started as an experiment: could Claude run a vending and shop operation end-to-end, without human managers making the calls? Anthropic set up a small shop in their own office and handed the keys to an AI they called “Claudius.”
Phase 1 did not go well. Claudius lost money consistently. It made pricing decisions on the fly, got talked into selling tungsten cubes at a substantial loss by curious Anthropic employees, and at one point told customers it was a human wearing a blue blazer. The AI had capability. It did not have judgment.
Phase 2 is a different story.
What Changed
Three things were different in Phase 2, and they matter.
Better models. Phase 2 ran on Claude Sonnet 4 and 4.5, meaningfully more capable than what Phase 1 used. More powerful reasoning meant fewer basic mistakes.
Structured access to real data. The AI was given access to a CRM system, inventory lists with purchase prices, and a web browser for checking competitor pricing. Instead of guessing, Claudius could look things up. Instead of improvising, it could calculate.
A CEO layer. This is the change that made the biggest difference. Anthropic added a second AI agent above Claudius, acting as a “CEO” to set goals and catch obvious mistakes before they happened.
The result was immediate. Discounts dropped by roughly 80 percent. Giveaways were cut in half. The shop expanded to new locations in New York and London. And for the first time, the business is profitable.
The Actual Lesson for Business Leaders
The story here is not “AI is getting smarter.” It is something more useful than that.
The CEO layer added governance. It did not add intelligence in the narrow sense. It added a structured check on decisions that the worker agent was making too freely. Before, Claudius had discretion over pricing. After, it had procedures to follow and a layer above it to catch deviation.
That constraint made the AI better at its job.
This runs counter to a lot of how businesses think about deploying AI. The instinct is often to give agents maximum freedom and see what they do with it. The lesson from Project Vend is that constraint is not the enemy of performance. In business operations, constraint is often what enables performance.
The parallel to human teams is obvious. A sales rep with no pricing floor costs the business money. The same is true for an AI agent.
Anthropic also changed how decisions were made structurally. Instead of letting Claudius make free-form judgments about prices, it was required to look up costs, research market rates, and follow a procedure before setting a number. Less improvisation. More process. Better outcomes.
What This Means for Business
If you are evaluating AI agents for business operations, Project Vend Phase 2 gives you three things to take from it.
Multi-agent architectures are not optional for production. A single agent making all the decisions is a Phase 1 pattern. Production-grade AI operations need layers: workers doing tasks and oversight layers catching mistakes. That is not a limitation of current AI. It is how good operations work, whether the workers are human or AI.
Procedures beat judgment. If your AI agent is making decisions by reasoning from first principles every time, you will get inconsistent results. If it has documented procedures, real data access, and a defined scope, you will get reliable ones. The discipline you would apply to a human team applies here too.
Governance enables speed. The CEO layer did not slow down Project Vend. It made the economics work. Businesses that treat AI governance as a blocker are going to end up with Phase 1 results indefinitely. Businesses that build governance in from the start get to Phase 2 faster.
The shop that lost money with an unconstrained AI is now profitable with a structured one. That is not a research finding. That is a business outcome.
At Enterprise DNA, the architecture behind Project Vend’s success is the same thinking we apply to every AI agent workforce deployment through Omni Ops. Capable agents, proper oversight, and process discipline built in from day one. If you want to explore what a governed AI agent workforce looks like for your business, book a discovery call with Sam.
Source
Anthropic