At Computex in Taipei last week, NVIDIA CEO Jensen Huang made a statement that would have seemed overstated just two years ago: we need a new class of CPU built entirely for AI agents, not for humans.
The result is NVIDIA Vera, the company’s first CPU purpose-built for agentic workloads. Announced on June 1, 2026, at GTC Taipei, Vera is already heading into production with some of the most recognisable names in cloud computing and AI research. The signal it sends is hard to miss: enterprise AI agents are no longer an experiment. They are infrastructure.
What Vera Actually Is
Vera is not a rebranded data centre chip. NVIDIA designed it from scratch around the specific demands of running AI agents at scale: the Python runtimes, sandboxed code execution, orchestration logic, analytics pipelines, and memory-heavy tool calls that characterise real agentic workloads.
Key specifications, as stated by NVIDIA:
- 88 custom Olympus cores: a new NVIDIA-designed CPU architecture built for branch-heavy, memory-sensitive agentic code
- 1.2 TB/s memory bandwidth: via an LPDDR5X memory subsystem designed to keep agent environments fed with data
- 1.8x faster task completion versus x86 CPUs on agentic benchmarks
- 3x bandwidth per core compared to x86 processors
- 50% higher IPC (instructions per clock) than NVIDIA’s own Grace CPU
- Spatial Multithreading: a new technique enabling more concurrent agent environments per chip
NVIDIA is also releasing a Vera CPU rack that fits 256 liquid-cooled Vera CPUs in a single enclosure and can sustain more than 22,500 concurrent, fully isolated CPU environments. For businesses planning to run agent fleets rather than single agents, that density matters considerably.
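The rack figure can be sanity-checked against the per-chip specifications. The short Python sketch below uses only the numbers quoted above (88 cores, 256 CPUs per rack, 1.2 TB/s per CPU). The reading that "more than 22,500 environments" corresponds to roughly one environment per core is an inference from the arithmetic, not something the announcement states, and the variable names are illustrative.

```python
# Figures quoted in the article (NVIDIA's stated specifications).
CORES_PER_CPU = 88
CPUS_PER_RACK = 256
MEMORY_BANDWIDTH_GB_S = 1200  # 1.2 TB/s per CPU, in decimal GB/s

# Total cores in one rack.
cores_per_rack = CORES_PER_CPU * CPUS_PER_RACK

# Average memory bandwidth available to each core on one CPU.
bandwidth_per_core_gb_s = MEMORY_BANDWIDTH_GB_S / CORES_PER_CPU

# Aggregate memory bandwidth across the rack.
rack_bandwidth_gb_s = MEMORY_BANDWIDTH_GB_S * CPUS_PER_RACK

print(f"Cores per rack: {cores_per_rack}")                       # 22528
print(f"Bandwidth per core: {bandwidth_per_core_gb_s:.1f} GB/s")  # 13.6 GB/s
print(f"Rack bandwidth: {rack_bandwidth_gb_s / 1000:.1f} TB/s")   # 307.2 TB/s
```

The core count of 22,528 lines up with the "more than 22,500" environments claim, which suggests (as an assumption) that the headline number is based on one isolated environment per core.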
Who Is Already Planning to Use It
The list of early adopters is not a collection of enthusiastic startups. It includes:
- Anthropic, OpenAI, SpaceXAI: the organisations building the frontier models that agents run on
- ByteDance, CoreWeave, Oracle Cloud Infrastructure: hyperscale infrastructure providers
- Dell Technologies, HPE, Lenovo, Supermicro: the hardware OEMs who will ship Vera in standalone CPU servers
- NYSE: a financial services giant where agentic processing carries real-time risk implications
- Manufacturing partners including Foxconn, Quanta Cloud Technology, Wistron and Wiwynn
The involvement of financial sector infrastructure (NYSE) and manufacturing partners signals that Vera is being positioned not just for AI labs but for regulated, latency-sensitive enterprise deployments.
Why a CPU, Not a GPU, for Agents?
GPUs have dominated AI infrastructure because model training and inference are highly parallel workloads that map well to GPU architectures. But running agents is different.
An AI agent spends most of its time on tasks that CPUs handle: calling tools, executing code, managing state, routing between sub-agents, processing data, and handling the constant back-and-forth with APIs and databases. The GPU is busy for a fraction of that time. The CPU is the bottleneck.
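One way to test this claim against your own workloads is to time each step of an agent run and total the time by device. The Python sketch below is a minimal illustration under stated assumptions: the `Step` record and the `time_share` helper are hypothetical names introduced here, not part of any NVIDIA tooling, and the durations must come from your own measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One timed step in an agent run (hypothetical record format)."""
    name: str       # e.g. "tool_call", "run_code", "model_inference"
    device: str     # "cpu" or "gpu"
    seconds: float  # measured wall-clock duration


def time_share(trace):
    """Return the fraction of total time spent on each device."""
    totals = {"cpu": 0.0, "gpu": 0.0}
    for step in trace:
        totals[step.device] += step.seconds
    whole = sum(totals.values())
    if whole == 0:
        # Empty or zero-length trace: report no time on either device.
        return {device: 0.0 for device in totals}
    return {device: spent / whole for device, spent in totals.items()}
```

Feeding a recorded trace into `time_share` gives a CPU-versus-GPU split for that run. A consistently high CPU share would support the argument that orchestration, rather than inference, is the limiting factor for that workload.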
Jensen Huang described Vera as opening “a market that never existed before.” The market for CPU compute purpose-built to orchestrate and run agents at scale is separate from the backend compute for model inference, and until now it had been served by chips designed for something else entirely.
The Vera CPU rack’s ability to run more than 22,500 concurrent agent environments from a single enclosure is the practical expression of that vision. Each environment is fully isolated, fully auditable, and running at full performance. For enterprise deployments where security and governance are as important as throughput, that architecture matters.
What This Means for Business
The CPU announcement is a leading indicator that enterprise AI agent deployments are entering a new phase. A few things follow from this:
Scale is now a solvable problem. One of the practical objections to deploying agent fleets has been the infrastructure cost and complexity of running many agents simultaneously. Vera’s density of more than 22,500 concurrent environments per rack changes the economic calculation significantly.
Governance is built into the architecture. Isolated environments mean agent actions can be logged, audited, and constrained at the hardware level. For businesses in regulated industries like finance, healthcare, and legal, this kind of infrastructure-level governance is not a nice-to-have. It is a requirement before any serious deployment can happen.
The infrastructure stack is maturing fast. When NYSE, Anthropic, and Dell are all named as early adopters of a purpose-built agent CPU, it signals that serious capital is now flowing into the production infrastructure layer, not just the model layer. Businesses evaluating AI agents should factor this into their planning timelines. The “we’ll wait until it’s more mature” window is shortening.
Vendor decisions made now will compound. The companies building on NVIDIA’s agent infrastructure stack today, using the Agent Toolkit, Nemotron models, OpenShell runtime, and now Vera, are accumulating advantages in deployment experience and workflow optimisation that late movers will find difficult to match quickly.
Enterprise DNA’s Omni practice helps businesses design and deploy AI agent workforces, from workflow identification through to production-ready systems. If you’re thinking about what an AI agent strategy actually looks like for your organisation, book a discovery call to start the conversation.
Source
NVIDIA Newsroom