NVIDIA Builds the First CPU Designed Entirely for AI Agents

At Computex in Taipei last week, NVIDIA CEO Jensen Huang made a statement that would have seemed overstated just two years ago: we need a new class of CPU built entirely for AI agents, not for humans.

The result is NVIDIA Vera, the company’s first CPU purpose-built for agentic workloads. Announced on June 1, 2026 at GTC Taipei, Vera is already heading into production with some of the most recognisable names in cloud computing and AI research. The signal it sends is hard to miss: enterprise AI agents are no longer an experiment. They are infrastructure.

What Vera Actually Is

Vera is not a rebranded data centre chip. NVIDIA designed it from scratch around the specific demands of running AI agents at scale: the Python runtimes, sandboxed code execution, orchestration logic, analytics pipelines, and memory-heavy tool calls that characterise real agentic workloads.

Key specifications:

88 custom Olympus cores: a new NVIDIA-designed CPU architecture built for branch-heavy, memory-sensitive agentic code
1.2 TB/s memory bandwidth: via an LPDDR5X memory subsystem designed to keep agent environments fed with data
1.8x faster task completion versus x86 CPUs on agentic benchmarks
3x bandwidth per core compared to x86 processors
50% higher IPC (instructions per clock) than NVIDIA’s own Grace CPU
Spatial Multithreading: a new technique enabling more concurrent agent environments per chip

NVIDIA is also releasing a Vera CPU rack that fits 256 liquid-cooled Vera CPUs in a single enclosure and can sustain more than 22,500 concurrent, fully isolated CPU environments. For businesses thinking about running agent fleets rather than single agents, that density matters considerably.
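
To put the rack figures in proportion, here is a rough back-of-the-envelope calculation using only the numbers quoted above. The article does not say how environments map onto cores, so the per-core ratio is an inference from the arithmetic, not a published specification. The racks_needed helper and the 100,000-agent fleet are hypothetical, added purely for illustration, and the sketch assumes one environment per agent and decimal units (1 TB/s = 1000 GB/s).

```python
import math

# Figures quoted in the article.
CPUS_PER_RACK = 256
CORES_PER_CPU = 88
ENVS_PER_RACK = 22_500          # the article says "more than" this number
MEM_BANDWIDTH_TB_S = 1.2        # per CPU

cores_per_rack = CPUS_PER_RACK * CORES_PER_CPU
envs_per_cpu = ENVS_PER_RACK / CPUS_PER_RACK
envs_per_core = ENVS_PER_RACK / cores_per_rack
# Assumption: decimal units, so 1 TB/s = 1000 GB/s.
bandwidth_per_core_gb_s = MEM_BANDWIDTH_TB_S * 1000 / CORES_PER_CPU


def racks_needed(fleet_size: int) -> int:
    """Hypothetical helper: racks required if each agent gets one environment."""
    return math.ceil(fleet_size / ENVS_PER_RACK)


print(f"Cores per rack:          {cores_per_rack}")                # 22528
print(f"Environments per CPU:    {envs_per_cpu:.1f}")              # 87.9
print(f"Environments per core:   {envs_per_core:.3f}")             # 0.999
print(f"Bandwidth per core:      {bandwidth_per_core_gb_s:.1f} GB/s")  # 13.6
print(f"Racks for 100,000 agents: {racks_needed(100_000)}")        # 5
```

On these numbers, the quoted density works out to roughly one isolated environment per Olympus core, which may be why the rack figure lands so close to 256 x 88.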

Who Is Already Planning to Use It

The list of early adopters is not a collection of enthusiastic startups. It includes:

Anthropic, OpenAI, SpaceXAI: the organisations building the frontier models that agents run on
ByteDance, CoreWeave, Oracle Cloud Infrastructure: hyperscale infrastructure providers
Dell Technologies, HPE, Lenovo, Supermicro: the hardware OEMs who will ship Vera in standalone CPU servers
NYSE: a financial services giant where agentic processing carries real-time risk implications
Manufacturing partners including Foxconn, Quanta Cloud Technology, Wistron and Wiwynn

The involvement of financial sector infrastructure (NYSE) and manufacturing partners signals that Vera is being positioned not just for AI labs but for regulated, latency-sensitive enterprise deployments.

Why a CPU, Not a GPU, for Agents?

GPUs have dominated AI infrastructure because model training and inference are highly parallel workloads that map well to GPU architectures. But running agents is different.

An AI agent spends most of its time on tasks that CPUs handle: calling tools, executing code, managing state, routing between sub-agents, processing data, and handling the constant back-and-forth with APIs and databases. The GPU is busy for a fraction of that time. The CPU is the bottleneck.
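
The shape of that argument can be sketched as a simple time budget for one agent task. Every duration below is a made-up placeholder chosen for illustration, not a measurement from NVIDIA or anyone else, and the Step and cpu_share names are hypothetical. The point is only to show how the CPU share of wall-clock time would be computed once real timings are available.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    seconds: float   # illustrative placeholder, not measured data
    on_gpu: bool     # True if the step is model inference on a GPU


def cpu_share(steps: list[Step]) -> float:
    """Fraction of total wall-clock time spent in CPU-side steps."""
    total = sum(s.seconds for s in steps)
    cpu = sum(s.seconds for s in steps if not s.on_gpu)
    return cpu / total


# Hypothetical breakdown of a single agent task.
steps = [
    Step("model inference", 2.0, on_gpu=True),
    Step("tool calls and API round-trips", 3.0, on_gpu=False),
    Step("sandboxed code execution", 4.0, on_gpu=False),
    Step("state management and routing", 1.0, on_gpu=False),
]

print(f"CPU-side share of task time: {cpu_share(steps):.0%}")
```

With real traces in place of the placeholders, the same calculation would show whether a given workload is actually CPU-bound in the way described here.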

Jensen Huang described Vera as opening “a market that never existed before.” The market for CPU compute purpose-built to orchestrate and run agents at scale is separate from the backend compute for model inference, and until now it had been served by chips designed for something else entirely.

The Vera CPU rack’s ability to run more than 22,500 concurrent agent environments from a single enclosure is the practical expression of that vision. Each environment is fully isolated, fully auditable, and running at full performance. For enterprise deployments where security and governance are as important as throughput, that architecture matters.

What This Means for Business

The CPU announcement is a leading indicator that enterprise AI agent deployments are entering a new phase. A few things follow from this:

Scale is now a solvable problem. One of the practical objections to deploying agent fleets has been the infrastructure cost and complexity of running many agents simultaneously. Vera’s density of more than 22,500 concurrent environments per rack changes the economic calculation significantly.

Governance is built into the architecture. Isolated environments mean agent actions can be logged, audited, and constrained at the hardware level. For businesses in regulated industries like finance, healthcare, and legal, this kind of infrastructure-level governance is not a nice-to-have. It is a requirement before any serious deployment can happen.

The infrastructure stack is maturing fast. When NYSE, Anthropic, and Dell are all named as Day 1 adopters of a purpose-built agent CPU, it confirms that serious capital is now flowing into the production infrastructure layer, not just the model layer. Businesses evaluating AI agents should factor this into their planning timelines. The “we’ll wait until it’s more mature” window is shortening.

Vendor decisions made now will compound. The companies building on NVIDIA’s agent infrastructure stack today, using the Agent Toolkit, Nemotron models, OpenShell runtime, and now Vera, are accumulating advantages in deployment experience and workflow optimisation that will be difficult for late movers to close quickly.


Source

NVIDIA Newsroom
