
Morgan Stanley Says Agentic AI Will Supercharge Data-Center CPU Demand by 2030 — Here’s What That Means

If you thought the AI hardware race began and ended with GPUs, think again. Morgan Stanley just threw a curveball into the AI infrastructure narrative: the next big wave of spending won’t be all about Nvidia-style GPU buildouts. Instead, the bank forecasts a massive surge in demand for data-center CPUs and memory — driven by the rise of “agentic AI,” a new breed of autonomous systems that plan, verify, and coordinate multi-step tasks with persistent state. According to their analysis, agentic AI could add $32.5 billion to $60 billion in incremental data-center CPU demand by 2030, on top of a market already exceeding $100 billion.

Why the pivot? Because reliable, workflow-oriented agents depend on orchestration and sequential logic — areas where CPUs shine — not just raw generative power. In other words: GPUs won the first wave (model training and high-throughput inference); CPUs and memory may lead the second (planning, coordination, verification, and state management).

Let’s unpack what’s changing, why it matters, and who stands to benefit.

Source: See the TechStartups report summarizing Morgan Stanley’s latest view on AI infrastructure and agentic systems.

What Is Agentic AI — And Why Is It Different?

Agentic AI describes AI systems that autonomously execute multi-step tasks, make decisions based on context and goals, and maintain state across steps. Instead of producing a single output to a single prompt, these agents act more like software workers: they plan, call tools or APIs, check their work, adapt when they hit errors, and keep going until they’ve achieved an objective.

  • They’re goal-driven: “Plan a product launch,” “Prepare a Q2 revenue model,” “Migrate this data pipeline.”
  • They’re persistent: they keep track of intermediate results and state across steps.
  • They’re orchestration-heavy: they decide which tools to call, when to verify, and how to route subtasks.

This is a very different workload profile than traditional LLM prompting. It borrows from established research on decision-making and reasoning (e.g., ReAct prompting) and real-world simulations of agent behavior (e.g., Generative Agents), while also drawing on practical engineering from frameworks like LangChain Agents and open-source projects like AutoGPT. If the first wave of generative AI was about prediction and synthesis, the second wave is about coordination, verification, and reliability.
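
To make that loop concrete, here is a minimal Python sketch of a plan-act-verify agent. The helper functions (plan_next_step, run_tool, verify) are toy stand-ins for a model call, tool adapters, and a checking step; they are assumptions for illustration, not any particular framework's API.

```python
# A minimal plan-act-verify loop. The helpers are toy stand-ins
# (illustrative assumptions), not any particular framework's API.
from dataclasses import dataclass, field

@dataclass
class AgentState:
    goal: str
    results: list[str] = field(default_factory=list)  # persistent state across steps
    finished: bool = False

def plan_next_step(state: AgentState) -> str:
    # Stand-in for a GPU-backed model call that proposes the next action.
    return "DONE" if state.results else "look_up_data"

def run_tool(action: str) -> str:
    # Stand-in for CPU-side tool use: APIs, databases, code execution.
    return f"result of {action}"

def verify(result: str) -> bool:
    # Stand-in for a verification gate: schema/policy checks or self-critique.
    return bool(result)

def run_agent(goal: str, max_steps: int = 20) -> AgentState:
    state = AgentState(goal=goal)
    for _ in range(max_steps):        # CPU-side control flow and orchestration
        action = plan_next_step(state)
        if action == "DONE":
            state.finished = True
            break
        result = run_tool(action)     # branching, I/O, glue code
        if verify(result):            # only commit verified results to state
            state.results.append(result)
    return state

print(run_agent("Prepare a Q2 revenue model"))
```

The point is the shape of the workload: the loop itself is branching, stateful control flow that runs on CPUs, with the model call as just one step inside it.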

Why CPUs Suddenly Matter More

GPUs are phenomenal at parallelism. They power the most compute-intensive parts of AI — training and high-throughput inference. But agentic workflows introduce a different bottleneck: lots of branching, I/O, tool use, and step-by-step control flow. That favors CPUs, which excel at:

  • Sequential logic and orchestration
  • Context switching and OS-level task management
  • Handling diverse I/O with lower latency sensitivity
  • Running the “glue code” that calls tools, databases, and microservices

In agentic systems, GPUs are still critical for generation and perception (e.g., large model calls), but a growing share of the overall compute pie belongs to CPUs coordinating the process. Morgan Stanley argues this expanding role could translate into tens of billions in incremental CPU demand — and, by extension, a stronger outlook for firms like AMD, Intel, and large hyperscalers designing their own silicon.

The Second Wave: Reliability Over Raw Generation

Early generative AI dazzled with fluency and creativity. But enterprise adoption demands something else: reliability. Agentic AI emphasizes:

  • Planning: breaking big goals into executable steps
  • Tool use: calling APIs, RPA, databases, code interpreters, and internal systems
  • Verification: self-checks, formal constraints, or human-in-the-loop approvals
  • Persistence: keeping state across long-running tasks and sessions

That stack is compute-hungry in a different way than training. It’s not just FLOPs — it’s control flow, lots of memory access, and orchestration at scale. Morgan Stanley’s thesis is that these characteristics push data-center operators to rebalance capex toward CPUs and memory, complementing ongoing GPU investments rather than replacing them.

From “GPU-Only” to “GPU + CPU + Memory” AI Architectures

Think of agentic workloads as a pipeline (a rough back-of-envelope sketch follows the list):

  1. User request hits an agent controller (CPU-heavy)
  2. Controller plans steps and chooses tools (CPU-heavy)
  3. Generation or perception call executes on a GPU (GPU-heavy)
  4. Results are validated and stitched back into state (CPU + memory)
  5. Next action is scheduled; external APIs or services are called (CPU + I/O)
  6. Repeat until the goal is met, possibly over hours or days (CPU + persistent storage)
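
As a rough back-of-envelope, here is how the CPU and GPU shares of a single agent run can stack up. Every number below (step counts, millisecond costs) is an illustrative assumption, not a measurement from any real system; the point is only that controller and I/O time accumulate alongside the GPU calls.

```python
# Back-of-envelope accounting for one agent run. All timings and counts are
# illustrative assumptions, not measurements from any real system.
steps = 12          # controller iterations per task (plan, tool call, verify, update)
gpu_calls = 4       # only some steps need a generation/perception call
controller_ms = 40  # CPU: plan, route, validate, update state (per step)
tool_io_ms = 250    # CPU + I/O: API/database/microservice calls (per step)
gpu_call_ms = 900   # GPU: one model call

cpu_side = steps * (controller_ms + tool_io_ms)
gpu_side = gpu_calls * gpu_call_ms
total = cpu_side + gpu_side

print(f"CPU-side share: {cpu_side / total:.0%}")  # ~49% under these assumptions
print(f"GPU-side share: {gpu_side / total:.0%}")  # ~51% under these assumptions
```

Change the assumptions and the split moves, but the controller and I/O terms never drop out; that is the rebalancing Morgan Stanley is pointing at.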

In aggregate, that architecture looks less like a single giant model call and more like a distributed application. The practical implication: data centers need balanced fleets and memory-rich nodes, not just racks of accelerators. This is why Morgan Stanley spotlights memory vendors as big winners alongside CPU providers. Expect outsized interest in:

  • High-capacity DRAM (e.g., DDR5) for in-memory state and caching
  • High-bandwidth memory (HBM) for accelerator-attached workloads
  • Fast NVMe SSDs for scratch space and agent “memories”
  • Emerging interconnects that make memory more composable and shareable

For investors, the firm urges watching Micron and other memory leaders as agentic architectures mature.

Why Workflows Like “GPT-5.4 Agents” Change the Game

Morgan Stanley’s note cites enterprise-grade agent workflows — including examples like OpenAI’s GPT-5.4-based orchestration — as a sign that hyperscalers are standardizing on an agents-first design pattern. While the headline models garner attention, the backbone is increasingly the workflow engine: task graphs, schedulers, queues, verification loops, and tool adapters. That’s where CPU cycles pile up.

And as businesses automate broader spans of work — think research-to-draft-to-review-to-publish — the number of steps, verifications, and tool calls can grow exponentially. Result: even if each model call is cheaper over time, the total compute footprint per business process can still skyrocket.


Capex Priorities Are Shifting Inside the Data Center

Morgan Stanley’s analysis challenges the “GPU monopoly” narrative. Yes, GPUs remain essential — especially for training. But enterprise agent adoption redistributes spend across:

  • General-purpose CPUs for orchestration, feature engineering, and verification layers
  • Memory (DRAM/HBM) and storage to hold state, caches, and artifacts
  • Networking to connect microservices, vector databases, and tool chains
  • GPUs and specialized accelerators for generation, vision, speech, and retrieval

This leads to more diversified hardware ecosystems and multi-vendor strategies. Hyperscalers — think AWS, Microsoft Azure, and Google Cloud — are already building configurable stacks that let customers mix and match accelerators, CPU instances, and high-memory nodes. Expect that menu to expand.

Economic Outlook: A 20–30% CAGR for Agentic Infrastructure

The report synthesizes hyperscaler roadmaps, enterprise adoption patterns, and model trends to project a robust 20–30% compound annual growth rate for agentic infrastructure. The headline number is striking: $32.5–$60 billion in incremental CPU demand by 2030 tied to agentic AI — on top of a CPU market that already tops $100 billion.
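
For intuition on what that growth rate compounds to, here is a quick bit of arithmetic on the report's 20–30% range (nothing more than the compound-growth formula applied over five years):

```python
# What a 20-30% CAGR compounds to over five years (simple arithmetic).
for cagr in (0.20, 0.25, 0.30):
    multiple = (1 + cagr) ** 5
    print(f"{cagr:.0%} CAGR over 5 years -> {multiple:.2f}x the starting base")
# 20% -> 2.49x, 25% -> 3.05x, 30% -> 3.71x
```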

That scale implies three things:

  1. “Next trillion-dollar opportunity” isn’t hyperbole when you consider the full stack: compute, memory, networking, orchestration software, and managed services.
  2. Vendor mix will broaden. Watch AMD, Intel, and memory leaders like Micron, as well as hyperscalers investing in custom silicon.
  3. Value migrates upward to reliability services — guardrails, verification layers, eval pipelines, and SLAs. These depend on CPU-heavy control planes.

As always, these are projections, not certainties. But they’re directionally consistent with what we’re seeing: more “software-like” AI systems, more complex workflows, and more dollar-weighted importance on data center glue.

Risks: Supply Chains and Energy Constraints

Two caveats in Morgan Stanley’s thesis deserve attention:

  • Supply chain bottlenecks: Ramps for CPUs, memory (especially HBM), and networking components can lag demand spikes. Lead times, packaging capacity, and substrate availability all matter.
  • Power and cooling: Agentic workloads can increase overall utilization and wall-clock run times. Even if peak loads are lower than training clusters, aggregate energy draw can be large — and constrained by data-center power envelopes and grid limits.

Expect operators to push aggressively into energy-efficient designs, workload scheduling, and liquid cooling where it pays off. We’re likely to see further innovation in memory hierarchy, caching strategies, and data locality to cut unnecessary hops.

How Enterprises Should Prepare for Agentic AI

If you’re an enterprise leader, the takeaway isn’t to “buy CPUs instead of GPUs.” It’s to architect for workflows, not just model calls. A practical path:

  • Start with high-value, bounded processes: financial report assembly, customer onboarding, compliance checks.
  • Standardize on an orchestration layer that supports tool use, retries, and verification gates (a minimal sketch follows this list).
  • Segment workloads across fit-for-purpose compute:
      • GPUs (or domain accelerators) for generation, perception, and embedding
      • CPUs for controllers, schedulers, evaluators, and data wrangling
      • High-memory nodes for stateful caches and vector retrieval
  • Build evaluation and observability in from day one: measure reliability, not just latency.
  • Plan for growth in memory and storage capacity; agent state can balloon quickly.
  • Optimize for energy and cost: batch non-urgent steps, pre-compute caches, and monitor agent “wander.”
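
Here is a minimal sketch of the retry-plus-verification-gate idea referenced above, assuming hypothetical call_tool and passes_policy helpers rather than any specific orchestration framework:

```python
# Retry-and-verify wrapper for one workflow step. The helpers are hypothetical
# placeholders, not a specific orchestration framework's API.
import time

class StepFailed(Exception):
    pass

def passes_policy(output: str) -> bool:
    # Stand-in verification gate: schema checks, policy rules, or an evaluator model.
    return "error" not in output.lower()

def run_step(call_tool, payload: dict, retries: int = 3, backoff_s: float = 1.0) -> str:
    """Run one CPU-side step with retries and a verification gate before committing."""
    for attempt in range(1, retries + 1):
        try:
            output = call_tool(payload)        # API call, query, or model invocation
            if passes_policy(output):          # gate: only verified output moves on
                return output
        except Exception:
            pass                               # retry transient failures
        time.sleep(backoff_s * attempt)        # simple linear backoff
    raise StepFailed(f"step exhausted {retries} retries")

# Usage with a toy tool adapter:
print(run_step(lambda p: f"fetched {p['ticket_id']}", {"ticket_id": "T-123"}))
```

In production you would swap the toy policy check for real evaluators and route repeated failures to a human queue; the control flow itself stays CPU-side.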

This strategy also makes your stack portable across clouds — a hedge against vendor concentration and capacity shortages.

The Hardware Implications in Plain English

Here’s an intuitive way to think about it:

  • GPUs are the creative muscles — they do heavy lifting when you need raw generative power.
  • CPUs are the brain’s executive function — they decide what to do next, keep the calendar, check the work, and talk to everyone else.
  • Memory is the short-term working memory — the bigger it is, the more context your agents can juggle.
  • Storage is long-term memory — where agents put artifacts and logs they’ll need later.
  • Networking is the nervous system — if it’s slow or unreliable, your agents stumble.

Agentic AI stresses all five. That’s why diversified spend — not just GPUs — is the rational outcome.

Who Stands to Benefit?

  • CPU vendors: AMD (EPYC) and Intel (Xeon) are front and center. Expect more SKUs optimized for orchestration, I/O, and memory bandwidth.
  • Memory leaders: Micron is called out by Morgan Stanley. Capacity, speed, and power efficiency will be differentiators.
  • Hyperscalers: AWS, Azure, Google Cloud will capture value with managed agent platforms, orchestration, and observability tooling.
  • Ecosystem software: orchestration frameworks, evaluation tools, prompt/plan verifiers, and RAG platforms stand to see rising demand.
  • Networking and storage: NVMe, low-latency fabrics, and scalable object stores support agent state and artifact pipelines.

It’s a broader, more competitive market than the “GPU-only” narrative suggests — exactly Morgan Stanley’s point.

Case Example: Planning-and-Verification Loops at Scale

Consider an enterprise support agent:

  1. Intake: classify a customer issue and retrieve history (CPU + vector DB + small GPU embedding)
  2. Plan: break down the resolution path, pick tools (CPU)
  3. Execute: query billing system, check entitlements, fetch contract (CPU + I/O)
  4. Generate: draft response or update knowledge article (GPU)
  5. Verify: run policy checks, retrieve citations, score confidence (CPU)
  6. Approve: send to human if confidence < threshold (CPU + workflow)
  7. Persist: log steps, decisions, and artifacts for audit (CPU + storage)

Multiply by thousands of tickets per hour across regions, and you’ve got a CPU- and memory-intensive backbone with GPU bursts — a canonical agentic workload.
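
A toy end-to-end version of that flow, with stand-in functions for each stage (the function names, the 0.8 confidence threshold, and the canned return values are all illustrative assumptions, not a specific vendor's API):

```python
# Toy ticket flow: CPU-side control with a GPU "burst" for drafting,
# a confidence gate, and a human handoff. All helpers are stand-ins.
CONFIDENCE_THRESHOLD = 0.8  # assumption: tuned per policy in practice

def classify_and_retrieve(ticket: str) -> dict:
    # Steps 1-3: intake, plan, execute (CPU + vector DB + external systems).
    return {"issue": "billing", "history": ["prior credit issued"]}

def draft_response(context: dict) -> str:
    # Step 4: generate (a GPU-backed model call in a real system).
    return "Here is a corrected invoice and a summary of the adjustment..."

def policy_check(draft: str, context: dict) -> float:
    # Step 5: verify (CPU); returns a confidence score between 0 and 1.
    return 0.92

def handle_ticket(ticket: str) -> str:
    context = classify_and_retrieve(ticket)
    draft = draft_response(context)
    confidence = policy_check(draft, context)
    if confidence < CONFIDENCE_THRESHOLD:      # Step 6: approve via human if unsure
        return "escalated to human reviewer"
    # Step 7: persist; a real system would log steps and artifacts for audit here.
    return draft

print(handle_ticket("I was double-charged last month"))
```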

How to Spot the Shift in the Wild

Signals to watch if you want to validate this thesis:

  • Cloud SKU mix: more “high-memory CPU” instances in AI reference architectures.
  • Vendor roadmaps: CPU designs touting orchestration, I/O, and cache optimizations for AI agents.
  • Managed services: new “agent reliability” offerings, evaluation SLAs, and policy-as-code for AI workflows.
  • Procurement RFPs: line items for verification pipelines, not just LLM inference throughput.
  • App telemetry: rising share of CPU time in agent controllers relative to GPU time in generation.

When these show up together, you’re seeing agentic AI move from prototypes to production.

Energy and Sustainability: A Real Constraint

Even if agentic systems spread work out over time, the sheer number of steps means your energy budget matters. Practical best practices:

  • Use retrieval and caching to reduce redundant model calls (see the sketch after this list).
  • Pre-compute embeddings and common sub-tasks during off-peak hours.
  • Right-size your instances and autoscale controllers thoughtfully.
  • Monitor “verification depth” — the number of checks per step — to avoid diminishing returns.
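
As a small example of the first point, repeated sub-tasks can be memoized so identical prompts are answered once. The expensive_model_call function below is a hypothetical stand-in for a GPU-backed API request:

```python
# Cache repeated model calls by prompt so identical sub-tasks are not recomputed.
# expensive_model_call is a hypothetical stand-in for a GPU-backed API request.
from functools import lru_cache

def expensive_model_call(prompt: str) -> str:
    # Placeholder: in practice this would be an LLM or embedding request.
    return f"answer for: {prompt}"

@lru_cache(maxsize=10_000)
def cached_model_call(prompt: str) -> str:
    return expensive_model_call(prompt)   # runs once per unique prompt

# Repeated agent steps that ask the same question hit the cache,
# saving redundant GPU calls, cost, and energy.
for _ in range(3):
    cached_model_call("Summarize policy section 4.2")
print(cached_model_call.cache_info())     # hits=2, misses=1 for this toy run
```

The same idea extends to retrieval caches and pre-computed embeddings keyed on content hashes.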

Expect sustainability to be part of the buying criteria, not an afterthought.

What About Nvidia?

Nvidia isn’t going away — far from it. Training frontier models, vision, speech, and high-throughput inference remain GPU-centric. The Morgan Stanley view doesn’t predict a GPU collapse; it predicts a more balanced capex profile and a richer ecosystem. In the near term, that’s good news for buyers who want optionality and for vendors who can differentiate beyond pure TFLOPs.

For context on Nvidia’s role in AI acceleration, see NVIDIA’s platform overview.

Strategic Moves for 2026–2030

  • Build on open standards and modular designs to avoid lock-in.
  • Invest early in evaluation tooling; it’s the heartbeat of reliability.
  • Pilot agent workflows where you can quantify ROI (SLAs, cost per ticket, cycle time).
  • Co-design with IT: memory footprints, storage tiers, and network topology all matter.
  • Keep a close eye on supply chain signals — especially for memory and packaging.

The winners won’t just deploy bigger models; they’ll run smarter systems.

FAQs

Q: What exactly is “agentic AI”?
A: Agentic AI refers to AI systems that autonomously plan and execute multi-step tasks, maintain state across steps, and use tools or APIs to achieve goals. Unlike a single prompt-and-response, agents orchestrate workflows with planning, verification, and persistence.

Q: Why would CPUs see a demand spike from agentic AI?
A: Agents are orchestration-heavy. They involve branching logic, context switching, I/O, and verification loops — tasks that map well to CPU architectures. GPUs still handle generation and perception, but CPUs run the control plane.

Q: Does this mean GPUs become less important?
A: No. GPUs remain indispensable for training and high-throughput inference. The shift is toward a balanced architecture where CPUs and memory play a larger role as agentic workloads scale.

Q: What’s the size of the opportunity according to Morgan Stanley?
A: They estimate agentic AI could add $32.5–$60 billion in incremental data-center CPU demand by 2030, on top of a CPU market already exceeding $100 billion. They also project a 20–30% CAGR for agentic infrastructure overall. Source: TechStartups.

Q: Who are the likely beneficiaries?
A: CPU vendors like AMD and Intel, memory leaders like Micron, hyperscalers building agent platforms (AWS, Azure, Google Cloud), and software providers specializing in orchestration, evaluation, and observability.

Q: What are the main risks to this outlook?
A: Supply chain constraints (especially for memory and advanced packaging) and energy limits in data centers. Both could slow deployments or raise costs.

Q: How should enterprises get started with agents?
A: Begin with bounded, high-ROI processes; choose an orchestration framework that supports tool use and verification; separate GPU-heavy steps from CPU-heavy control; invest early in eval pipelines; and plan for significant memory and storage growth.

Q: Where can I learn more about agentic techniques?
A: Check out ReAct prompting, Generative Agents, and practical docs like LangChain Agents. For broader AI updates, see OpenAI’s blog.

The Bottom Line

Agentic AI is pushing the AI stack beyond single-shot generation into reliable, long-running workflows. That evolution makes CPUs and memory first-class citizens again — not at the expense of GPUs, but alongside them. Morgan Stanley’s forecast — $32.5–$60 billion in additional data-center CPU demand by 2030 — underscores how fast the center of gravity is moving toward orchestration, verification, and state.

If you’re building or buying AI, optimize for workflows, not just models. If you’re allocating capex, think balanced fleets and big memory. And if you’re watching the market, look past headline models to the control planes quietly running the show. That’s where the next decade’s AI infrastructure value will compound.

Discover more at InnoVirtuoso.com

I would love some feedback on my writing, so if you have any, please don’t hesitate to leave a comment here or on any platform that is convenient for you.

For more on tech and other topics, explore InnoVirtuoso.com anytime. Subscribe to my newsletter and join our growing community—we’ll create something magical together. I promise, it’ll never be boring! 


Thank you all—wishing you an amazing day ahead!
