Insight Guru Inc.

03/19/2026 | Press release | Distributed by Public on 03/19/2026 08:23

Nvidia’s Inference Power Play: Outmaneuvering Google And Amazon

March 19th, 2026 by Trefis Team
NVDA (NVIDIA) | Market price: 180 | Trefis estimate: 174 | Downside: -3.69%

Nvidia's GTC (GPU Technology Conference) event on Monday packed in a lot of announcements. The biggest shift? Notable progress on the AI inference strategy, with Nvidia focusing on countering custom chips from rivals such as Google's TPU v7 Ironwood and Amazon's Inferentia/Trainium.


Investors have long worried that inference, which is the stage where AI models generate responses, answer questions, or enable agents to perform tasks, could erode Nvidia's dominance. While Nvidia has led in model training, which is largely a one-time process, inference is constant and repetitive, raising fears that cheaper, specialized rivals could capture share.

While Nvidia's revenue grew by about 65% last year and is on track to grow at a similar pace this year, the stock trades at a mere 22x forward earnings. Is the low multiple a sign that Nvidia stock is nearing the peak of a classic silicon cycle?
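For readers less familiar with the metric, a forward earnings multiple is simply price divided by expected next-twelve-month earnings per share. A minimal sketch, with all numbers invented for illustration (these are not Nvidia's actual figures):

```python
# Illustrative forward P/E arithmetic (all numbers hypothetical).
def forward_pe(price: float, forward_eps: float) -> float:
    """Share price divided by expected next-12-month earnings per share."""
    return price / forward_eps

# A stock at a hypothetical $176 with $8.00 of expected forward EPS trades at 22x.
print(round(forward_pe(176.0, 8.0), 1))  # -> 22.0
```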

Why Is Inference The New Battleground?

The inference market is likely to be massive. Jensen Huang highlighted the shift: computing demand for inference is now growing 100,000 times faster than for training. By the end of 2026, inference is expected to account for the bulk of AI compute, and it could eventually represent 90% of the lifetime cost of an AI system. Winning in inference is crucial for Nvidia's staying power.

Even a hint of rivals winning orders has rattled Nvidia stock. These concerns were amplified months ago when NVDA dropped almost 5% on reports that Meta was exploring using Google's custom chips.

Previously, this shift posed a major risk to Nvidia. Its chips were designed to handle everything, which made them expensive and power-hungry for simple tasks. Tech giants like Google and Amazon saw this weakness and built custom chips like the TPU v7 (Ironwood) and Inferentia to cut costs. Google's TPU v7 is considerably cheaper at scale than general-purpose GPUs, threatening Nvidia's dominance in the most profitable part of the market.

Inference demand may not just come from hyperscalers. Governments are also emerging as a new buyer class, building out national AI infrastructure. Nvidia is likely to play a bigger role in this sovereign AI wave.

How Is Nvidia Looking To Win Inference?

Nvidia is pushing back hard with the next-generation Vera Rubin platform. At the heart of this shift is the new NVIDIA Groq 3 LPX, a dedicated inference accelerator integrated directly into the Rubin architecture. Nvidia secured Groq's technology through a roughly $20 billion deal a few months ago.

The Rubin GPU handles the heavy upfront work, such as reading and processing long prompts. Groq 3 then takes over to generate the actual responses, one token (the basic unit of data in generative AI) at a time, delivering high speed and consistently low latency. This division of labor is key for real-time, agentic AI that needs instant, smooth performance.
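The division of labor described above resembles the disaggregated prefill/decode pattern used in modern inference serving: one device processes the whole prompt once, then a latency-optimized device generates the response token by token. A minimal Python sketch of the pattern; all class names and the toy token rule are hypothetical, since the actual Rubin/Groq interface is not public:

```python
# Hypothetical sketch of disaggregated inference serving:
# a "prefill" device processes the full prompt once, then a
# "decode" device generates the response one token at a time.

class PrefillDevice:
    """Stands in for the GPU that reads and processes the long prompt."""
    def run(self, prompt_tokens: list[str]) -> dict:
        # In a real system this would build the KV cache for the prompt.
        return {"kv_cache": list(prompt_tokens)}

class DecodeDevice:
    """Stands in for the latency-optimized token generator."""
    def next_token(self, state: dict) -> str:
        # Toy rule: derive each new token from the last cached one.
        token = state["kv_cache"][-1] + "'"
        state["kv_cache"].append(token)
        return token

def generate(prompt: list[str], n_tokens: int) -> list[str]:
    state = PrefillDevice().run(prompt)  # heavy upfront work, done once
    decoder = DecodeDevice()
    # Decode is sequential: each token depends on everything before it.
    return [decoder.next_token(state) for _ in range(n_tokens)]

print(generate(["hello"], 3))  # -> ["hello'", "hello''", "hello'''"]
```

The point of the split is that prefill is compute-bound and parallel while decode is latency-bound and sequential, so each stage can run on hardware tuned for its profile.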

From a customer perspective, the economics could be compelling. Combined, Vera Rubin and Groq 3 racks deliver up to 35x more output per unit of power (throughput per megawatt), enabling far greater scale and unlocking significantly higher revenue potential from massive models. This becomes more important in the era of so-called agentic AI, since agents consume far more tokens than, say, a conversation with a chatbot.
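Since data centers are increasingly power-constrained, the figure that matters to operators is tokens served per megawatt: at fixed power, a 35x throughput gain multiplies potential token revenue roughly 35x. A back-of-the-envelope sketch, with all rates and prices invented purely for illustration:

```python
# Back-of-the-envelope: token revenue scales with throughput per megawatt
# when power is the binding constraint. All numbers are hypothetical.

def annual_revenue_per_mw(tokens_per_sec_per_mw: float,
                          dollars_per_million_tokens: float) -> float:
    seconds_per_year = 365 * 24 * 3600
    tokens_per_year = tokens_per_sec_per_mw * seconds_per_year
    return tokens_per_year / 1e6 * dollars_per_million_tokens

baseline = annual_revenue_per_mw(50_000, 2.0)        # hypothetical baseline rack
improved = annual_revenue_per_mw(50_000 * 35, 2.0)   # same power, 35x throughput

print(improved / baseline)  # -> 35.0 (revenue scales linearly with throughput)
```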

What Other Advantages Does Nvidia Have?

Nvidia has a moat that Google and Amazon lack. Its software stack, CUDA, has become the industry standard, and most developers would not want to rewrite their code for Google TPUs. Nvidia also controls the entire "AI Factory" stack, selling the chips, the networking, and the software as one unit, and it has led since the earliest stage of the AI era.

Nvidia expects this pivot to be lucrative. The company projects $1 trillion in sales of Blackwell and Vera Rubin systems through the end of 2027 - double the $500 billion opportunity it had previously cited for those platforms through 2026. This massive upward revision reflects surging demand for NVIDIA's inference-optimized infrastructure. That said, it remains to be seen whether the 70%-plus gross margins Nvidia has generated in the training-focused phase can hold up.

While Nvidia stock is poised for solid growth, not everyone is comfortable with stock-specific trades because of the inherent volatility and risk involved. If you want to ride out market swings, protect wealth, and grow your money over the long run, diversified portfolios are the right choice. The Trefis High Quality Portfolio can help you stay invested, capture upside, and mitigate the downside associated with any individual stock.

Insight Guru Inc. published this content on March 19, 2026, and is solely responsible for the information contained herein. Distributed via Public Technologies (PUBT), unedited and unaltered, on March 19, 2026 at 14:23 UTC. If you believe the information included in the content is inaccurate or outdated and requires editing or removal, please contact us at [email protected]