NVIDIA VR200 (Vera Rubin) already in the hands of customers. 576 GB of HBM4 and a major performance leap.

2/27/2026

NVIDIA confirms its entry into the next phase of the AI race. The first test samples of the Vera Rubin platform (VR200) have already reached customers, and full-scale deliveries are set to begin in the second half of 2026. The company promises a significant increase in performance while also reducing the cost of model training and inference. This is a response to record demand: in fiscal year 2026, NVIDIA reported $215.9 billion in revenue, with $68.1 billion coming from the fourth quarter alone.

576 GB HBM4 and up to 100 PFLOPS in Superchip

The heart of the platform is the Rubin accelerator, offering up to 50 PFLOPS (FP4) per GPU and up to 100 PFLOPS in the Superchip configuration. Each GPU uses two compute chiplets and eight stacks of HBM4 memory, providing 288 GB per GPU and 576 GB in the Superchip module. The new Vera CPU (Armv9.2 "Olympus" cores) has 88 cores and 176 threads and works with up to 1.5 TB of LPDDR5X (SOCAMM). NVIDIA claims that with this architecture, training models with a trillion parameters may require up to four times fewer GPUs than in the Blackwell generation, and inference costs are expected to drop by as much as 10×.
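The memory and compute figures above are internally consistent, which a few lines of arithmetic can confirm. The sketch below uses only the numbers quoted in this article. The per-stack capacity and the number of GPUs per Superchip are derived values, and they assume the Superchip totals are plain multiples of the per-GPU figures.

```python
# Figures quoted in the article (NVIDIA's claims, not measurements).
HBM4_STACKS_PER_GPU = 8
HBM4_GB_PER_GPU = 288
FP4_PFLOPS_PER_GPU = 50
SUPERCHIP_HBM4_GB = 576
SUPERCHIP_FP4_PFLOPS = 100

# Derived: capacity of a single HBM4 stack.
gb_per_stack = HBM4_GB_PER_GPU / HBM4_STACKS_PER_GPU

# Assumption: Superchip totals are simple multiples of the per-GPU figures,
# so dividing them gives the implied number of GPUs per Superchip.
gpus_by_memory = SUPERCHIP_HBM4_GB / HBM4_GB_PER_GPU
gpus_by_compute = SUPERCHIP_FP4_PFLOPS / FP4_PFLOPS_PER_GPU

print(f"HBM4 per stack: {gb_per_stack:.0f} GB")
print(f"GPUs per Superchip implied by memory: {gpus_by_memory:.0f}")
print(f"GPUs per Superchip implied by compute: {gpus_by_compute:.0f}")
```

Both ratios point to the same GPU count per Superchip, so the memory and FP4 figures line up with each other.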

Cloud, Scale, and Competitive Advantage

Interest in VR200 is expected to be very high, and deployments in the data centers of the largest cloud providers are anticipated once mass production begins. If the announcements are confirmed, NVIDIA will solidify its position as a leader in AI infrastructure, setting new standards for memory density and computing power. The pace of deliveries and the real-world total cost of ownership (TCO) in production environments will be critical. On paper, VR200 represents a generational leap; in practice, everything will be determined by benchmarks and availability.

Vera Rubin (VR200) is NVIDIA's next step towards extreme AI scale: up to 100 PFLOPS and 576 GB of HBM4 per Superchip, and up to 1.5 TB of system memory. With record financial results, the company has the resources and demand to maintain its advantage. The second half of 2026 will show whether the promises of 4× fewer GPUs and 10× cheaper inference translate into real deployments.

Source: NVIDIA

Katarzyna Petru

Journalist, reviewer, and columnist for the "ChooseTV" portal