NVIDIA immediately supports the new Google model. DiffusionGemma is coming to RTX

6/11/2026

Competition in the artificial intelligence market shows no signs of slowing down. Google DeepMind has officially introduced DiffusionGemma, a new open AI model designed for very fast content generation. Moments after the launch, NVIDIA announced full support for the model on its RTX and DGX platforms. The manufacturer claims that, with the right optimizations, users can expect significantly higher performance and can run the model locally without cloud services.

DiffusionGemma is set to accelerate text generation

Google's new model is based on the Gemma 4 architecture and takes a different approach from traditional autoregressive models. Instead of generating tokens one at a time, DiffusionGemma can process larger groups of tokens simultaneously, which in practice significantly shortens response times. The model has over 25 billion parameters, but only a portion of them is active during operation, which improves computational efficiency.

Google also emphasizes the open nature of the project. DiffusionGemma is available under the Apache 2.0 license, so developers and companies can freely use it and build their own projects on top of it.

The model supports both text and images, with a maximum context of 256,000 tokens. This makes it suitable for many advanced applications in data analysis, content creation, and building AI agents. According to its creators, however, its greatest advantage remains speed: in some scenarios the model is said to be up to four times faster than traditional solutions based on sequential generation.
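To make the contrast with token-by-token generation concrete, here is a minimal sketch that only counts model passes. It is not DiffusionGemma's actual algorithm: the group size and the number of refinement passes per group are hypothetical values picked for illustration, and neither comes from Google or NVIDIA.

```python
import math


def autoregressive_passes(num_tokens: int) -> int:
    """Sequential generation: one model pass per generated token."""
    return num_tokens


def block_parallel_passes(num_tokens: int, block_size: int, refine_steps: int) -> int:
    """Parallel generation: each group of tokens is produced together
    over a fixed number of refinement passes (assumed, not measured)."""
    blocks = math.ceil(num_tokens / block_size)
    return blocks * refine_steps


# Hypothetical values for illustration only.
tokens = 512
block = 32
steps = 8

sequential = autoregressive_passes(tokens)
parallel = block_parallel_passes(tokens, block, steps)
print(f"sequential passes: {sequential}")
print(f"block-parallel passes: {parallel}")
print(f"ratio: {sequential / parallel:.1f}x")
```

The pass count is only a rough proxy for speed, because a pass over a whole group of tokens may cost more than a pass that produces a single token. Real-world gains depend on the hardware and the workload.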


NVIDIA prepared support right at launch

NVIDIA quickly capitalized on the launch by presenting ready-made environments for running the model on its own hardware. Support covers GeForce RTX cards for home users, professional RTX PRO platforms, and AI computers from the DGX family. The company claims that Tensor cores and CUDA make very high performance possible without additional configuration.

The results reported for DGX systems are particularly impressive. According to the manufacturer, DGX Spark can generate around 150 tokens per second, while more advanced configurations can reach several hundred tokens per second when running the model locally.

NVIDIA also emphasizes that users do not need to rely on cloud services or pay for each query. The entire setup can run directly on a computer with the appropriate hardware. This is an important argument for people working on artificial intelligence development, who increasingly seek local solutions that give them greater control over their data. DiffusionGemma can already be run on the GeForce RTX 5090, among other cards, and on DGX platforms equipped with the latest NVIDIA chips.
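As a rough illustration of what the quoted throughput means in practice, the sketch below converts tokens per second into waiting time. The 150 tokens per second figure is the one NVIDIA gives for DGX Spark; the response lengths are hypothetical, and the calculation ignores prompt processing and any startup overhead.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


DGX_SPARK_TOKENS_PER_SECOND = 150  # approximate figure quoted by NVIDIA

# Hypothetical response lengths, chosen only for illustration.
for length in (150, 600, 3000):
    seconds = generation_seconds(length, DGX_SPARK_TOKENS_PER_SECOND)
    print(f"{length} tokens: about {seconds:.0f} s")
```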

DiffusionGemma is a new open AI model from Google DeepMind that focuses on very fast content generation and local operation. NVIDIA has supported it on its RTX graphics cards and DGX systems from day one, offering additional optimizations that enhance performance. Everything suggests that the new model could become an interesting alternative to the popular solutions currently used by developers and artificial intelligence enthusiasts.

Source: wccftech

Choose TV Editorial Team
