Alphabet's (GOOG, GOOGL) Google said Wednesday in a blog post that it launched DiffusionGemma, an experimental open-source large language model that uses a diffusion-based approach to text generation instead of the standard token-by-token method.
The model produces blocks of text in parallel rather than sequentially, allowing it to generate text up to four times faster than conventional autoregressive models.
The model is designed to run on high-end GPUs, with quantized configurations that fit within about 18GB of VRAM. The architecture is intended for research and speed-focused interactive workflows, including inline editing, code infilling and other non-linear text generation tasks.
The company said DiffusionGemma prioritizes speed over output quality and recommended using its standard Gemma 4 models for applications requiring higher-quality responses.
Price: $358.45, Change: $-5.82, Percent Change: -1.60%