This article details research into the optimal architecture for small language models (~70M parameters): it explores depth-width tradeoffs, compares different architectures, and introduces Dhara-70M, a diffusion model that delivers 3.8x higher throughput with improved factuality.
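To make the depth-width tradeoff concrete, here is a minimal sketch (not the article's code) that estimates parameter counts for a standard decoder-only transformer and sweeps depth/width pairs near a 70M budget; the 32k vocabulary, tied embeddings, and `d_ff = 4 * d_model` are assumptions, not the article's actual configuration.

```python
# Rough parameter accounting for a decoder-only transformer, used to compare
# deep-narrow vs. shallow-wide configurations at a fixed ~70M budget.
# Assumptions: tied input/output embeddings, d_ff = 4 * d_model, biases/LayerNorm ignored.

VOCAB_SIZE = 32_000  # assumed vocabulary size; the article's tokenizer may differ

def transformer_params(depth: int, d_model: int, vocab: int = VOCAB_SIZE) -> int:
    """Approximate total parameter count (embedding + per-layer weights)."""
    embed = vocab * d_model                  # tied token embedding / LM head
    attn_per_layer = 4 * d_model * d_model   # Q, K, V, and output projections
    mlp_per_layer = 8 * d_model * d_model    # two MLP matrices with d_ff = 4 * d_model
    return embed + depth * (attn_per_layer + mlp_per_layer)

if __name__ == "__main__":
    # Sweep a few depth/width combinations and keep those within 15% of 70M parameters.
    target, tol = 70e6, 0.15
    for depth in (4, 6, 8, 12, 16, 24, 32):
        for d_model in (256, 320, 384, 448, 512, 640, 768):
            n = transformer_params(depth, d_model)
            if abs(n - target) / target < tol:
                print(f"depth={depth:>2}  d_model={d_model:>4}  params≈{n/1e6:5.1f}M")
```

Running this prints several roughly iso-parameter configurations (e.g. 16 layers at width 512 vs. 32 layers at width 384), which is the kind of comparison the depth-width study performs.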
Mercury dLLMs are up to 10x faster and cheaper than current LLMs, offering high-quality text generation with improved reasoning and error correction.