SemanticScuttle - klotz.me » klotz: explanations

Interpretable Causal Diffusion Language Models

Steerling-8B is an interpretable causal diffusion language model that combines masked diffusion language modeling with concept decomposition. It supports generation, attribution, steering, and extraction of hidden representations. It uses block-causal attention and decomposes hidden states into known and unknown concepts.

2026-02-24 Tags: attribution, concepts, models, decomposition, features, diffusion, interpretability, explanations, explainability, llms, generative-ai by klotz
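
The summary above mentions block-causal attention without defining it. The sketch below illustrates one common reading of the term, assumed here rather than taken from the Steerling-8B documentation: tokens are grouped into fixed-size blocks, each token attends to every token in its own block (bidirectionally) and to all tokens in earlier blocks, but never to later blocks. The function name `block_causal_mask` and the parameters `seq_len` and `block_size` are hypothetical and chosen only for illustration.

```python
def block_causal_mask(seq_len, block_size):
    """Build a boolean attention mask under an assumed block-causal rule.

    mask[i][j] is True when query position i may attend to key position j,
    i.e. when j's block index is not later than i's block index.
    """
    if seq_len < 0 or block_size <= 0:
        raise ValueError("seq_len must be >= 0 and block_size must be > 0")
    # Block index of every position: positions 0..block_size-1 are block 0, etc.
    block_ids = [i // block_size for i in range(seq_len)]
    return [
        [block_ids[j] <= block_ids[i] for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Illustrative usage: 6 positions grouped into blocks of 2.
mask = block_causal_mask(6, 2)
for row in mask:
    print("".join("1" if allowed else "0" for allowed in row))
# Prints:
# 110000
# 110000
# 111100
# 111100
# 111111
# 111111
```

With `block_size=1` this reduces to an ordinary causal (lower-triangular) mask, and with `block_size >= seq_len` it becomes fully bidirectional, which is one way to see why such a mask suits a model that denoises a block at a time while still generating left to right.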
