This paper investigates how large language models (LLMs) solve mental math problems. It proposes that the meaningful computation occurs late in the network (in terms of layer depth) and primarily at the last token, which receives information from the other tokens in specific middle layers. The authors introduce two techniques, CAMA and ABP, to identify an 'All-for-One' subgraph responsible for this behavior, and demonstrate that the subgraph is both sufficient and necessary for high performance across various models and input styles.
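To make the claim concrete, the sketch below shows the general flavor of such an intervention on a toy attention-only model: outside a chosen window of "middle layers", the last token is prevented from attending to earlier positions, so any cross-token information must reach it inside that window. This is a hypothetical illustration, not the paper's CAMA or ABP procedures; the model, layer window, and tensor shapes are invented for the example.

```python
# Hypothetical toy sketch (not the paper's CAMA/ABP): restrict cross-token
# information flow into the last token to a chosen window of middle layers.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n_layers, n_tokens, d = 6, 5, 16
W_qkv = [torch.randn(3, d, d) / d**0.5 for _ in range(n_layers)]
window = range(2, 4)  # assumed "middle layers" where transfer is allowed

def run(x, knockout=False):
    for layer in range(n_layers):
        Wq, Wk, Wv = W_qkv[layer]
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = q @ k.T / d**0.5
        causal = torch.tril(torch.ones(n_tokens, n_tokens, dtype=torch.bool))
        scores = scores.masked_fill(~causal, float("-inf"))
        if knockout and layer not in window:
            # Outside the window, the last token may only attend to itself.
            scores[-1, :-1] = float("-inf")
        attn = F.softmax(scores, dim=-1)
        x = x + attn @ v  # residual-stream update (attention only, for brevity)
    return x[-1]          # last-token representation

x = torch.randn(n_tokens, d)
full = run(x)
ablated = run(x, knockout=True)
print((full - ablated).norm())  # how much the last-token state changes
```

If the paper's claim holds, restricting cross-token transfer to the right middle layers in this way should leave performance largely intact, while blocking that window should destroy it, which is the spirit of the sufficiency and necessity tests described above.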
This paper demonstrates that the inference pass of several open-weight large language models (LLMs) can be mapped to an exactly equivalent linear system for a given input sequence. It explores the use of the 'detached Jacobian' of that mapping to interpret semantic concepts within LLMs and potentially steer next-token prediction.
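The core trick can be reproduced on a toy, bias-free SwiGLU block: if the input-dependent gate is evaluated and then detached (treated as a constant), the block becomes exactly linear in its input, so the Jacobian of the detached function maps the input to the original output with no approximation. The sketch below is an illustrative example of this idea, not the paper's code; the weights, dimensions, and helper names are made up.

```python
# Minimal sketch of the "detached Jacobian" idea on a toy bias-free SwiGLU block:
# detaching the silu gate makes the map linear in x, so the Jacobian of the
# detached function reproduces the original output exactly at that input.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d_model, d_ff = 8, 16
W_gate = torch.randn(d_ff, d_model)
W_up = torch.randn(d_ff, d_model)
W_down = torch.randn(d_model, d_ff)

def swiglu(x):
    # Standard bias-free SwiGLU MLP: nonlinear in x because of the silu gate.
    return W_down @ (F.silu(W_gate @ x) * (W_up @ x))

def swiglu_detached(x):
    # Same computation, but the gate is treated as a constant (detached),
    # so the function is linear in x with no bias term.
    gate = F.silu(W_gate @ x).detach()
    return W_down @ (gate * (W_up @ x))

x0 = torch.randn(d_model)
J = torch.autograd.functional.jacobian(swiglu_detached, x0)
print(torch.allclose(J @ x0, swiglu(x0), atol=1e-5))  # True: J @ x0 == f(x0)
```

In the paper's setting, the analogous detachment is applied across the full network, so that the entire forward pass at a given input sequence is reproduced by a single input-dependent linear operator, which can then be inspected or manipulated.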