klotz: abp*

0 bookmark(s) - Sort by: Date ↓ / Title / - Bookmarks from other users for this tag

  1. This paper investigates how large language models (LLMs) solve mental math problems. It proposes that meaningful computation occurs late in the network (in terms of layer depth) and primarily at the last token, receiving information from other tokens in specific middle layers. The authors introduce techniques (CAMA and ABP) to identify an 'All-for-One' subgraph responsible for this behavior, demonstrating its sufficiency and necessity for high performance across various models and input styles.

Top of the page

First / Previous / Next / Last / Page 1 of 0 SemanticScuttle - klotz.me: Tags: abp

About - Propulsed by SemanticScuttle