Tags: 6502 assembly*


  1. A 25,000-parameter decoder-only transformer that runs on an unmodified Commodore 64. Written in hand-coded 6502 assembly, the model implements real multi-head causal self-attention, RMSNorm, and softmax, the same building blocks used in modern LLM architectures, despite the extreme constraints of a 1 MHz processor.
    Key technical details include:
    - Uses int8 quantized parameters with per-tensor shift scaling.
    - Implements fixed-point arithmetic (Q8.8) for activations.
    - Features a 128-token BPE vocabulary and a 20-token context window.
    - Includes tools for quantization-aware training (QAT) to ensure model accuracy on integer hardware.
    - Runs on real C64 hardware or on emulators such as VICE, averaging 60 seconds per token.
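
The int8 weights with per-tensor shift scaling and the Q8.8 activations listed above can be illustrated with a small Python sketch. It is not the project's code: the function names (`to_q88`, `quantize_tensor`, `dot_q88`), the rule for choosing the shift, and the saturation behavior are all assumptions made for illustration. The idea shown is that a power-of-two scale lets the dequantization step be a plain right shift, which is cheap on a CPU with no multiply or divide instruction.

```python
import math

Q = 8  # fractional bits in Q8.8 (8 integer bits, 8 fractional bits)


def to_q88(x: float) -> int:
    """Convert a float to signed Q8.8, saturating to the 16-bit range."""
    v = round(x * (1 << Q))
    return max(-32768, min(32767, v))


def from_q88(v: int) -> float:
    """Convert a Q8.8 integer back to a float."""
    return v / (1 << Q)


def quantize_tensor(weights):
    """Quantize a list of floats to int8 with one power-of-two scale.

    Returns (q, shift) such that w is approximately q / 2**shift.
    Assumption: the shift is the largest non-negative value that keeps
    the biggest weight inside the int8 range.
    """
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return [0] * len(weights), 0
    shift = max(0, math.floor(math.log2(127 / max_abs)))
    q = [max(-128, min(127, round(w * 2 ** shift))) for w in weights]
    return q, shift


def dot_q88(q_weights, shift, acts_q88):
    """Dot product of int8 weights and Q8.8 activations, result in Q8.8.

    The accumulator carries 8 + shift fractional bits, so one arithmetic
    right shift by `shift` brings it back to Q8.8.
    """
    acc = sum(w * a for w, a in zip(q_weights, acts_q88))
    return max(-32768, min(32767, acc >> shift))


weights = [0.5, -0.25, 0.125]
acts = [to_q88(a) for a in (1.0, 2.0, -4.0)]
q, shift = quantize_tensor(weights)
result = dot_q88(q, shift, acts)
print(q, shift, from_q88(result))
```

With these inputs every value is exactly representable, so the integer result matches the floating-point dot product; in general both the weight rounding and the final shift introduce small errors, which is the kind of error quantization-aware training is meant to absorb.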

SemanticScuttle - klotz.me: tagged with "6502 assembly"