klotz: multi-layer perceptrons + deep learning

  1. This article explains how large language models (LLMs) store facts, focusing on the role of multi-layer perceptrons (MLPs). It walks through the mechanics of an MLP block (matrix multiplication, bias addition, and the Rectified Linear Unit (ReLU) function) using the example of encoding the fact that Michael Jordan plays basketball. It also discusses superposition, which lets a model store far more features than it has dimensions by using nearly perpendicular directions in high-dimensional space.
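The mechanics described above can be sketched in a few lines of NumPy. This is a toy illustration, not the article's code: the 4-dimensional embedding, the hand-picked directions `e_michael`, `e_jordan`, and `e_basketball`, the single hidden neuron, and the down-projection step are all assumptions made for the example. The second part checks the superposition claim numerically by drawing more random unit vectors than there are dimensions and measuring how close to perpendicular they are.

```python
import numpy as np


def relu(x):
    """Rectified Linear Unit: clips negative values to zero."""
    return np.maximum(x, 0.0)


def mlp_block(x, w_up, b_up, w_down, b_down):
    """Matrix multiply, add bias, apply ReLU, then project back down."""
    hidden = relu(w_up @ x + b_up)
    return w_down @ hidden + b_down


# Hypothetical feature directions in a 4-dimensional embedding space.
e_michael = np.array([1.0, 0.0, 0.0, 0.0])
e_jordan = np.array([0.0, 1.0, 0.0, 0.0])
e_basketball = np.array([0.0, 0.0, 1.0, 0.0])

# One hidden neuron acting as an AND gate: its pre-activation is
# (michael + jordan) . x - 1, which is positive only if both are present.
w_up = (e_michael + e_jordan).reshape(1, 4)
b_up = np.array([-1.0])

# When the neuron fires, it writes the "basketball" direction to the output.
w_down = e_basketball.reshape(4, 1)
b_down = np.zeros(4)

print(mlp_block(e_michael + e_jordan, w_up, b_up, w_down, b_down))
print(mlp_block(e_michael, w_up, b_up, w_down, b_down))


def max_abs_cosine(n_vectors, dim, seed=0):
    """Largest |cosine similarity| between any two of n random unit vectors."""
    rng = np.random.default_rng(seed)
    vecs = rng.standard_normal((n_vectors, dim))
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
    cos = vecs @ vecs.T
    np.fill_diagonal(cos, 0.0)  # ignore each vector's similarity with itself
    return np.abs(cos).max()


# Twice as many directions as dimensions, yet every pair stays far from parallel.
print(max_abs_cosine(n_vectors=2000, dim=1000))
```

The first call passes an input containing both name directions, so the neuron fires and the block outputs the basketball direction; the second passes only one of them, so ReLU zeroes the neuron and nothing is written. Random vectors are used in the second part only because they are easy to generate; a trained model would learn its feature directions rather than sample them.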
