klotz: tiny-llama*

  1. This article explores how to boost the performance of a small language model using supervision from a larger one via knowledge distillation. It gives a step-by-step guide to distilling knowledge from a teacher model (Llama 2-70B) into a student model (TinyLlama) using unlabeled in-domain data and targeted prompting (a rough sketch of that loop follows below).

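As a rough illustration of the recipe the bookmarked article describes (the teacher pseudo-labels unlabeled in-domain text via a targeted prompt, and the student is fine-tuned on the outputs), here is a minimal Python sketch using Hugging Face transformers. This is not the article's code: the model ids, prompt template, placeholder corpus, and helper names are assumptions for illustration.

```python
# Rough sketch of teacher -> student distillation: the teacher (Llama 2-70B)
# pseudo-labels unlabeled in-domain documents with a targeted prompt, and the
# student (TinyLlama) is fine-tuned on the resulting prompt/completion pairs.
# Model ids, the prompt template, and the corpus are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER_ID = "meta-llama/Llama-2-70b-chat-hf"      # assumed teacher checkpoint
STUDENT_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed student checkpoint
PROMPT = "Summarize the following document in one paragraph:\n{document}\n"

def generate_pseudo_labels(teacher, tokenizer, docs):
    """Run the teacher over unlabeled in-domain docs with a targeted prompt."""
    labeled = []
    for doc in docs:
        prompt = PROMPT.format(document=doc)
        inputs = tokenizer(prompt, return_tensors="pt").to(teacher.device)
        with torch.no_grad():
            out = teacher.generate(**inputs, max_new_tokens=256)
        # Keep only the newly generated tokens as the pseudo-label.
        completion = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                                      skip_special_tokens=True)
        labeled.append({"prompt": prompt, "completion": completion})
    return labeled

def distill_step(student, tokenizer, example, optimizer):
    """One supervised fine-tuning step on a teacher-labeled example."""
    text = example["prompt"] + example["completion"]
    batch = tokenizer(text, return_tensors="pt",
                      truncation=True).to(student.device)
    # Standard causal-LM loss; the model shifts the labels internally.
    loss = student(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

if __name__ == "__main__":
    # TinyLlama reuses the Llama 2 tokenizer, so one tokenizer serves both here.
    tokenizer = AutoTokenizer.from_pretrained(STUDENT_ID)
    teacher = AutoModelForCausalLM.from_pretrained(TEACHER_ID, device_map="auto")
    student = AutoModelForCausalLM.from_pretrained(STUDENT_ID).to("cuda")
    optimizer = torch.optim.AdamW(student.parameters(), lr=2e-5)
    docs = ["<unlabeled in-domain text>"]  # placeholder corpus
    for ex in generate_pseudo_labels(teacher, tokenizer, docs):
        distill_step(student, tokenizer, ex, optimizer)
```

A real run would batch the teacher's generation, train for multiple epochs, and checkpoint the student; the sketch keeps only the two core steps of the recipe.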