klotz: sft* + llm* + fine tuning*

0 bookmark(s) - Sort by: Date ↓ / Title / - Bookmarks from other users for this tag

  1. The article introduces a new approach to language modeling called test-time scaling, which enhances performance by utilizing additional compute resources during testing. The authors present a method involving a curated dataset and a technique called budget forcing to control compute usage, allowing models to double-check answers and improve reasoning. The approach is demonstrated with the Qwen2.5-32B-Instruct language model, showing significant improvements on competition math questions.

  2. 2024-01-17 Tags: , , by klotz

Top of the page

First / Previous / Next / Last / Page 1 of 0 SemanticScuttle - klotz.me: Tags: sft + llm + fine tuning

About - Propulsed by SemanticScuttle