SemanticScuttle - klotz.me » Tags: llm+reasoning+machine learning

Tags: llm* + reasoning* + machine learning*

0 bookmark(s) - Sort by: Date ↓ / Title /

ByteDance Research Releases DAPO: A Fully Open-Sourced LLM Reinforcement Learning System at Scale

ByteDance Research has released DAPO (Dynamic Sampling Policy Optimization), an open-source reinforcement learning system for LLMs, aiming to improve reasoning abilities and address reproducibility issues. DAPO includes innovations like Clip-Higher, Dynamic Sampling, Token-level Policy Gradient Loss, and Overlong Reward Shaping, achieving a score of 50 on the AIME 2024 benchmark with the Qwen2.5-32B model.

2025-03-21 Tags: llm, reinforcement learning, dapo, open source, bytedance, ai, machine learning, reasoning, aime, qwen2.5 by klotz

What DeepSeek Signals About Where AI Is Headed

The article discusses the implications of DeepSeek's R1 model launch, highlighting five key lessons: the shift from pattern recognition to reasoning in AI models, the changing economics of AI, the coexistence of proprietary and open-source models, innovation driven by silicon scarcity, and the ongoing advantages of proprietary models despite DeepSeek's impact.

2025-02-06 Tags: deepseek r1, hbr, llm, machine learning, reasoning by klotz

First / Previous / Next / Last / Page 1 of 0

SemanticScuttle - klotz.me

Tags: llm* + reasoning* + machine learning*

Linked Tags

Related Tags