SemanticScuttle - klotz.me » klotz: llm+reasoning+ai

klotz: llm* + reasoning* + ai*

ByteDance Research Releases DAPO: A Fully Open-Sourced LLM Reinforcement Learning System at Scale

ByteDance Research has released DAPO (Decoupled Clip and Dynamic sAmpling Policy Optimization), an open-source reinforcement learning system for LLMs that aims to improve reasoning ability and address reproducibility issues in large-scale RL training. DAPO introduces four techniques: Clip-Higher, Dynamic Sampling, Token-level Policy Gradient Loss, and Overlong Reward Shaping. It reaches a score of 50 on the AIME 2024 benchmark with the Qwen2.5-32B model.
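
As a rough illustration of three of the techniques named above, the sketch below shows one way Clip-Higher (separate lower and upper clip bounds), a token-level loss (averaging over all tokens in the batch rather than per sample), and Dynamic Sampling (dropping prompt groups whose rewards are all identical) are commonly described. It is not ByteDance's implementation: the function names, the input layout, and the default clip values are assumptions made for this example.

```python
import math
from typing import List


def token_level_clipped_loss(
    logp_new: List[List[float]],
    logp_old: List[List[float]],
    advantages: List[float],
    eps_low: float = 0.2,    # assumed lower clip range (illustrative)
    eps_high: float = 0.28,  # assumed, larger upper clip range (illustrative)
) -> float:
    """Clipped surrogate loss averaged over every token in the batch.

    logp_new / logp_old hold per-token log-probabilities for each sampled
    response; advantages holds one scalar advantage per response.
    """
    total, n_tokens = 0.0, 0
    for new_seq, old_seq, adv in zip(logp_new, logp_old, advantages):
        for lp_new, lp_old in zip(new_seq, old_seq):
            ratio = math.exp(lp_new - lp_old)
            # Asymmetric clipping: the upper bound is looser than the lower one.
            clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
            total += min(ratio * adv, clipped * adv)
            n_tokens += 1
    # Token-level mean: long responses contribute more terms than short ones.
    return -total / n_tokens


def keep_group(rewards: List[float]) -> bool:
    """Dynamic-sampling style filter: keep a prompt only if its sampled
    responses disagree, since all-equal rewards give zero advantage."""
    return len(set(rewards)) > 1
```

With unchanged log-probabilities, a one-token response with advantage +1 and a three-token response with advantage -1 give a loss of 0.5 here, whereas averaging per response first would give 0. That difference is the point of weighting by tokens.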

2025-03-21 Tags: llm, reinforcement learning, dapo, open source, bytedance, ai, machine learning, reasoning, aime, qwen2.5 by klotz

Gary Marcus on LLMs don’t do formal reasoning - Apple

“we found no evidence of formal reasoning in language models …. Their behavior is better explained by sophisticated pattern matching—so fragile, in fact, that changing names can alter results by ~10%!”

2024-10-24 Tags: mehrdad farajtabar, llm, ai, reasoning, apple, gary marcus by klotz

All About AI Agents: Autonomy, Reasoning, Alignment, and More

This article provides a comprehensive overview of AI agents, discussing their core traits, technical aspects, and practical applications. It covers topics like autonomy, reasoning, alignment, and the role of AI agents in daily life.

Emerging Prominence of AI Agents: Agents are increasingly used for day-to-day tasks, but there is still confusion about how to define them and how to use them effectively.
Core Traits and Autonomy: Julia Winn explores the nuances of AI agents' autonomy and proposes a spectrum of agentic behavior to assess their suitability.
AI Alignment and Safety: Tarik Dzekman discusses the challenges of aligning AI agents with creators' goals, particularly focusing on safety and unintended consequences.
Tool Calling and Reasoning: Tula Masterman examines how AI agents bridge tool use with reasoning and the challenges they face in tool calling.
Proprietary vs. Open-Source AI: Gadi Singer compares the advantages and limitations of proprietary and open-source AI products for implementing agents.

2024-10-14 Tags: ai, agents, autonomy, reasoning, alignment, llm, gadi singer, tula masterman, tarik dzekman, julia winn by klotz

LLMs Can’t Plan, But Can Help Planning in LLM-Modulo Frameworks

The article argues that Large Language Models (LLMs) cannot reliably plan or verify their own plans, and proposes an LLM-Modulo framework that uses their strengths more effectively. The framework pairs LLMs with external model-based verifiers in a loop that generates, evaluates, and refines candidate plans, so that correctness rests on the verifiers rather than on the LLM.
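
The generate, evaluate, and improve cycle described above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the paper's implementation: `llm_modulo_loop`, `toy_propose`, and `toy_verifier` are hypothetical names, the "LLM" is a stub function, and the verifier is a hand-written precondition check.

```python
from typing import Callable, List, Optional

Plan = List[str]
Critique = List[str]


def llm_modulo_loop(
    propose: Callable[[Critique], Plan],
    verifiers: List[Callable[[Plan], Critique]],
    max_rounds: int = 5,
) -> Optional[Plan]:
    """Ask the generator for a plan, run every external verifier on it,
    and feed the collected critiques back until no verifier objects."""
    feedback: Critique = []
    for _ in range(max_rounds):
        plan = propose(feedback)
        feedback = [msg for verify in verifiers for msg in verify(plan)]
        if not feedback:
            return plan  # every verifier accepted the plan
    return None  # no verified plan within the round budget


def toy_propose(feedback: Critique) -> Plan:
    """Stand-in for an LLM call: revises its guess when critiqued."""
    plan = ["pick up block A", "stack A on B"]
    if any("unstack" in msg for msg in feedback):
        plan.insert(0, "unstack C from A")
    return plan


def toy_verifier(plan: Plan) -> Critique:
    """Stand-in for a model-based verifier with one hard-coded precondition."""
    if "unstack C from A" not in plan:
        return ["precondition failed: A is not clear; unstack C from A first"]
    return []


result = llm_modulo_loop(toy_propose, [toy_verifier])
```

In this toy run the first proposal is rejected, the critique is passed back, and the second proposal is accepted. The generator is never trusted to judge its own output; a plan is returned only once the external checks pass.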

"Simply put, we take the stance that LLMs are amazing giant external non-veridical memories that can serve as powerful cognitive orthotics for human or machine agents, if rightly used."

2024-07-04 Tags: llm, planning, reasoning, ai, framework, modulo, arxiv, memory, ontology by klotz
