Tags: deep learning + machine learning


  1. This article explores the application of reinforcement learning (RL) to Partial Differential Equations (PDEs), highlighting the complexity and challenges involved in controlling systems described by PDEs compared to Ordinary Differential Equations (ODEs). It discusses various approaches, including genetic programming and neural network-based methods, and presents experimental results on controlling PDE systems like the diffusion equation and Kuramoto–Sivashinsky equation. The author emphasizes the potential of machine learning to improve understanding and control of PDE systems, which have wide-ranging applications in fields like fluid dynamics, thermodynamics, and engineering.
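
A minimal sketch of the kind of control problem described in item 1, assuming a 1D diffusion equation with an additive control forcing, periodic boundaries, and an explicit finite-difference step; the class name, reward, and parameters are illustrative assumptions, not the article's setup:

```python
import numpy as np

class DiffusionControlEnv:
    """Toy RL environment: u_t = nu * u_xx + f(x), with f supplied by the agent."""

    def __init__(self, n=64, nu=0.1, dt=1e-4):
        self.n, self.nu, self.dt = n, nu, dt
        self.dx = 1.0 / n
        self.u = np.sin(2 * np.pi * np.linspace(0.0, 1.0, n, endpoint=False))

    def step(self, action):
        """action: length-n array, an additive forcing term (the control)."""
        u = self.u
        # explicit finite-difference Laplacian with periodic boundaries
        lap = (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / self.dx ** 2
        self.u = u + self.dt * (self.nu * lap + action)
        reward = -np.sum(self.u ** 2) * self.dx  # reward driving the field toward zero
        return self.u.copy(), reward
```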

  2. The article delves into how large language models (LLMs) store facts, focusing on the role of multi-layer perceptrons (MLPs) in this process. It explains the mechanics of MLPs, including matrix multiplication, bias addition, and the Rectified Linear Unit (ReLU) function, using the example of encoding the fact that Michael Jordan plays basketball. The article also discusses the concept of superposition, which allows models to store a vast number of features by utilizing nearly perpendicular directions in high-dimensional spaces.
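
A minimal sketch of one MLP block as described in item 2 (up-projection, bias addition, ReLU, down-projection); the dimensions, random weights, and the "fact" interpretation are illustrative assumptions, not values from any real model:

```python
import numpy as np

d_model, d_hidden = 768, 3072
rng = np.random.default_rng(0)
W_up = rng.standard_normal((d_hidden, d_model)) * 0.02    # up-projection weights
b_up = np.zeros(d_hidden)                                  # bias
W_down = rng.standard_normal((d_model, d_hidden)) * 0.02   # down-projection weights

def mlp(x):
    """x: residual-stream vector, e.g. the embedding carrying 'Michael Jordan'."""
    h = np.maximum(0.0, W_up @ x + b_up)  # matrix multiply, add bias, apply ReLU
    return W_down @ h                     # project back into the residual stream

x = rng.standard_normal(d_model)
print(mlp(x).shape)  # (768,)
```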

  3. The article explores the architectural changes that enable DeepSeek's models to perform well with fewer resources, focusing on Multi-Head Latent Attention (MLA). It discusses the evolution of attention mechanisms, from Bahdanau to Transformer's Multi-Head Attention (MHA), and introduces Grouped-Query Attention (GQA) as a solution to MHA's memory inefficiencies. The article highlights DeepSeek's competitive performance despite lower reported training costs.
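
A minimal sketch of Grouped-Query Attention as contrasted with MHA in item 3: several query heads share each key/value head, which shrinks the KV cache that MHA would otherwise keep per head. The shapes, head counts, and names are illustrative assumptions (a single PyTorch function, without MLA-style latent compression):

```python
import torch

def gqa(x, Wq, Wk, Wv, n_q=8, n_kv=2):
    """x: (batch, seq, d_model); n_q query heads share n_kv key/value heads."""
    B, T, d = x.shape
    hd = d // n_q                                         # per-head dimension
    q = (x @ Wq).view(B, T, n_q, hd).transpose(1, 2)      # (B, n_q, T, hd)
    k = (x @ Wk).view(B, T, n_kv, hd).transpose(1, 2)     # (B, n_kv, T, hd)
    v = (x @ Wv).view(B, T, n_kv, hd).transpose(1, 2)
    # each group of n_q // n_kv query heads reuses the same key/value head
    k = k.repeat_interleave(n_q // n_kv, dim=1)
    v = v.repeat_interleave(n_q // n_kv, dim=1)
    att = torch.softmax(q @ k.transpose(-2, -1) / hd ** 0.5, dim=-1)
    return (att @ v).transpose(1, 2).reshape(B, T, d)

d = 64
x = torch.randn(2, 10, d)
out = gqa(x, torch.randn(d, d), torch.randn(d, d // 4), torch.randn(d, d // 4))
print(out.shape)  # torch.Size([2, 10, 64])
```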

  4. Scaling reinforcement learning (RL) in deep learning models to surpass OpenAI's o1.

  5. The article introduces a new approach to language modeling called test-time scaling, which enhances performance by utilizing additional compute resources during testing. The authors present a method involving a curated dataset and a technique called budget forcing to control compute usage, allowing models to double-check answers and improve reasoning. The approach is demonstrated with the Qwen2.5-32B-Instruct language model, showing significant improvements on competition math questions.
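
A minimal sketch of the budget-forcing idea from item 5, assuming the decoding loop can append text to the model's thinking trace: below a minimum thinking budget, the end-of-thinking delimiter is suppressed by appending "Wait"; past a maximum budget, thinking is force-closed. `generate` is a placeholder for an LLM call, and the delimiters, budgets, and token counting are illustrative assumptions:

```python
def budget_forced_trace(question, generate, min_tokens=2000, max_tokens=8000):
    """generate(prompt, stop) -> (text, stop_reason); stop_reason is 'stop' or 'length'."""
    trace, spent = "<think>", 0
    while True:
        chunk, reason = generate(question + trace, stop="</think>")
        trace += chunk
        spent += len(chunk.split())                  # crude token count
        if reason == "stop" and spent < min_tokens:
            trace += " Wait"                         # suppress end-of-thinking, keep double-checking
        elif reason == "stop" or spent >= max_tokens:
            return trace + "</think>"                # force the model to move on and answer
```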

  6. The article explores the DeepSeek-R1 models, focusing on how reinforcement learning (RL) is used to develop advanced reasoning capabilities in AI. It discusses the DeepSeek-R1-Zero model, which learns reasoning without supervised fine-tuning, and the DeepSeek-R1 model, which combines RL with a small amount of supervised data for improved performance. The article highlights the use of distillation to transfer reasoning patterns to smaller models and addresses challenges and future directions in RL for AI.

  7. The self-attention mechanism is used to capture interactions between words within input and output sequences. It involves computing key, query, and value vectors, followed by matrix multiplications and a softmax transformation to produce an attention matrix.
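
A minimal sketch of that computation, single-head and in NumPy; the weight matrices are random placeholders and the dimensions are illustrative assumptions:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model) input embeddings for one sequence."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                 # query, key, value vectors
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # scaled query-key dot products
    scores -= scores.max(axis=-1, keepdims=True)     # for numerical stability
    A = np.exp(scores)
    A /= A.sum(axis=-1, keepdims=True)               # softmax -> attention matrix
    return A @ V                                     # attention-weighted mix of values

rng = np.random.default_rng(0)
d_model, d_head, T = 16, 8, 5
X = rng.standard_normal((T, d_model))
Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (5, 8)
```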

  8. Explore the intricacies of the attention mechanism that powers transformers.

  9. DeepSeek-R1 is a groundbreaking AI model that uses reinforcement learning to teach large language models to reason, outperforming models like OpenAI's o1 at a fraction of the computational cost.

  10. TinyZero is a reproduction of DeepSeek R1 Zero on countdown and multiplication tasks. It is built upon veRL and allows a 3B base LM to develop self-verification and search abilities through reinforcement learning.
