Tags: huggingface* + deep learning*

4 bookmark(s)

  1. This blog post details the training of 'Chess Llama', a small Llama model designed to play chess. It covers the inspiration behind the project (Chess GPT), the dataset used (the Lichess Elite database), the training process using Hugging Face Transformers, and the model's performance (an Elo rating of 1350-1400). It also includes links to try the model and view the source code.
  2. This article details a method for training large language models (LLMs) for code generation using a secure, local WebAssembly-based code interpreter and reinforcement learning with Group Relative Policy Optimization (GRPO). It covers the setup, training process, evaluation, and potential next steps.
  3. The article explores the DeepSeek-R1 models, focusing on how reinforcement learning (RL) is used to develop advanced reasoning capabilities in AI. It discusses the DeepSeek-R1-Zero model, which learns reasoning without supervised fine-tuning, and the DeepSeek-R1 model, which combines RL with a small amount of supervised data for improved performance. The article highlights the use of distillation to transfer reasoning patterns to smaller models and addresses challenges and future directions in RL for AI.
  4. This post describes detecting text sentiment in an unsupervised way using a Hugging Face zero-shot text classification model.


SemanticScuttle - klotz.me: tagged with "huggingface+deep learning"

Propulsed by SemanticScuttle