SemanticScuttle - klotz.me » klotz: huggingface+machine learning


Chess Llama - Training a tiny Llama model to play chess

This blog post details the training of 'Chess Llama', a small Llama model designed to play chess. It covers the inspiration behind the project (Chess GPT), the dataset (the Lichess Elite database), the training process with Hugging Face Transformers, and the model's performance (an Elo rating of 1350-1400). It also links to a playable version of the model and to the source code.

2025-07-21 Tags: chess, llama, llm, machine learning, artificial intelligence, deep learning, transformers, huggingface, chessgpt, uci, pgn by klotz

Training Large Language Models with Interpreter Feedback using WebAssembly

This article details a method for training large language models (LLMs) for code generation using a secure, local WebAssembly-based code interpreter and reinforcement learning with Group Relative Policy Optimization (GRPO). It covers the setup, training process, evaluation, and potential next steps.

2025-04-04 Tags: huggingface, llm, training, code generation, webassembly, wasm, grpo, reinforcement learning, axolotl, code interpreter, fine-tuning, python by klotz
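
The summary above mentions Group Relative Policy Optimization (GRPO). As a rough illustration of its core idea, the sketch below scores each sampled completion relative to the other completions drawn for the same prompt. It is not taken from the linked article: the function name, the 0/1 reward values, and the use of the population standard deviation are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group.

    `rewards` holds one scalar per completion sampled for a single prompt.
    `eps` (an assumed constant) avoids division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards: 1.0 if the generated code passed in the interpreter, else 0.0.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat their group's average get a positive advantage and the rest a negative one, so no separate value model is needed to provide a baseline.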

Ultrascale Playbook

A guide from Hugging Face to training large language models at very large scale, covering techniques, tools, and best practices.

2025-03-13 Tags: scale, machine learning, huggingface, production engineering, llm by klotz

Qodo-Embed-1-1.5B

Qodo-Embed-1-1.5B is a state-of-the-art code embedding model designed for retrieval tasks in the software development domain. It supports multiple programming languages and is optimized for natural language-to-code and code-to-code retrieval, making it highly effective for applications such as code search and retrieval-augmented generation.

2025-03-04 Tags: qodo-embed-1, code, embedding, llm, software development, huggingface by klotz
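
Retrieval with an embedding model such as the one described above usually comes down to ranking stored vectors by similarity to a query vector. The sketch below shows that ranking step only. It does not call Qodo-Embed-1-1.5B: the three-dimensional vectors, the snippet names, and the choice of cosine similarity are all hypothetical stand-ins for real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_snippets(query_vec, snippet_vecs):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in snippet_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (hypothetical); a real model would produce much longer vectors.
snippet_vecs = {
    "parse_json": [0.9, 0.1, 0.0],
    "open_socket": [0.0, 0.2, 0.9],
    "load_yaml": [0.7, 0.6, 0.1],
}
query_vec = [1.0, 0.2, 0.0]  # stands in for an embedded natural-language query

ranking = rank_snippets(query_vec, snippet_vecs)
```

In a code-search or retrieval-augmented generation setup, the top-ranked snippets would then be shown to the user or passed to a language model as context.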

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

The article explores the DeepSeek-R1 models, focusing on how reinforcement learning (RL) is used to develop advanced reasoning capabilities in AI. It discusses the DeepSeek-R1-Zero model, which learns reasoning without supervised fine-tuning, and the DeepSeek-R1 model, which combines RL with a small amount of supervised data for improved performance. The article highlights the use of distillation to transfer reasoning patterns to smaller models and addresses challenges and future directions in RL for AI.

2025-02-06 Tags: deepseek-r1, reinforcement learning, distillation, llm, huggingface, machine learning by klotz

Training and Finetuning Embedding Models with Sentence Transformers v3

This article explains how to use the Sentence Transformers library to train and fine-tune embedding models for applications such as retrieval-augmented generation, semantic search, and semantic textual similarity. It covers the training components: dataset format, loss function, training arguments, evaluators, and trainer.

2024-05-28 Tags: sentence transformers, finetune, embedding, models, similarity, llm, huggingface by klotz

Setting up a Text Summarisation Project (Part 2) | by Heiko Hotz | Dec, 2021 | Towards Data Science

2021-12-06 Tags: transformer, summarization, huggingface, gpt-3, zero-shot, machine learning, nlp by klotz

AI Weekly: Researchers attempt an open source alternative to GitHub's Copilot | VentureBeat

2021-09-25 Tags: transformers, fine-tune, deep learning, gpt-3, huggingface, codex by klotz

samrawal/emacs-secondmate: An open-source, mini imitation of GitHub Copilot for Emacs.