SemanticScuttle - klotz.me » klotz: transformers+llama

klotz: transformers* + llama*

Chess Llama - Training a tiny Llama model to play chess

This blog post details the training of 'Chess Llama', a small Llama model designed to play chess. It covers the inspiration behind the project (Chess GPT), the dataset used (the Lichess Elite database), the training process using Hugging Face Transformers, and the model's performance (an Elo rating of 1350-1400). It also links to a playable demo of the model and to the source code.

2025-07-21 Tags: chess, llama, llm, machine learning, artificial intelligence, deep learning, transformers, huggingface, chessgpt, uci, pgn by klotz

The Big LLM Architecture Comparison

A detailed comparison of the architectures of recent large language models (LLMs), including DeepSeek-V3, OLMo 2, Gemma 3, Mistral Small 3.1, Llama 4, Qwen3, SmolLM3, and Kimi K2, focusing on key design choices and their impact on performance and efficiency.

2025-07-19 Tags: llm, large language models, deep learning, ai, architecture, deepseek, olmo, gemma, mistral, llama, qwen, smollm, kimi, moe, attention, transformers by klotz

Transformer Lab: Experiment with Large Language Models

Transformer Lab is an open-source application for advanced LLM engineering that lets users interact with, train, fine-tune, and evaluate large language models on their own computer. It supports a variety of models, hardware, and inference engines, and includes features such as RAG, dataset building, and a REST API.

2025-04-11 Tags: electron, transformers, llama, lora, mlx, llms, rlhf, llm, github by klotz

