This document details how to run and fine-tune Gemma 3 models (1B, 4B, 12B, and 27B) using Unsloth, covering setup with Ollama and llama.cpp, and addressing potential float16 precision issues. It also highlights Unsloth's unique ability to run Gemma 3 in float16 on machines like Colab notebooks with Tesla T4 GPUs.
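A minimal sketch of what the Unsloth loading step looks like on a float16-only GPU such as a Colab Tesla T4, assuming the `FastLanguageModel.from_pretrained` API and an `unsloth/gemma-3-4b-it` checkpoint name; exact argument names can vary between Unsloth releases.

```python
# Hedged sketch: load Gemma 3 with Unsloth on a float16-only GPU (e.g. Tesla T4).
# The checkpoint name and argument set are assumptions; check the Unsloth docs for your version.
import torch
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-4b-it",  # assumed Gemma 3 checkpoint name
    max_seq_length=2048,
    dtype=torch.float16,   # T4s lack bfloat16; Unsloth works around float16 overflow issues
    load_in_4bit=True,     # optional 4-bit quantization to fit smaller GPUs
)
```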
This article details a method for training large language models (LLMs) for code generation using a secure, local WebAssembly-based code interpreter and reinforcement learning with Group Relative Policy Optimization (GRPO). It covers the setup, training process, evaluation, and potential next steps.
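The core of such a setup is a reward function that executes each generated program in a sandbox and scores it by whether its tests pass. The sketch below shows that shape using TRL's `GRPOTrainer`; `run_in_sandbox` is a placeholder for the article's WebAssembly interpreter, and the base model name and dataset are illustrative assumptions.

```python
# Sketch of a GRPO training loop for code generation, assuming TRL's GRPOTrainer API.
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

def run_in_sandbox(code: str, tests: str) -> bool:
    """Placeholder for an isolated (e.g. WebAssembly-based) interpreter call."""
    raise NotImplementedError

def correctness_reward(completions, tests, **kwargs):
    # Reward 1.0 when the generated code passes its tests, 0.0 otherwise.
    return [1.0 if run_in_sandbox(code, t) else 0.0 for code, t in zip(completions, tests)]

train_dataset = Dataset.from_list([
    {"prompt": "Write a Python function add(a, b) that returns a + b.",
     "tests": "assert add(2, 3) == 5"},
])

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",   # assumed small base model for illustration
    reward_funcs=correctness_reward,
    args=GRPOConfig(output_dir="grpo-code"),
    train_dataset=train_dataset,          # extra columns ("tests") are passed to the reward function
)
trainer.train()
```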
A look at this year’s crop of LoRA alternatives, including SVF, SVFT, MiLoRA, PiSSA, and LoRA-XS, all based on SVD (Singular Value Decomposition). The article compares these techniques to the original LoRA method for fine-tuning Large Language Models; a sketch of the shared SVD idea follows the table below.
Method | Description | Key Feature(s) | Reference |
---|---|---|---|
LoRA | Freezes the model and trains a small pair of low-rank “adapter” matrices. | Saves memory and compute cycles by reducing the number of trainable parameters. | arxiv.org/abs/2106.09685 |
SVF | Uses SVD on the model’s weight matrices and fine-tunes the singular values directly. | More economical in parameters than LoRA; makes tuned models composable. | arxiv.org/abs/2501.06252v2 |
SVFT | Adds trainable weights beyond just the diagonal and evaluates several alternatives for placing them. | Provides more trainable values than the diagonal alone, useful for better fine-tuning. | arxiv.org/abs/2405.19597 |
PiSSA | Tunes only the large (principal) singular values. | Designed to approximate full fine-tuning by adapting the principal singular components. | arxiv.org/abs/2404.02948 |
MiLoRA | Tunes only the small (minor) singular values. | Retains the base model’s knowledge while adapting to new tasks. | arxiv.org/abs/2406.09044 |
LoRA-XS | Similar to PiSSA but with a slightly different mechanism. | Shows good results with significantly fewer parameters than LoRA. | arxiv.org/abs/2405.17604 |
DoRA | Splits weights into magnitudes and directions, then tunes those. | | arxiv.org/abs/2402.09353 |
AdaLoRA | Complex mechanism for finding the best tuning rank for a given budget of trainable weights. | | arxiv.org/abs/2303.10512 |
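All of these methods start from the same observation: a frozen weight matrix can be factored as W = UΣVᵀ, and fine-tuning only parts of that factorization is far cheaper than updating W itself. Below is a minimal PyTorch sketch of the SVF-style variant, training only the singular values; it illustrates the shared mechanism and is not a reproduction of any one paper's implementation.

```python
# SVF-style sketch: freeze U and V from the SVD of a linear layer's weight and
# train only the vector of singular values.
import torch
import torch.nn as nn

class SVFLinear(nn.Module):
    def __init__(self, base_linear: nn.Linear):
        super().__init__()
        # W = U diag(S) V^T ; keep U and V frozen, make S the only trainable tensor.
        U, S, Vh = torch.linalg.svd(base_linear.weight.data, full_matrices=False)
        self.register_buffer("U", U)
        self.register_buffer("Vh", Vh)
        self.S = nn.Parameter(S)  # trainable singular values
        bias = base_linear.bias.detach().clone() if base_linear.bias is not None else None
        self.register_buffer("bias", bias)

    def forward(self, x):
        W = self.U @ torch.diag(self.S) @ self.Vh   # reconstruct the adapted weight
        return nn.functional.linear(x, W, self.bias)

# A 64x64 layer now exposes only 64 trainable parameters (its singular values).
layer = SVFLinear(nn.Linear(64, 64))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # -> 64
```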
The article explains six essential strategies for customizing Large Language Models (LLMs) to better meet specific business needs or domain requirements. These strategies include Prompt Engineering, Decoding and Sampling Strategy, Retrieval Augmented Generation (RAG), Agent, Fine-Tuning, and Reinforcement Learning from Human Feedback (RLHF). Each strategy is described with its benefits, limitations, and implementation approaches to align LLMs with specific objectives.
This tutorial guides readers on how to fine-tune the Mistral 7B large language model using QLoRA with the Axolotl library, focusing on managing limited GPU resources for efficient training. It covers environment setup, dataset creation, configuration of QLoRA hyperparameters, the fine-tuning process, and testing the fine-tuned model.
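Axolotl drives QLoRA from a YAML config; as a rough equivalent, the sketch below shows the 4-bit quantization and LoRA-adapter hyperparameters such a config controls, expressed directly with `transformers` and `peft`. The hyperparameter values are illustrative, not the tutorial's.

```python
# QLoRA setup sketch with transformers + peft (illustrative values, not the article's config).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4, the QLoRA default
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # only the low-rank adapters are trainable
```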
This tutorial demonstrates how to fine-tune the Llama-2 7B Chat model for Python code generation using QLoRA, gradient checkpointing, and SFTTrainer with the Alpaca-14k dataset.
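A hedged sketch of the SFTTrainer part of that recipe, assuming a recent TRL release (constructor arguments vary across versions). The public `tatsu-lab/alpaca` split sliced to 14k rows stands in for the Alpaca-14k dataset, and the QLoRA quantization step shown in the previous entry is omitted here.

```python
# SFTTrainer + gradient checkpointing sketch (TRL); dataset and hyperparameters are illustrative.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("tatsu-lab/alpaca", split="train[:14000]")   # stand-in for Alpaca-14k

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-chat-hf",
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="llama2-python-code",
        gradient_checkpointing=True,     # trade extra compute for lower activation memory
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
    ),
)
trainer.train()
```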
The article by Krishan Walia provides a beginner-friendly guide on fine-tuning the DeepSeek R1 model using Python. It highlights how developers can transform a general-purpose AI model into a specialized, domain-specific language model for various applications.
The article explores techniques to improve Large Language Model (LLM) accuracy, focusing on Lamini Memory Tuning. It discusses fine-tuning methods like Low-Rank Adaptation (LoRA), the advantages and disadvantages of fine-tuning, and practical steps using Lamini to achieve higher precision in SQL query generation. The author demonstrates a step-by-step approach to creating a high-quality dataset, fine-tuning, and evaluating model accuracy.
The post discusses the feasibility of fine-tuning an encoder-decoder model to translate Egyptian Middle Kingdom hieroglyphics into English. The author suggests that, with sufficient training data and a tokenizer that includes Egyptian characters, the model could learn to interpret hieroglyphics fluently. Comments from users mention using plugins and the existing knowledge in models as alternatives to fine-tuning.
A list of 13 open-source tools for building and managing production-ready AI applications. The tools cover various aspects of AI development, including LLM tool integration, vector databases, RAG pipelines, model training and deployment, LLM routing, data pipelines, AI agent monitoring, LLM observability, and AI app development.