SemanticScuttle - klotz.me » klotz: training

Tutorial to Fine-Tuning Mistral 7B with QLoRA Using Axolotl for Efficient LLM Training

This tutorial guides readers on how to fine-tune the Mistral 7B large language model using QLoRA with the Axolotl library, focusing on managing limited GPU resources for efficient training. It covers environment setup, dataset creation, configuration of QLoRA hyperparameters, the fine-tuning process, and testing the fine-tuned model.

2025-02-10 Tags: mistral 7b, qlora, axolotl, fine-tuning, llm, training, lora by klotz

Grokfast: Accelerated Grokking by Amplifying Slow Gradients

This paper presents a method to accelerate the grokking phenomenon, where a model's generalization improves with more training iterations after an initial overfitting stage. The authors propose a simple algorithmic modification to existing optimizers that filters out the fast-varying components of the gradients and amplifies the slow-varying components, thereby accelerating the grokking effect.

2024-08-19 Tags: grokking, deep learning, optimization techniques, gradient filtering, llm, training, eric hartford by klotz
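
The gradient filtering summarized in the Grokfast entry above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the exponential-moving-average form of the filter, in which a running average acts as the low-pass (slow) component and is added back to the raw gradient with a gain. The function name `grokfast_ema_step`, the zero-initialized average, and the values of `alpha` and `lamb` are assumptions chosen for illustration.

```python
from typing import List, Tuple


def grokfast_ema_step(
    grad: List[float],
    ema: List[float],
    alpha: float = 0.98,  # assumed smoothing factor of the low-pass filter
    lamb: float = 2.0,    # assumed gain applied to the slow component
) -> Tuple[List[float], List[float]]:
    """Return (filtered_grad, new_ema) for one optimizer step.

    The EMA tracks the slow-varying part of the gradient; adding it back
    with gain `lamb` amplifies that part relative to the fast-varying part.
    """
    new_ema = [alpha * m + (1.0 - alpha) * g for m, g in zip(ema, grad)]
    filtered = [g + lamb * m for g, m in zip(grad, new_ema)]
    return filtered, new_ema


# A constant (slow) gradient is amplified toward (1 + lamb) times its value.
ema = [0.0]
for _ in range(500):
    slow_filtered, ema = grokfast_ema_step([1.0], ema, alpha=0.9, lamb=2.0)

# An alternating (fast) gradient leaves the EMA near zero, so it is barely amplified.
ema_fast = [0.0]
for step in range(500):
    sign = 1.0 if step % 2 == 0 else -1.0
    fast_filtered, ema_fast = grokfast_ema_step([sign], ema_fast, alpha=0.9, lamb=2.0)
```

In a real training loop the filtered gradient would replace each parameter's gradient just before the optimizer step, which is why the modification works with existing optimizers.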

Training_PRO

Training PRO extension for oobabooga WebUI - recent dev version. Key features and changes from the main Training in WebUI include:

Chunking: the Precise Raw Text Slicer (PRTS) splits text on sentence boundaries so that every chunk starts and ends cleanly
Overlapping chunking: adds extra overlap blocks, chosen by logical rules
Custom scheduler: FP_low_epoch_annealing keeps the LR constant for the first epoch and uses cosine annealing for the rest
Target selector: a normal LoRA targets q and v; the selector lets it be used with (q k v o) or (q k v)
DEMENTOR LEARNING (experimental): a chunking method for training on long-form text in a low number of epochs

2024-06-29 Tags: training, llm, oobabooga, extension, github by klotz
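
The schedule shape described for FP_low_epoch_annealing in the entry above (constant LR for the first epoch, cosine for the rest) can be sketched as a multiplier on the base learning rate. This is only an illustration of that description, not the extension's implementation: the function name `lr_multiplier` and its parameters are hypothetical, and any warmup or minimum LR the extension may apply is left out.

```python
import math


def lr_multiplier(step: int, steps_per_epoch: int, total_steps: int) -> float:
    """Return the factor to multiply the base learning rate by at `step`.

    Hypothetical helper: holds the factor at 1.0 during the first epoch,
    then follows a half-cosine from 1.0 down to 0.0 over the remaining steps.
    """
    if step < steps_per_epoch:
        return 1.0  # first epoch: constant learning rate
    decay_steps = max(1, total_steps - steps_per_epoch)
    progress = min(1.0, (step - steps_per_epoch) / decay_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))


# Example: 3 epochs of 100 steps each.
schedule = [lr_multiplier(s, steps_per_epoch=100, total_steps=300) for s in range(301)]
```

A function of this shape could be passed to a lambda-style scheduler in a training framework, which multiplies the optimizer's base LR by the returned factor at each step.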

How to train your large language model: A new technique speeds up the process

This article discusses the process of training a large language model (LLM) using reinforcement learning from human feedback (RLHF) and a new alternative method called Direct Preference Optimization (DPO). The article explains how these methods help align the LLM with human expectations and make it more efficient.

2024-05-15 Tags: llm, reinforcement learning, human feedback, openai, chatgpt, rlhf, dpo, training by klotz
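
For readers who want to see what DPO optimizes, the sketch below computes the standard DPO loss for a single preference pair. It is not taken from the article: the function name `dpo_loss`, the argument names, and the default `beta` are assumptions, and the inputs are assumed to be summed token log-probabilities of each response under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed strength of the pull toward the reference model
) -> float:
    """Return -log(sigmoid(margin)) for one (chosen, rejected) response pair."""
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_gain - rejected_gain)
    # Numerically stable form of -log(sigmoid(margin)) = log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the margin is zero and the loss is log(2).
baseline = dpo_loss(-5.0, -7.0, -5.0, -7.0)

# Raising the chosen response's log-probability relative to the reference lowers the loss.
improved = dpo_loss(-4.0, -7.0, -5.0, -7.0)
```

Unlike RLHF, this objective needs no separately trained reward model or sampling loop: the preference pairs are scored directly against the reference model.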

Pluralsight Cloud Guru Pricing

2024-05-06 Tags: training, aws, gcp by klotz

CodeCrafters | Advanced programming challenges

2024-05-04 Tags: codecrafters, training, github by klotz

How To Train Your LLM Efficiently? Best Practices for Small-Scale Implementation - MarkTechPost

2023-11-26 Tags: llm, training, self-hosted by klotz

Mastering LLM Techniques: Training

Delving into transformer networks

2023-11-18 Tags: nvidia, llm, training, transformers, deep learning by klotz

crème de la crème of AI courses

This repository is a curated collection of links to courses and other resources on Artificial Intelligence (AI).