SemanticScuttle - klotz.me » Tags: llama.cpp

uogbuji/OgbujiPT: Toolkit for using self-hosted large language models, through langchain & other means

2023-06-22 Tags: uche ogbuji, llama.cpp, llama, chat, gpt, llm by klotz

TheBloke/Wizard-Vicuna-30B-Uncensored-GGML · Hugging Face

Explanation of the new k-quant methods
The new methods available are:

GGML_TYPE_Q2_K - "type-1" 2-bit quantization in super-blocks containing 16 blocks, each block having 16 weights. Block scales and mins are quantized with 4 bits. This ends up effectively using 2.5625 bits per weight (bpw).
GGML_TYPE_Q3_K - "type-0" 3-bit quantization in super-blocks containing 16 blocks, each block having 16 weights. Scales are quantized with 6 bits. This ends up using 3.4375 bpw.
GGML_TYPE_Q4_K - "type-1" 4-bit quantization in super-blocks containing 8 blocks, each block having 32 weights. Scales and mins are quantized with 6 bits. This ends up using 4.5 bpw.
GGML_TYPE_Q5_K - "type-1" 5-bit quantization. Same super-block structure as GGML_TYPE_Q4_K, resulting in 5.5 bpw.
GGML_TYPE_Q6_K - "type-0" 6-bit quantization. Super-blocks with 16 blocks, each block having 16 weights. Scales are quantized with 8 bits. This ends up using 6.5625 bpw.
GGML_TYPE_Q8_K - "type-0" 8-bit quantization. Only used for quantizing intermediate results. The difference from the existing Q8_0 is that the block size is 256. All 2-6 bit dot products are implemented for this quantization type.
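
The bits-per-weight figures above can be checked with a little arithmetic. The following is a minimal sketch, not llama.cpp code: the names `KQuant` and `bits_per_weight` are hypothetical, and it assumes (beyond what the text states) that each super-block also stores one 16-bit float scale, plus one 16-bit float min for the "type-1" formats.

```python
from dataclasses import dataclass

# Assumption: each super-block carries a 16-bit float scale, and "type-1"
# formats carry a second 16-bit float for the min. Not stated in the text above.
FP16_BITS = 16


@dataclass(frozen=True)
class KQuant:
    """Hypothetical description of one k-quant layout."""
    name: str
    weight_bits: int   # bits stored per quantized weight
    blocks: int        # blocks per super-block
    block_size: int    # weights per block
    scale_bits: int    # bits per quantized block scale (and per block min)
    has_min: bool      # True for "type-1" (scale and min), False for "type-0"


def bits_per_weight(q: KQuant) -> float:
    """Total bits in one super-block divided by the weights it holds."""
    n_weights = q.blocks * q.block_size
    values_per_block = 2 if q.has_min else 1          # scale, or scale + min
    block_meta = q.blocks * q.scale_bits * values_per_block
    super_meta = FP16_BITS * values_per_block         # assumed fp16 scale/min
    total_bits = n_weights * q.weight_bits + block_meta + super_meta
    return total_bits / n_weights


FORMATS = [
    KQuant("Q2_K", 2, 16, 16, 4, True),
    KQuant("Q3_K", 3, 16, 16, 6, False),
    KQuant("Q4_K", 4, 8, 32, 6, True),
    KQuant("Q5_K", 5, 8, 32, 6, True),
    KQuant("Q6_K", 6, 16, 16, 8, False),
]

for q in FORMATS:
    print(f"{q.name}: {bits_per_weight(q)} bpw")
```

Under these assumptions the sketch reproduces 3.4375, 4.5, 5.5, and 6.5625 bpw for Q3_K through Q6_K. For Q2_K it yields 2.625 rather than the quoted 2.5625; the quoted figure is what you get if only a single 16-bit value per super-block is counted (656 bits over 256 weights).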

2023-06-08 Tags: huggingface, llama, vicuna, quantization, k-quant, gpu, cpu, acceleration, llama.cpp by klotz

The Most Simple Way to Set Up ChatGPT Locally

2024-01-18 Tags: llm, quantization, llama.cpp, self-hosted, tutorial by klotz

Tail Free Sampling – Trenton Bricken – Interested in Machine Learning, Neuroscience, and Original Glazed Krispy Kreme Doughnuts.

2023-06-12 Tags: llm, sampling, llama.cpp, llama, text processing by klotz

Show HN: Grammar Generator App for Llama.cpp | Hacker News

2024-02-13 Tags: gbnf, llama.cpp, text extraction, functions, json, github by klotz

Reddit LocalLlama GPU / CPU

2023-06-09 Tags: llama, llama.cpp, llm, reddit, gpu, nvidia, 3090, 4090, machine learning by klotz

localllm/llm-tool at main · GoogleCloudPlatform/localllm

llm-tool provides a command-line utility for running large language models locally. It includes scripts for pulling models from the internet, starting them, and managing them using various commands such as 'run', 'ps', 'kill', 'rm', and 'pull'. Additionally, it offers a Python script named 'querylocal.py' for querying these models. The repository also come

2024-02-08 Tags: llm, localllama, self-hosted, google, gcp, foss, llama.cpp, github by klotz

LLM Large Language Model Toolkit: Google

The "LLM" toolkit is a command-line utility and Python library for working with large language models. Users can execute prompts directly from their terminals, store the results in SQLite databases, generate embeddings, and perform various other tasks. The documentation covers setup, usage, OpenAI models, alternative models, embeddings, plugins, model aliases, Python APIs, prompt templates, logging, related tools, CLI references, contributing, and change logs.

2024-02-08 Tags: llm, cli, google, llama.cpp by klotz

Llama2 Models - Hugging Face

2023-07-19 Tags: llama.cpp, llama, llama2, facebook, meta, hugging face, thebloke, models, llm by klotz

Llama.cpp vicuna

2023-06-17 Tags: llama.cpp, vicuna, llm by klotz
