klotz: self-hosted*

Bookmarks on this page are managed by an admin user.

0 bookmark(s) - Sort by: Date ↓ / Title / - Bookmarks from other users for this tag

  1. Compare the performance of different LLM that can be deployed locally on consumer hardware. The expected good response and scores are generated by GPT-4.
    2023-06-09 Tags: , , , by klotz
  2. # obtain the original LLaMA model weights and place them in ./models
    ls ./models
    65B 30B 13B 7B tokenizer_checklist.chk tokenizer.model

    # install Python dependencies
    python3 -m pip install -r requirements.txt

    # convert the 7B model to ggml FP16 format
    python3 convert.py models/7B/

    # quantize the model to 4-bits (using q4_0 method)
    ./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin q4_0

    # run the inference
    ./main -m ./models/7B/ggml-model-q4_0.bin -n 128
    2023-06-05 Tags: , , , , by klotz

Top of the page

First / Previous / Next / Last / Page 1 of 0 SemanticScuttle - klotz.me: Tags: self-hosted

About - Propulsed by SemanticScuttle