klotz: gpu*


  1. This article explores the concept of quantization in large language models (LLMs) and its benefits, including reducing memory usage and improving performance. It also discusses various quantization methods and their effects on model quality.
    2024-07-14 by klotz
  2. A discussion post on Reddit's LocalLLaMA subreddit about logging the output of running models and monitoring performance, specifically for debugging errors and warnings and for performance analysis. The post also calls for flags to write logs to flat files and to record GPU metrics (GPU utilization, RAM usage, Tensor Core usage, etc.) for troubleshooting and analytics.
  3. GPU-accelerated LLMs on the Orange Pi 5, which features a Mali-G610 GPU. The authors used Machine Learning Compilation (MLC) techniques to reach 2.3 tok/sec on Llama3-8b, 2.5 tok/sec on Llama2-7b, and 5 tok/sec on RedPajama-3b. They also ran a Llama-2 13b model at 1.5 tok/sec on a 16 GB Orange Pi 5+.
  4. 2023-12-24 by klotz
  5. Purge everything first, then install the latest driver from the distro repo, and finally the CUDA toolkit from the NVIDIA repo.
    2023-11-21 by klotz
  6. 2023-08-03 by klotz
  7. 2023-07-15 by klotz
  8. 2023-06-09 by klotz
  9. 2023-06-09 by klotz
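The purge-then-reinstall order in bookmark 5 might look like the following on Ubuntu/Debian, assuming NVIDIA's CUDA network repo is already configured. Package names are examples, and the purge step removes all existing NVIDIA packages, so do not run this blindly:

```shell
# 1. Purge everything NVIDIA/CUDA related first.
sudo apt-get purge -y '^nvidia-.*' '^cuda-.*' '^libcudnn.*'
sudo apt-get autoremove -y

# 2. Install the latest driver from the distro repo.
sudo apt-get update
sudo apt-get install -y nvidia-driver-535   # version number is an example

# 3. Last, install the toolkit from NVIDIA's repo.
sudo apt-get install -y cuda-toolkit
```

Installing the driver from the distro repo but the toolkit from NVIDIA's repo avoids the driver-version conflicts that arise when both come from mismatched sources.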
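The quantization overview in bookmark 1 can be illustrated with a minimal sketch of symmetric per-tensor int8 weight quantization. This is illustrative only; the methods surveyed in the article (e.g. grouped or 4-bit schemes) are more involved:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map floats to [-127, 127]."""
    scale = np.max(np.abs(weights)) / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 2.0], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# int8 storage uses 4x less memory than float32; the round-trip
# error per weight is bounded by half the scale step.
```

The memory saving comes from storing `q` (1 byte/weight) plus a single scale instead of 4-byte floats; the quality cost is the rounding error visible in `w_hat`.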
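For the flat-file GPU-metric logging wished for in bookmark 2, one approach is polling `nvidia-smi` in CSV query mode and parsing each row. A sketch; the example row below is made up, and the query fields assume a reasonably recent NVIDIA driver:

```python
import csv
import io
import subprocess

QUERY = "timestamp,utilization.gpu,memory.used,memory.total"

def read_gpu_rows():
    """Poll nvidia-smi once; returns one CSV row per GPU, or [] if unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", f"--query-gpu={QUERY}",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True, timeout=5,
        ).stdout
    except (OSError, subprocess.SubprocessError):
        return []  # no NVIDIA GPU or driver on this host
    return list(csv.reader(io.StringIO(out)))

def parse_row(row):
    """Turn one CSV row into a dict of numeric metrics."""
    ts, util, used, total = (field.strip() for field in row)
    return {"timestamp": ts, "util_pct": int(util),
            "mem_used_mib": int(used), "mem_total_mib": int(total)}

rows = read_gpu_rows()  # [] on hosts without an NVIDIA GPU

# Illustrative row in the shape nvidia-smi emits with the flags above:
example = ["2024/07/14 12:00:00.000", " 37", " 2048", " 16384"]
metrics = parse_row(example)
```

Appending each parsed sample to a CSV or JSONL file in a loop gives the flat-file log the post asks for; per-kernel Tensor Core utilization needs heavier tooling (e.g. profilers), not `nvidia-smi`.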


SemanticScuttle - klotz.me: Tags: gpu

Propulsed by SemanticScuttle