Tags: cpu* + llm*


  1. LocalScore is an open benchmark to evaluate local AI task performance across various hardware configurations, measuring Prompt Processing speed, Token Generation speed, Time-to-First-Token (TTFT), and a combined LocalScore.

  2. NVIDIA DGX Spark is a desktop AI system powered by the NVIDIA GB10 Grace Blackwell Superchip, delivering up to 1,000 AI TOPS (at FP4 precision) with 128 GB of unified memory. It is designed for prototyping, fine-tuning, and inference of large AI models.

  3. This article explains how to accurately quantize a Large Language Model (LLM) and convert it to the GGUF format for efficient CPU inference. It covers the use of an importance matrix (imatrix) together with the K-quantization method, with Gemma 2 Instruct as the worked example, and notes that the same approach applies to other models such as Qwen2, Llama 3, and Phi-3.

    2024-09-14, by klotz
  4. 2023-12-24, by klotz
  5. 2023-08-28, by klotz
  6. 2023-08-03, by klotz
  7. 2023-07-22, by klotz
  8. 2023-06-25, by klotz
  9. 2023-06-14, by klotz
  10. 2023-06-09, by klotz
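
The quantization workflow summarized in item 3 (convert to GGUF, compute an importance matrix, then apply a K-quant) might be sketched as follows. This is only an illustration and not the article's own script: it assumes the llama.cpp tools `convert_hf_to_gguf.py`, `llama-imatrix`, and `llama-quantize` with the flags as commonly documented, and every path and the helper name `build_quantize_commands` are hypothetical. The function only builds the command lines; it does not run them.

```python
from pathlib import Path


def build_quantize_commands(hf_model_dir, calibration_text, work_dir,
                            quant_type="Q4_K_M"):
    """Return three llama.cpp command lines as argv lists (nothing is executed).

    Tool names and flags are assumptions based on llama.cpp's usual CLI;
    check them against the version you have installed.
    """
    work = Path(work_dir)
    f16_model = work / "model-f16.gguf"          # hypothetical file name
    imatrix_file = work / "imatrix.dat"          # hypothetical file name
    quantized = work / f"model-{quant_type}.gguf"

    return [
        # Step 1: convert the Hugging Face checkpoint to a 16-bit GGUF file.
        ["python", "convert_hf_to_gguf.py", str(hf_model_dir),
         "--outfile", str(f16_model), "--outtype", "f16"],
        # Step 2: compute the importance matrix from a calibration text file.
        ["llama-imatrix", "-m", str(f16_model),
         "-f", str(calibration_text), "-o", str(imatrix_file)],
        # Step 3: quantize with a K-quant type, guided by the importance matrix.
        ["llama-quantize", "--imatrix", str(imatrix_file),
         str(f16_model), str(quantized), quant_type],
    ]


if __name__ == "__main__":
    # Example inputs are placeholders, not paths from the article.
    for argv in build_quantize_commands("gemma-2-9b-it", "calibration.txt", "out"):
        print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` once the tools are on the `PATH`; keeping command construction separate from execution makes it easy to inspect the steps first.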


SemanticScuttle - klotz.me: tagged with "cpu+llm"
