klotz: prism ml*

0 bookmark(s) - Sort by: Date ↓ / Title / - Bookmarks from other users for this tag

  1. Bonsai-8B-GGUF-1bit is an end-to-end 1-bit language model designed for high-efficiency deployment using llama.cpp across CUDA, Metal, and CPU architectures. This model provides a massive 14.1x reduction in memory footprint compared to standard FP16, requiring only 1.15 GB of parameter memory. By leveraging the GGUF Q1_0_g128 format, it achieves significant performance boosts, including 6.2x faster throughput on an RTX 4090 and substantially lower energy consumption per token. It is an ideal solution for on-device assistants, mobile applications, and edge robotics where memory, thermal, and power constraints are paramount.

Top of the page

First / Previous / Next / Last / Page 1 of 0 SemanticScuttle - klotz.me: Tags: prism ml

About - Propulsed by SemanticScuttle