klotz: llama2-7b* + gpu*

  1. GPU-accelerated LLMs on the Orange Pi 5, which features a Mali-G610 GPU. Using Machine Learning Compilation (MLC) techniques, the authors report speeds of 2.3 tok/sec for Llama3-8b, 2.5 tok/sec for Llama2-7b, and 5 tok/sec for RedPajama-3b. They also ran a Llama-2 13b model at 1.5 tok/sec on the 16GB version of the Orange Pi 5+.
