SemanticScuttle - klotz.me » klotz: llama2-7b+llm

GPU-Accelerated LLM on a $100 Orange Pi: 2.3 tok/sec for Llama3-8b, 2.5 tok/sec for Llama2-7b, and 5 tok/sec for RedPajama-3b

Covers running GPU-accelerated LLMs on the Orange Pi 5, which features a Mali-G610 GPU. The authors used Machine Learning Compilation (MLC) techniques to reach 2.3 tok/sec for Llama3-8b, 2.5 tok/sec for Llama2-7b, and 5 tok/sec for RedPajama-3b. They also ran a Llama2-13b model at 1.5 tok/sec on a 16GB version of the Orange Pi 5+.

2024-05-20 Tags: llm, orange pi, gpu, mali-g610, llama3-8b, llama2-7b, redpajama-3b, ipt, raspberry pi by klotz
