SemanticScuttle - klotz.me

Tags: gpu* + llm*

How to log output of running models and performance monitoring

A discussion post on Reddit's LocalLLaMA subreddit about logging the output of running models and monitoring their performance, both for debugging errors and warnings and for performance analysis. The post also asks for flags that write logs to flat files and for GPU metrics (GPU utilization, RAM usage, Tensor Core usage, etc.) to support troubleshooting and analytics.

2024-06-12 Tags: llama, python, logging, performance, monitoring, gpu, metrics, debugging, nvidia, analytics, production engineering, llms by klotz
GPU-Accelerated LLM on a $100 Orange Pi: 2.3 tok/sec for Llama3-8b, 2.5 tok/sec for Llama2-7b, and 5 tok/sec for RedPajama-3b

This post covers running GPU-accelerated LLMs on the Orange Pi 5, which features a Mali-G610 GPU. The authors used Machine Learning Compilation (MLC) techniques to achieve speeds of 2.3 tok/sec for Llama3-8b, 2.5 tok/sec for Llama2-7b, and 5 tok/sec for RedPajama-3b. They also ran a Llama-2 13b model at 1.5 tok/sec on a 16GB version of the Orange Pi 5+.

2024-05-20 Tags: llm, orange pi, gpu, mali-g610, llama3-8b, llama2-7b, redpajama-3b, ipt, raspberry pi by klotz
PowerInfer - High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

2023-12-24 Tags: llm, serving, cpu, gpu, github by klotz
Fine-Tune Your LLM Without Maxing Out Your GPU | by John Adeojo | Jul, 2023 | Towards Data Science

2023-08-03 Tags: llm, gpu, cpu, fine-tune by klotz
Reddit LocalLlama GPU / CPU

2023-06-09 Tags: llama, llama.cpp, llm, reddit, gpu, nvidia, 3090, 4090, machine learning by klotz
Rent GPUs | Vast.ai

2023-06-09 Tags: vast.ai, llm, gpu, saas, cloud by klotz
