SemanticScuttle - klotz.me » klotz: serving

klotz: serving

End-to-end LLM Workflows Guide

This guide demonstrates how to execute end-to-end LLM workflows for developing and productionizing LLMs at scale. It covers data preprocessing, fine-tuning, evaluation, and serving.

2024-06-21 Tags: llm, workflows, data preprocessing, fine-tuning, evaluation, serving, ray, anyscale by klotz

Meet PowerInfer: A Fast Large Language Model (LLM) on a Single Consumer-Grade GPU that Speeds up Machine Learning Model Inference By 11 Times - MarkTechPost

2023-12-24 Tags: llm, serving, powerinfer by klotz

PowerInfer - High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

PowerInfer is a CPU/GPU LLM inference engine leveraging activation locality for your device.

2026-01-13 Tags: llm, serving, cpu, gpu, github by klotz

Paper page - PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU

2023-12-22 Tags: llm, hierarchical, memory, serving, nvidia, arxiv by klotz

ml-tooling/opyrator:

2021-04-23 Tags: machine learning, python, serving, operator by klotz

Serving a TensorFlow Model | TensorFlow

2018-08-20 Tags: tensorflow, production, serving by klotz
