This paper proposes SkyMemory, a key-value cache (KVC) hosted on a LEO satellite constellation to accelerate transformer-based inference, particularly for large language models (LLMs). It explores different chunk-to-server mapping strategies (rotation-aware, hop-aware, and combined) and presents simulation results and a proof-of-concept implementation demonstrating performance improvements.
In this notebook, we will build a typical RAG solution using an open-source model and the vector database Chroma DB. On top of it, we will integrate a semantic cache system that stores previous user queries and, for each new query, decides whether to enrich the prompt with information retrieved from the vector database or with the answer already held in the cache.
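As a rough illustration of that decision flow, here is a minimal sketch. It assumes sentence-transformers for query embeddings and a simple in-memory cache; the `SemanticCache` class, the `build_prompt` helper, and the 0.9 similarity threshold are illustrative choices, not the notebook's actual code.

```python
# Sketch of a semantic cache in front of a Chroma-backed RAG pipeline.
# chromadb and sentence-transformers calls follow their public APIs;
# everything else (names, threshold) is an illustrative assumption.
import numpy as np
import chromadb
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")        # assumed embedding model
client = chromadb.Client()
collection = client.get_or_create_collection("documents")  # the RAG knowledge base

SIM_THRESHOLD = 0.9  # cosine similarity above which a cached query counts as a hit


class SemanticCache:
    """Stores embeddings of past queries together with their retrieved context."""

    def __init__(self):
        self.embeddings: list[np.ndarray] = []
        self.contexts: list[str] = []

    def lookup(self, query_emb: np.ndarray):
        # Return the cached context of the most similar past query, if close enough.
        best_sim, best_ctx = -1.0, None
        for emb, ctx in zip(self.embeddings, self.contexts):
            sim = float(np.dot(query_emb, emb)
                        / (np.linalg.norm(query_emb) * np.linalg.norm(emb)))
            if sim > best_sim:
                best_sim, best_ctx = sim, ctx
        return best_ctx if best_sim >= SIM_THRESHOLD else None

    def add(self, query_emb: np.ndarray, context: str) -> None:
        self.embeddings.append(query_emb)
        self.contexts.append(context)


cache = SemanticCache()


def build_prompt(user_query: str) -> str:
    """Enrich the prompt from the cache on a hit, else from the vector DB."""
    query_emb = encoder.encode(user_query)
    context = cache.lookup(query_emb)
    if context is None:
        # Cache miss: retrieve from Chroma and remember the result.
        results = collection.query(query_texts=[user_query], n_results=3)
        context = "\n".join(results["documents"][0])
        cache.add(query_emb, context)
    return f"Answer using this context:\n{context}\n\nQuestion: {user_query}"
```

A cache hit skips both the embedding search against the full document collection and the retrieval round trip, so repeated or paraphrased questions are answered from the cheaper in-memory path.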