klotz: on-premise*

0 bookmark(s) - Sort by: Date ↓ / Title / - Bookmarks from other users for this tag

  1. This article details the journey of deploying an on-premise Large Language Model (LLM) server, focusing on security considerations. It explores the rationale behind on-premise deployment for privacy and data control, outlining the goals of creating an air-gapped, isolated infrastructure. The authors delve into the hardware selection process, choosing components like an Nvidia RTX Pro 6000 Max-Q for its memory capacity. The deployment process starts with a minimal setup using llama.cpp, then progresses to containerization with Podman and the use of CDI for GPU access. Finally, the article discusses hardening techniques, including kernel module management and file permission restrictions, to minimize the attack surface and enhance security.

Top of the page

First / Previous / Next / Last / Page 1 of 0 SemanticScuttle - klotz.me: Tags: on-premise

About - Propulsed by SemanticScuttle