This article details the journey of deploying an on-premise Large Language Model (LLM) server, focusing on security considerations. It explores the rationale behind on-premise deployment for privacy and data control, outlining the goals of creating an air-gapped, isolated infrastructure. The authors delve into the hardware selection process, choosing components like an Nvidia RTX Pro 6000 Max-Q for its memory capacity. The deployment process starts with a minimal setup using llama.cpp, then progresses to containerization with Podman and the use of CDI for GPU access. Finally, the article discusses hardening techniques, including kernel module management and file permission restrictions, to minimize the attack surface and enhance security.
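The llama.cpp-plus-Podman-plus-CDI setup the article walks through can be sketched roughly as follows. The image name, model path, and port here are illustrative assumptions, not the article's exact values; adjust them to your environment.

```shell
# Generate a CDI spec so rootless Podman can expose the NVIDIA GPU
# (requires nvidia-container-toolkit on the host)
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

# Run llama.cpp's server in a container with the GPU passed in via CDI.
# Image tag, model path, and flags are assumptions for illustration.
podman run --rm \
  --device nvidia.com/gpu=all \
  --security-opt label=disable \
  -v /srv/models:/models:ro \
  -p 127.0.0.1:8080:8080 \
  ghcr.io/ggml-org/llama.cpp:server-cuda \
  -m /models/model.gguf --host 0.0.0.0 --port 8080 -ngl 99
```

Binding the published port to 127.0.0.1 keeps the inference API off the network, in line with the article's isolation goals.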
Run and validate GitHub Actions locally. WRKFLW is a command-line tool for validating and executing GitHub Actions workflows on your own machine, without requiring a full GitHub environment.
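Typical usage looks like the sketch below; the subcommand names are taken from the project's README and may differ by version, so check `wrkflw --help` on your install.

```shell
# Check a workflow file for errors without running it
wrkflw validate .github/workflows/ci.yml

# Execute the workflow's jobs locally in containers
wrkflw run .github/workflows/ci.yml
```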
The article discusses Apple Container, a new tool for running Linux containers on macOS, comparing its performance and efficiency to Docker Desktop. It highlights its ease of setup on Silicon Macs, compatibility with Dockerfiles, and potential as a lightweight alternative for home lab enthusiasts.
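On an Apple Silicon Mac the basic flow is a short session like this; the CLI shape is assumed from Apple's `container` project and worth verifying against `container --help`.

```shell
# Start the background services, then run a Linux container in a lightweight VM
container system start
container run --rm alpine echo "hello from Linux on macOS"
```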
Learn how to get started with Podman, a daemon-less and secure containerization tool that's a great alternative to Docker.
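For Docker users the switch is mostly mechanical, since Podman mirrors Docker's CLI; a minimal first session might look like:

```shell
# Pull and run an image -- no root daemon involved, rootless by default
podman pull docker.io/library/alpine
podman run --rm docker.io/library/alpine cat /etc/os-release

# List containers, exactly as with Docker
podman ps -a
```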
A user is facing an issue with running WireGuard in a Podman container without using the privileged flag. They encounter an iptables-restore error and have tried troubleshooting steps without success.
For anyone who finds this in the future: I needed to add `--cap-add=NET_RAW`.
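The idea is to grant only the network capabilities iptables-restore needs instead of running the whole container with `--privileged`. A sketch, assuming the common linuxserver.io image and an illustrative config path (WireGuard interface setup typically also needs `NET_ADMIN`):

```shell
# Run WireGuard unprivileged, adding only the capabilities it needs.
# Image name and volume path are illustrative assumptions.
podman run -d --name wireguard \
  --cap-add NET_ADMIN \
  --cap-add NET_RAW \
  -v ./wg-config:/config \
  -p 51820:51820/udp \
  docker.io/linuxserver/wireguard
```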
An open source extension for Podman Desktop for working with large language models (LLMs) in a local environment.
Podman AI Lab is an open source extension for Podman Desktop that allows users to work with LLMs in a local environment, featuring a recipe catalog with common AI use cases, a curated set of open source models, and a playground for learning, prototyping, and experimentation. It uses Podman machines to run inference servers for LLM models and supports various formats like GGUF, PyTorch, and TensorFlow.
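The inference servers it starts expose an OpenAI-compatible HTTP API on localhost, so existing OpenAI client code can point at them. A hedged sketch; the port and model name below are assumptions, so copy the real values from the model service details in Podman Desktop:

```shell
# Query a locally running Podman AI Lab inference server.
# Port and model name are illustrative -- check the service details panel.
curl -s http://localhost:35000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "granite", "messages": [{"role": "user", "content": "Hello"}]}'
```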
Red Hat’s Podman Desktop, a tool for managing containers and pods, has been given extended duty as a platform for developers to build generative AI-based applications. Unlike many tools for building generative AI applications, this one was built specifically for developers, not data scientists.
Podman AI Lab is the easiest way to work with Large Language Models (LLMs) on your local developer workstation. It provides a catalog of recipes and a curated list of open source models, and lets you experiment with and compare models. Get ahead of the curve and take your development to new heights with Podman AI Lab!