A step-by-step guide to catching real anomalies without drowning in false alerts.
This article details how the author built a dashboard to manage their self-hosted applications, focusing on Homepage and how it organizes services and puts key service information within easy reach.
Dozzle is a lightweight, self-hosted solution that provides a real-time look into your container logs, offering an intuitive UI, live log streaming, intelligent search, and support for use cases like home labs and local development.
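Setup is a single container with the Docker socket mounted so Dozzle can read container logs; the image, port, and socket path below follow the defaults in Dozzle's README (mount the socket read-only if you prefer):

```bash
docker run -d --name dozzle \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -p 8080:8080 \
  amir20/dozzle:latest
```

The UI is then available at http://localhost:8080, streaming logs for every container on the host.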
TraceRoot.AI is an AI-native observability platform that helps developers fix production bugs faster by analyzing structured logs and traces. It offers SDK integration, AI agents for root cause analysis, and a platform for comprehensive visualizations.
TraceRoot accelerates the debugging process with AI-powered insights. It integrates seamlessly into your development workflow, providing real-time trace and log analysis, code context understanding, and intelligent assistance. It offers both a cloud and self-hosted version, with SDKs available for Python and JavaScript/TypeScript.
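The exact TraceRoot SDK calls aren't reproduced here; as a rough sketch of the structured, trace-correlated logs such a platform ingests, the following uses only Python's standard library. The `trace_id` field, `JsonFormatter` class, and `process_order` function are illustrative names, not TraceRoot APIs:

```python
import json
import logging
import uuid

# Illustrative only: emit structured (JSON) logs carrying a trace ID,
# the kind of correlated log/trace data an observability platform analyzes.
class JsonFormatter(logging.Formatter):
    def format(self, record):
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("orders")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

def process_order(order_id: str) -> None:
    trace_id = uuid.uuid4().hex  # correlates all logs for this request
    logger.info("processing order %s", order_id, extra={"trace_id": trace_id})
    try:
        raise ValueError("inventory lookup failed")  # simulated production bug
    except ValueError:
        logger.exception("order failed", extra={"trace_id": trace_id})

process_order("A-1042")
```

Every log line for a request shares one `trace_id`, which is what lets an AI agent walk from an error back to its root cause.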
A fancy self-hosted monitoring tool. Monitors uptime for HTTP(s) / TCP / HTTP(s) Keyword / HTTP(s) Json Query / Ping / DNS Record / Push / Steam Game Server / Docker Containers. Offers notifications via Telegram, Discord, Gotify, Slack, Pushover, Email (SMTP), and more.
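This description matches Uptime Kuma's README; assuming that is the tool in question, deployment is one container with a named volume for its data (image, port, and volume follow that project's docs):

```bash
# Assumption: the tool described is Uptime Kuma.
docker run -d --restart=always \
  -p 3001:3001 \
  -v uptime-kuma:/app/data \
  --name uptime-kuma \
  louislam/uptime-kuma:1
```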
This article details five Linux terminal utilities (ncdu, btop++, bandwhich, mtr, and bmon) that enhance system resource monitoring beyond the standard tools.
| **Utility** | **Description** |
|---|---|
| ncdu | Interactive ncurses disk usage analyzer for finding what fills a directory tree |
| btop++ | Resource monitor covering CPU, memory, disks, network, and processes |
| bandwhich | Displays current network utilization by process, connection, and remote host |
| mtr | Combines ping and traceroute, updating hop statistics live |
| bmon | Per-interface bandwidth monitor and rate estimator |
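Typical invocations for each utility, for reference (`/var`, `example.com`, and `eth0` are placeholder arguments; bandwhich needs elevated privileges to sniff traffic):

```bash
ncdu /var           # browse disk usage under /var interactively
btop                # full-screen CPU/memory/disk/network/process view
sudo bandwhich      # per-process bandwidth; requires root or capabilities
mtr example.com     # hop-by-hop latency and loss, updated live
bmon -p eth0        # bandwidth graphs limited to one interface
```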
K8S-native cluster-wide deployment for vLLM. Provides a reference implementation for building an inference stack on top of vLLM, enabling scaling, monitoring, request routing, and KV cache offloading with easy cloud deployment.
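Deployment is Helm-based per the project's quickstart; the repo URL, chart name, and values file below are a sketch to verify against the production-stack README rather than guaranteed-stable identifiers:

```bash
# Sketch: confirm the current chart repo URL and chart name in the
# vllm-project/production-stack README before running.
helm repo add vllm https://vllm-project.github.io/production-stack
helm install vllm vllm/vllm-stack -f values.yaml
```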
A discussion post on Reddit's LocalLLaMA subreddit about logging the output of running models and monitoring performance, specifically for debugging errors, warnings, and performance analysis. The post also mentions the need for flags to output logs as flat files, GPU metrics (GPU utilization, RAM usage, TensorCore usage, etc.) for troubleshooting and analytics.
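For the flat-file GPU-metrics part of the ask, a small sampler over NVIDIA's NVML Python bindings is one way to get there without runtime support. A minimal sketch follows (assumes `pip install nvidia-ml-py`; the one-second interval and `gpu_metrics.log` path are arbitrary choices). Note that Tensor Core utilization specifically is not exposed by basic NVML and generally requires tooling such as NVIDIA DCGM:

```python
# Minimal GPU-metrics sampler writing a flat log file via NVML.
# Assumes: pip install nvidia-ml-py; single-GPU machine (index 0).
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

with open("gpu_metrics.log", "a") as f:
    for _ in range(60):  # one sample per second for a minute
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)  # percentages
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)         # bytes
        f.write(f"{time.time():.0f} gpu={util.gpu}% "
                f"vram={mem.used / 2**20:.0f}/{mem.total / 2**20:.0f} MiB\n")
        f.flush()
        time.sleep(1)

pynvml.nvmlShutdown()
```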