This article details building end-to-end observability for LLM applications using FastAPI and OpenTelemetry. It emphasizes a code-first approach, manually designing traces, spans, and semantic attributes to capture the full lifecycle of LLM-powered requests. The guide advocates for a structured approach to tracing RAG workflows, focusing on clear span boundaries, safe metadata capture (hashing prompts/responses), token usage tracking, and integration with observability backends like Jaeger, Grafana Tempo, or specialized LLM platforms. It highlights the importance of understanding LLM behavior beyond traditional infrastructure metrics.
Zvec is an open-source, in-process vector database — lightweight, lightning-fast, and designed to embed directly into applications. Built on Proxima (Alibaba's battle-tested vector search engine), it delivers production-grade, low-latency, scalable similarity search with minimal setup.
Here’s the simplest version — key sentence extraction:
<pre>
```
def extract_relevant_sentences(document, query, top_k=5):
sentences = document.split('.')
query_embedding = embed(query)
scored = »
for sentence in sentences:
similarity = cosine_sim(query_embedding, embed(sentence))
scored.append((sentence, similarity))
scored.sort(key=lambda x: x 1 » , reverse=True)
return '. '.join( s[0 » for s in scored :top_k » ])
```
</pre>
For each sentence, compute similarity to the query. Keep the top 5. Discard the rest
An AI-powered document search agent that explores files like a human would — scanning, reasoning, and following cross-references. Unlike traditional RAG systems that rely on pre-computed embeddings, this agent dynamically navigates documents to find answers.
A polyglot document intelligence framework with a Rust core that extracts text, metadata, and structured information from PDFs, Office documents, images, and 50+ formats. Available for Rust, Python, Ruby, Java, Go, PHP, Elixir, C#, TypeScript (Node/Bun/Wasm/Deno) or use via CLI, REST API, or MCP server.
FailSafe is an open-source, modular framework designed to automate the verification of textual claims. It employs a multi-stage pipeline that integrates Large Language Models (LLMs) with retrieval-augmented generation (RAG) techniques.
This article details how to build a 100% local MCP (Model Context Protocol) client using LlamaIndex, Ollama, and LightningAI. It provides a code walkthrough and explanation of the process, including setting up an SQLite MCP server and a locally served LLM.
A curated repository of AI-powered applications and agentic systems showcasing practical use cases of Large Language Models (LLMs) from providers like Google, Anthropic, OpenAI, and self-hosted open-source models.
A curated collection of Awesome LLM apps built with RAG, AI Agents, Multi-agent Teams, MCP, Voice Agents, and more. This repository features LLM apps that use models from OpenAI, Anthropic, Google, xAI and open-source models like Qwen or Llama.
MarkItDown is an open-source Python utility that simplifies converting diverse file formats into Markdown, designed to prepare data for LLMs and RAG systems. It handles various file types, preserves document structure, and integrates with LLMs for tasks like image description.