SemanticScuttle - klotz.me » klotz: retrieval augmented generation

klotz: retrieval augmented generation*

Retrieval-augmented generation with Nvidia NeMo Retriever

Nvidia’s NeMo Retriever models and RAG pipeline make quick work of ingesting PDFs and generating reports based on them. Chalk one up for the plan-reflect-refine architecture.

2025-08-23 Tags: nvidia, nemo retriever, rag, ai, llms by klotz

Sparse Priming Representations (SPR)

Sparse Priming Representations (SPR) is a research project focused on developing and sharing techniques for efficiently representing complex ideas, memories, or concepts using a minimal set of keywords, phrases, or statements, enabling language models or subject matter experts to quickly reconstruct the original idea with minimal context.

2025-08-20 Tags: sparse priming representations, llm, latent space, in-context learning, rag, vector database, knowledge management by klotz

Hitchhiker’s Guide to RAG: From Tiny Files to Tolstoy with OpenAI’s API and LangChain

Scaling a simple RAG pipeline from simple notes to full books. This post elaborates on how to utilize larger files with your RAG pipeline by adding an extra step to the process — chunking.

2025-08-20 Tags: rag, openai, langchain, llm, vector database, faiss, chunking, medium by klotz

Improving LangChain Knowledge Graph Extraction with BAML Fuzzy Parsing

An end-to-end raw text-to-graph pipelines. This blog explores the limitations of LangChain extraction when using smaller quantized models, and how BAML can improve extraction success rates.

2025-08-09 Tags: langchain, knowledge graph, extraction, baml, fuzzy parsing, llm, rag by klotz

10 Open Source AI Tools Every Developer Should Know

This article details 10 open-source AI tools for developers, covering their benefits, features, and use cases. It emphasizes transparency, offline capabilities, and community support as key advantages of open-source AI.

| **Tool Name** | **Description** | **Key Features** | **What I Like About It** |
|---|---|---|---|
| **Talkd.ai** | Prototyping AI Agents | No-code, JSON/YAML config, API integration | Fast prototyping, no backend needed |
| **Marimo** | Python Notebooks for Apps | Reactive cells, version control, UI widgets | Stable, shareable, version-controlled apps |
| **Unsloth AI** | LLM Fine-Tuning | Memory-optimized training, supports Llama 3 | Accessible fine-tuning on modest hardware |
| **HackingBuddyGPT** | AI for Ethical Hacking | Offline operation, recon tools, payload generation | Offline security, privacy |
| **Giskard** | AI Testing & Debugging | Test case creation, continuous monitoring | Engineering discipline for AI quality |
| **OpenWebUI** | Self-Hosted ChatGPT UI | Local LLMs, plugin support, persistent memory | Privacy, local control |
| **Axolotl** | LLM Fine-Tuning | YAML config, supports QLORA/PEFT/LORA | Simplified fine-tuning, reproducibility |
| **FastRAG** | RAG Pipeline | Local operation, fast query times | Quick, lightweight RAG setup |
| **Nav2** | Robot Navigation Framework | Real-time obstacle detection, multi-robot coordination | Flexible, modern ROS 2 integration |
| **MindsDB** | Machine Learning in Database | SQL-based training/inference, supports various DBs | Easy integration with existing SQL workflows |

2025-07-29 Tags: webdev, llm, rag, foss by klotz

Why Knowledge Graphs are Critical to Agent Context

This article discusses the importance of knowledge graphs in providing context for AI agents, highlighting their advantages over traditional retrieval systems in terms of precision, reasoning, and explainability.

2025-07-12 Tags: knowledge graph, agents, context, rag, vector, search, kuzudb by klotz

MarkItDown: Microsoft’s open-source tool for Markdown conversion

MarkItDown is an open-source Python utility that simplifies converting diverse file formats into Markdown, designed to prepare data for LLMs and RAG systems. It handles various file types, preserves document structure, and integrates with LLMs for tasks like image description.

2025-05-10 Tags: markitdown, microsoft, open source, markdown, llm, rag, data conversion, python, ai, data preparation, document processing by klotz

IBM Granite 3.3: Speech recognition, refined reasoning, and RAG LoRAs

IBM announces Granite 3.3, featuring a new speech-to-text model (Granite Speech 3.3 8B), enhanced reasoning capabilities in Granite 3.3 8B Instruct, and RAG-focused LoRA adapters for Granite 3.2. The release also includes activated LoRAs (aLoRAs) for improved efficiency and all models are open source.

2025-05-07 Tags: ibm, granite, llm, speech recognition, rag, lora, foss, granite 3.3, granite speech, watsonx.ai by klotz

Why Your RAG Embeddings Are Costing You a Fortune (And How I Fixed It)

This article details the often overlooked cost of storing embeddings for RAG systems, and how quantization techniques (int8 and binary) can significantly reduce storage requirements and improve retrieval speed without substantial accuracy loss.

2025-04-30 Tags: rag, embedding, vector database, transformers, llm, quantization by klotz

Let’s Build a RAG-Powered Research Paper Assistant

This article details building a Retrieval-Augmented Generation (RAG) system to assist with research paper tasks, specifically question answering over a PDF document. It covers document loading, splitting, embedding with Sentence Transformers, using ChromaDB as a vector database, and implementing a query interface with LangChain.

2025-04-23 Tags: docker, rag, langchain, sentence transformers, chromadb, vector database, pdf, llm by klotz

First / Previous / Next / Last / Page 1 of 0

SemanticScuttle - klotz.me

klotz: retrieval augmented generation*

Linked Tags

Related Tags