klotz: embeddings* + llm*


  1. A tutorial on building a private, offline Retrieval Augmented Generation (RAG) system that uses Ollama for embeddings and language generation and FAISS for vector storage, keeping data private and under local control; a minimal sketch follows the component list below.

    1. **Document Loader:** Extracts text from various file formats (PDF, Markdown, HTML) while preserving metadata like source and page numbers for accurate citations.
    2. **Text Chunker:** Splits documents into smaller text segments (chunks) to manage token limits and improve retrieval accuracy. It uses overlapping and sentence boundary detection to maintain context.
    3. **Embedder:** Converts text chunks into numerical vectors (embeddings) using the `nomic-embed-text` model via Ollama, which runs locally without internet access.
    4. **Vector Database:** Stores the embeddings using FAISS (Facebook AI Similarity Search) for fast similarity search. It uses cosine similarity for accurate retrieval and saves the database to disk for quick loading in future sessions.
    5. **Large Language Model (LLM):** Generates answers using the `llama3.2` model via Ollama, also running locally. It takes the retrieved context and the user's question to produce a response with citations.
    6. **RAG System Orchestrator:** Coordinates the entire workflow, managing the ingestion of documents (loading, chunking, embedding, storing) and the querying process (retrieving relevant chunks, generating answers).
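
    A minimal sketch of the six components above, assuming the `ollama` and `faiss-cpu` Python packages with the `nomic-embed-text` and `llama3.2` models already pulled; document loading and chunking are simplified to paragraph splitting of a plain-text file:

    ```python
    # Minimal local RAG sketch: Ollama for embeddings and generation, FAISS for vector storage.
    # Requires `pip install ollama faiss-cpu numpy`, plus `ollama pull nomic-embed-text llama3.2`.
    import faiss
    import numpy as np
    import ollama

    def embed(texts):
        """Embed strings with nomic-embed-text via the local Ollama server."""
        vecs = [ollama.embeddings(model="nomic-embed-text", prompt=t)["embedding"] for t in texts]
        arr = np.array(vecs, dtype="float32")
        faiss.normalize_L2(arr)  # normalize so inner product equals cosine similarity
        return arr

    # Ingestion: naive paragraph chunking of an already-extracted text file.
    chunks = [p.strip() for p in open("docs.txt").read().split("\n\n") if p.strip()]
    vectors = embed(chunks)
    index = faiss.IndexFlatIP(vectors.shape[1])  # inner-product index over normalized vectors
    index.add(vectors)
    faiss.write_index(index, "docs.faiss")       # persist so later sessions load quickly

    # Query: retrieve the top chunks and ask llama3.2 to answer with citations.
    question = "What does the document say about data retention?"
    _, ids = index.search(embed([question]), 3)
    context = "\n\n".join(f"[{i}] {chunks[i]}" for i in ids[0])
    reply = ollama.chat(
        model="llama3.2",
        messages=[{
            "role": "user",
            "content": f"Answer using only this context, citing chunk numbers:\n{context}\n\nQuestion: {question}",
        }],
    )
    print(reply["message"]["content"])
    ```
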
  2. A collection of high-performance command-line tools for document parsing and semantic search, built in Rust for speed and reliability.
  3. The article discusses using Large Language Model (LLM) embeddings as features in traditional machine learning models built with scikit-learn. It covers the process of generating embeddings from text data using models like Sentence Transformers, and how these embeddings can be combined with existing features to improve model performance. It details practical steps including loading data, creating embeddings, and integrating them into a scikit-learn pipeline for tasks like classification.
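
    A hedged sketch of that pattern (not the article's own code), assuming `sentence-transformers` and scikit-learn are installed; the dataset, column names (`text`, `age`, `label`), and model choice are illustrative:

    ```python
    # Sketch: use sentence-transformer embeddings as extra features in a scikit-learn model.
    import numpy as np
    import pandas as pd
    from sentence_transformers import SentenceTransformer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("data.csv")                      # hypothetical dataset
    model = SentenceTransformer("all-MiniLM-L6-v2")   # small local embedding model

    emb = model.encode(df["text"].tolist())           # (n_samples, 384) dense embeddings
    X = np.hstack([emb, df[["age"]].to_numpy()])      # combine embeddings with tabular features
    y = df["label"].to_numpy()

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("accuracy:", clf.score(X_test, y_test))
    ```
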
  4. A summary of a workshop presented at PyCon US on building software with LLMs, covering setup, prompting, building tools (text-to-SQL, structured data extraction, semantic search/RAG), tool usage, and security considerations like prompt injection. It also discusses the current LLM landscape, including models from OpenAI, Gemini, Anthropic, and open-weight alternatives.
  5. A simple project demonstrating Retrieval Augmented Generation (RAG) using SQLite, sqlite-vec, and OpenAI. It embeds text files, stores them in a SQLite database, and retrieves relevant documents using vector search. The project features lightweight single-file SQLite databases, vector search capabilities, and OpenAI integration for embeddings and chat responses.
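
    A hedged sketch of the same idea (not the project's actual code), assuming the `sqlite-vec` and `openai` Python packages and an `OPENAI_API_KEY` in the environment; the documents and table names are illustrative:

    ```python
    # Sketch: store OpenAI embeddings in SQLite with sqlite-vec and retrieve by vector search.
    import sqlite3
    import sqlite_vec
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def embed(text):
        return client.embeddings.create(model="text-embedding-3-small", input=text).data[0].embedding

    db = sqlite3.connect("rag.db")
    db.enable_load_extension(True)
    sqlite_vec.load(db)                # load the sqlite-vec extension
    db.enable_load_extension(False)

    db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS docs USING vec0(embedding float[1536])")
    db.execute("CREATE TABLE IF NOT EXISTS doc_text(rowid INTEGER PRIMARY KEY, body TEXT)")

    for i, body in enumerate(["first document ...", "second document ..."], start=1):
        db.execute("INSERT INTO docs(rowid, embedding) VALUES (?, ?)",
                   (i, sqlite_vec.serialize_float32(embed(body))))
        db.execute("INSERT INTO doc_text(rowid, body) VALUES (?, ?)", (i, body))
    db.commit()

    question = "What is in the first document?"
    rows = db.execute(
        "SELECT rowid, distance FROM docs WHERE embedding MATCH ? ORDER BY distance LIMIT 3",
        (sqlite_vec.serialize_float32(embed(question)),),
    ).fetchall()
    context = "\n".join(
        db.execute("SELECT body FROM doc_text WHERE rowid = ?", (r[0],)).fetchone()[0] for r in rows
    )

    answer = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Answer from this context:\n{context}\n\nQ: {question}"}],
    )
    print(answer.choices[0].message.content)
    ```
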
  6. A guide to automating AI embedding creation in PostgreSQL with pgai Vectorizer, streamlining the AI workflow with simple SQL commands; a hedged sketch follows the feature list below.

    Integration: pgai Vectorizer integrates AI capabilities into PostgreSQL, enabling users to generate AI embeddings directly within the database.
    Ease of Use: It simplifies the process of creating embeddings using a single SQL command, eliminating the need for multiple tools and complex pipelines.
    Automatic Sync: Embeddings are automatically updated as data changes, ensuring that embeddings stay current without manual intervention.
    Model Flexibility: Users can quickly switch between different AI models without reprocessing data.
    Scalability: Optimizes search performance with vector indexes, making it suitable for large datasets.
    Customization: Allows users to define chunking and formatting rules to tailor embeddings to their specific needs.
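
    The gist, as a hedged Python sketch using `psycopg`; the `ai.create_vectorizer` call is paraphrased from pgai's published examples, and the `blog`/`content` table and column names are illustrative, so check the current pgai documentation before relying on the exact signatures:

    ```python
    # Sketch: set up a pgai vectorizer once, then query the generated embeddings view.
    # SQL below is paraphrased from pgai's published examples; verify against the current docs.
    import psycopg

    with psycopg.connect("postgresql://postgres@localhost/mydb") as conn:
        # One-time setup: embeddings for the hypothetical blog(content) table are created
        # and kept in sync automatically as rows change.
        conn.execute("""
            SELECT ai.create_vectorizer(
                'blog'::regclass,
                destination => 'blog_embeddings',
                embedding   => ai.embedding_openai('text-embedding-3-small', 1536),
                chunking    => ai.chunking_recursive_character_text_splitter('content')
            );
        """)

        # Later: semantic search against the generated view using pgvector's cosine
        # distance operator, embedding the query text in-database as well.
        rows = conn.execute("""
            SELECT chunk
            FROM blog_embeddings
            ORDER BY embedding <=> ai.openai_embed('text-embedding-3-small', %s)
            LIMIT 5;
        """, ("postgres performance tips",)).fetchall()
        print(rows)
    ```
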
  7. An overview of foundational concepts, a practical implementation of semantic search, and the RAG workflow, highlighting its advantages and versatile applications.

    The article provides a step-by-step guide to implementing a basic semantic search using TF-IDF and cosine similarity. This includes preprocessing steps, converting text to embeddings, and searching for relevant documents based on query similarity.
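
    A minimal version of that approach with scikit-learn (the corpus here is illustrative):

    ```python
    # Basic semantic-ish search: TF-IDF vectors ranked by cosine similarity.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    docs = [
        "Embeddings map text to dense vectors for similarity search.",
        "FAISS and sqlite-vec are common vector stores.",
        "RAG retrieves relevant chunks before asking an LLM to answer.",
    ]

    vectorizer = TfidfVectorizer(stop_words="english")   # lowercasing + stop-word removal as preprocessing
    doc_matrix = vectorizer.fit_transform(docs)          # sparse TF-IDF document vectors

    query_vec = vectorizer.transform(["how does retrieval augmented generation work"])
    scores = cosine_similarity(query_vec, doc_matrix)[0]

    for i in scores.argsort()[::-1]:
        print(f"{scores[i]:.3f}  {docs[i]}")
    ```
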
  8. An article discussing the use of embeddings in natural language processing, focusing on comparing open source and closed source embedding models for semantic search, including techniques like clustering and re-ranking.
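
    One of the techniques mentioned, re-ranking, as a hedged sketch: a bi-encoder retrieves candidates by cosine similarity, then a cross-encoder re-scores query/document pairs (the model names are common defaults, not necessarily the article's):

    ```python
    # Sketch: bi-encoder retrieval followed by cross-encoder re-ranking.
    import numpy as np
    from sentence_transformers import SentenceTransformer, CrossEncoder

    docs = ["open source embedding models", "closed source embedding APIs", "re-ranking improves precision"]
    query = "which embedding model should I use for semantic search?"

    bi = SentenceTransformer("all-MiniLM-L6-v2")
    doc_emb = bi.encode(docs, normalize_embeddings=True)
    q_emb = bi.encode([query], normalize_embeddings=True)
    candidates = np.argsort(doc_emb @ q_emb[0])[::-1][:10]   # coarse top-k by cosine similarity

    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    scores = reranker.predict([(query, docs[i]) for i in candidates])
    for i in np.argsort(scores)[::-1]:
        print(f"{scores[i]:.3f}  {docs[candidates[i]]}")
    ```
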
  9. The author explores semantic search using embeddings on U.S. Presidents, comparing four models: BGE, ST, Ada, and Large. The findings show that while embeddings capture interesting information, their inability to grasp subtext or handle certain semantic tasks highlights their shallowness compared to full language models.
  10. Sage is a tool that allows developers to chat with any codebase using two commands. It provides a functional chat interface for code, supports running locally or on the cloud, and has a modular design for swapping components.
