This tutorial demonstrates how to combine LLM embeddings, TF-IDF vectors, and metadata features into a single Scikit-learn pipeline for document retrieval and search. It covers generating embeddings with Sentence Transformers, calculating TF-IDF, handling metadata, and building a combined retrieval system.
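A minimal sketch of what such a combined pipeline can look like, assuming the `sentence-transformers` package with the `all-MiniLM-L6-v2` model and a toy document-length metadata feature (both are illustrative choices, not details taken from the tutorial):

```python
# Sketch: combine TF-IDF vectors, sentence-transformer embeddings, and a
# simple metadata feature into one matrix, then rank documents by cosine
# similarity. Model name and the length feature are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.pipeline import FeatureUnion

class EmbeddingTransformer(BaseEstimator, TransformerMixin):
    """Wraps a SentenceTransformer so it can sit inside a sklearn pipeline."""
    def __init__(self, model_name="all-MiniLM-L6-v2"):
        self.model_name = model_name
    def fit(self, X, y=None):
        self.model_ = SentenceTransformer(self.model_name)
        return self
    def transform(self, X):
        return self.model_.encode(list(X), normalize_embeddings=True)

class LengthFeature(BaseEstimator, TransformerMixin):
    """Toy metadata feature: document length, scaled down."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        return np.array([[len(t) / 1000.0] for t in X])

features = FeatureUnion([
    ("tfidf", TfidfVectorizer(max_features=5000)),
    ("embeddings", EmbeddingTransformer()),
    ("metadata", LengthFeature()),
])

docs = ["How to fine-tune a transformer.", "Intro to TF-IDF retrieval."]
doc_matrix = features.fit_transform(docs)            # fit on the corpus
query_matrix = features.transform(["fine-tuning transformers"])
scores = cosine_similarity(query_matrix, doc_matrix)[0]
print(docs[int(np.argmax(scores))])                   # best-matching document
```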
Amazon S3 Vectors is now generally available, with increased scale and production-grade performance. It offers native support for storing and querying vector data and can reportedly reduce costs by up to 90% compared with specialized vector databases.
A tutorial on building a private, offline Retrieval-Augmented Generation (RAG) system that uses Ollama for embeddings and language generation and FAISS for vector storage, ensuring data privacy and control.
1. **Document Loader:** Extracts text from various file formats (PDF, Markdown, HTML) while preserving metadata like source and page numbers for accurate citations.
2. **Text Chunker:** Splits documents into smaller text segments (chunks) to manage token limits and improve retrieval accuracy. It uses chunk overlap and sentence-boundary detection to maintain context.
3. **Embedder:** Converts text chunks into numerical vectors (embeddings) using the `nomic-embed-text` model via Ollama, which runs locally without internet access.
4. **Vector Database:** Stores the embeddings using FAISS (Facebook AI Similarity Search) for fast similarity search. It uses cosine similarity for accurate retrieval and saves the database to disk for quick loading in future sessions.
5. **Large Language Model (LLM):** Generates answers using the `llama3.2` model via Ollama, also running locally. It takes the retrieved context and the user's question to produce a response with citations.
6. **RAG System Orchestrator:** Coordinates the entire workflow, managing the ingestion of documents (loading, chunking, embedding, storing) and the querying process (retrieving relevant chunks, generating answers); a minimal end-to-end sketch follows this list.
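A hedged sketch of the ingest-and-query flow these components describe, assuming the `ollama` and `faiss-cpu` Python packages with `nomic-embed-text` and `llama3.2` already pulled locally; the chunking parameters, prompt, and file name are illustrative rather than the tutorial's exact code:

```python
# Minimal end-to-end sketch of the local RAG flow: chunk, embed, store in
# FAISS, retrieve, and answer with a local LLM. Parameters are illustrative.
import numpy as np
import faiss
import ollama

def chunk_text(text, size=500, overlap=100):
    """Naive character-based chunker with overlap (no sentence detection here)."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def embed(texts):
    """Embed locally with nomic-embed-text and L2-normalize the vectors so
    inner-product search in FAISS behaves like cosine similarity."""
    vecs = [ollama.embeddings(model="nomic-embed-text", prompt=t)["embedding"]
            for t in texts]
    vecs = np.array(vecs, dtype="float32")
    faiss.normalize_L2(vecs)
    return vecs

# --- Ingestion: load, chunk, embed, store ---
document = open("notes.txt", encoding="utf-8").read()   # hypothetical input file
chunks = chunk_text(document)
vectors = embed(chunks)
index = faiss.IndexFlatIP(vectors.shape[1])   # inner product on unit vectors = cosine
index.add(vectors)
faiss.write_index(index, "vectors.faiss")      # persist for quick loading later

# --- Query: retrieve top chunks, then generate a grounded answer ---
question = "What does the document say about chunking?"
q = embed([question])
_, ids = index.search(q, 3)
context = "\n\n".join(chunks[i] for i in ids[0] if i >= 0)

answer = ollama.chat(
    model="llama3.2",
    messages=[
        {"role": "system", "content": "Answer using only the provided context and cite it."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(answer["message"]["content"])
```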
This article explains the internal workings of vector databases, highlighting that, contrary to how they are often described, they don't simply perform a brute-force search. It details algorithms like HNSW, IVF, and PQ; the tradeoffs between recall, speed, and memory; and how different RAG patterns affect vector database usage. It also discusses production challenges such as filtering, updates, and sharding.
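For a concrete feel of the index families the article covers, here is a small FAISS sketch contrasting brute-force, HNSW, IVF, and IVF+PQ indexes on random data; the parameter values (M, nlist, nprobe, PQ code size) are illustrative choices meant only to expose the recall/speed/memory knobs, not recommendations from the article:

```python
# Sketch: the same search run against a brute-force index and three ANN
# structures (HNSW, IVF, IVF+PQ), illustrating their tradeoffs.
import numpy as np
import faiss

d, n = 128, 20_000
rng = np.random.default_rng(0)
xb = rng.random((n, d), dtype="float32")      # database vectors
xq = rng.random((10, d), dtype="float32")     # query vectors

# Brute force: exact results, O(n) per query, full vectors kept in memory.
flat = faiss.IndexFlatL2(d)
flat.add(xb)

# HNSW: graph-based, fast and accurate, but highest memory (graph + full vectors).
hnsw = faiss.IndexHNSWFlat(d, 32)             # 32 = neighbors per node (M)
hnsw.add(xb)

# IVF: cluster the space into nlist cells, search only nprobe of them.
quantizer = faiss.IndexFlatL2(d)
ivf = faiss.IndexIVFFlat(quantizer, d, 256)   # nlist = 256 cells
ivf.train(xb)
ivf.add(xb)
ivf.nprobe = 16                               # more probes: better recall, slower

# IVF + PQ: product quantization compresses each vector to 16 bytes here,
# trading some recall for a large memory reduction.
quantizer2 = faiss.IndexFlatL2(d)
ivfpq = faiss.IndexIVFPQ(quantizer2, d, 256, 16, 8)
ivfpq.train(xb)
ivfpq.add(xb)

for name, index in [("flat", flat), ("hnsw", hnsw), ("ivf", ivf), ("ivfpq", ivfpq)]:
    distances, ids = index.search(xq, 5)
    print(name, ids[0][:3])
```

Raising `nprobe` (IVF) or the HNSW search depth trades speed for recall, while PQ trades recall for memory, which is the core tension the article walks through.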