Tags: embeddings


  1. RAG combines language models with external knowledge. This article explores context & retrieval in RAG, covering search methods (keywords, TF-IDF, embeddings/FAISS/Chroma), context length challenges (compression, re-ranking), and contextual retrieval (query & conversation history).
  2. This tutorial explores how to use LLM embeddings as features in time series forecasting models. It covers generating embeddings from time series descriptions, preparing data, and evaluating the performance of models with and without LLM embeddings.
  3. This tutorial demonstrates how to combine LLM embeddings, TF-IDF vectors, and metadata features into a single Scikit-learn pipeline for document retrieval and search. It covers generating embeddings with Sentence Transformers, calculating TF-IDF, handling metadata, and building a combined retrieval system.
  4. This article details seven advanced feature engineering techniques using LLM embeddings to improve machine learning model performance. It covers techniques like dimensionality reduction, semantic similarity, clustering, and more.

    The article explores how to leverage LLM embeddings for advanced feature engineering in machine learning, going beyond simple similarity searches. It details seven techniques:

    1. **Embedding Arithmetic:** Performing mathematical operations (addition, subtraction) on embeddings to represent concepts like "positive sentiment - negative sentiment = overall sentiment".
    2. **Embedding Clustering:** Using clustering algorithms (like k-means) on embeddings to create categorical features representing groups of similar text.
    3. **Embedding Dimensionality Reduction:** Reducing the dimensionality of embeddings using techniques like PCA or UMAP to create more compact features while preserving important information.
    4. **Embedding as Input to Tree-Based Models:** Directly using embedding vectors as features in tree-based models like Random Forests or Gradient Boosting. The article highlights the importance of careful handling of high-dimensional data.
    5. **Embedding-Weighted Averaging:** Calculating weighted averages of embeddings based on relevance scores (e.g., TF-IDF) to create a single, representative embedding for a document.
    6. **Embedding Difference:** Calculating the difference between embeddings to capture changes or relationships between texts (e.g., before/after edits, question/answer pairs).
    7. **Embedding Concatenation:** Combining multiple embeddings (e.g., title and body of a document) to create a richer feature representation.
  5. This post discusses the limitations of using cosine similarity for compatibility matching, specifically in the context of a dating app. The author found that high cosine similarity scores didn't always translate to actual compatibility due to the inability of embeddings to capture dealbreaker preferences. They improved results by incorporating structured features and hard filters.
  6. Amazon S3 Vectors is now generally available with increased scale and production-grade performance capabilities. It offers native support to store and query vector data, potentially reducing costs by up to 90% compared to specialized vector databases.
  7. A tutorial on building a private, offline Retrieval Augmented Generation (RAG) system using Ollama for embeddings and language generation, and FAISS for vector storage, ensuring data privacy and control.

    1. **Document Loader:** Extracts text from various file formats (PDF, Markdown, HTML) while preserving metadata like source and page numbers for accurate citations.
    2. **Text Chunker:** Splits documents into smaller text segments (chunks) to manage token limits and improve retrieval accuracy. It uses overlapping and sentence boundary detection to maintain context.
    3. **Embedder:** Converts text chunks into numerical vectors (embeddings) using the `nomic-embed-text` model via Ollama, which runs locally without internet access.
    4. **Vector Database:** Stores the embeddings using FAISS (Facebook AI Similarity Search) for fast similarity search. It uses cosine similarity for accurate retrieval and saves the database to disk for quick loading in future sessions.
    5. **Large Language Model (LLM):** Generates answers using the `llama3.2` model via Ollama, also running locally. It takes the retrieved context and the user's question to produce a response with citations.
    6. **RAG System Orchestrator:** Coordinates the entire workflow, managing the ingestion of documents (loading, chunking, embedding, storing) and the querying process (retrieving relevant chunks, generating answers).
  8. This article details the process of building a fast vector search system for a large legal dataset (Australian High Court decisions). It covers choosing embedding providers, performance benchmarks, using USearch and Isaacus embeddings, and the importance of API terms of service. It focuses on achieving speed and scalability while maintaining reasonable accuracy.
  9. This article explains the internal workings of vector databases, highlighting that they don't perform a brute-force search as commonly described. It details algorithms like HNSW, IVF, and PQ, the tradeoffs between recall, speed, and memory, and how different RAG patterns impact vector database usage. It also discusses production challenges like filtering, updates, and sharding.
  10. Google DeepMind research reveals a fundamental architectural limitation in Retrieval-Augmented Generation (RAG) systems related to fixed-size embeddings. The research demonstrates that retrieval performance degrades as database size increases, with theoretical limits based on embedding dimensionality. They introduce the LIMIT benchmark to empirically test these limitations and suggest alternatives like cross-encoders, multi-vector models, and sparse models.
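The combined retrieval pipeline in item 3 boils down to concatenating a TF-IDF matrix with a dense embedding matrix. A minimal sketch of that step, using random vectors as stand-ins for Sentence Transformers output (the 8-dimensional size is an arbitrary assumption for illustration):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "retrieval augmented generation with embeddings",
    "time series forecasting with llm features",
    "vector databases use hnsw for fast search",
]

# Sparse TF-IDF block of the combined feature matrix.
tfidf = TfidfVectorizer().fit_transform(docs).toarray()

# Stand-in for dense embeddings (random; a real pipeline would call a model).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(docs), 8))

# Concatenate both views into one feature matrix per document.
features = np.hstack([tfidf, embeddings])
```

In practice the dense block is usually L2-normalized or scaled first, since raw embedding magnitudes and TF-IDF weights live on different scales.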
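Techniques 1 and 6 from item 4 (embedding arithmetic and embedding difference) are plain vector operations. A sketch with random stand-in embeddings; the variable names and the 16-dimensional size are illustrative assumptions, not the article's code:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16
positive = rng.normal(size=dim)   # stand-in: embedding of a positive anchor text
negative = rng.normal(size=dim)   # stand-in: embedding of a negative anchor text
review   = rng.normal(size=dim)   # stand-in: embedding of the text to score

# Embedding arithmetic: subtract anchors to form a "sentiment axis",
# then project a new text onto it to get a scalar feature.
sentiment_axis = positive - negative
sentiment_score = float(review @ sentiment_axis)

# Embedding difference: the delta between two texts (e.g. before/after an
# edit) becomes a feature vector for a downstream model.
before = rng.normal(size=dim)
after = rng.normal(size=dim)
edit_feature = after - before
```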
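Technique 2 from item 4 (embedding clustering) turns a high-dimensional vector into a single categorical feature. A minimal sketch with k-means over random stand-in embeddings (cluster count and dimensions are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(50, 32))   # stand-ins for 50 document embeddings

# Fit k-means and use the cluster id as one categorical feature per document.
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(embeddings)
cluster_feature = km.labels_
```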
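Technique 3 from item 4 (dimensionality reduction) compresses embeddings before they hit a downstream model. A one-step PCA sketch, again on random stand-in vectors with arbitrary sizes:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 256))   # stand-in high-dimensional embeddings

# Keep the top 16 principal components as compact model features.
reduced = PCA(n_components=16, random_state=0).fit_transform(embeddings)
```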
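Item 5's fix, applying hard filters for dealbreakers before any cosine ranking, can be sketched as follows. The profile fields (`smokes`, `min_age`) and the tiny random embeddings are hypothetical examples, not the author's schema:

```python
import numpy as np

rng = np.random.default_rng(0)
profiles = [
    {"smokes": False, "min_age": 25, "embedding": rng.normal(size=8)},
    {"smokes": True,  "min_age": 30, "embedding": rng.normal(size=8)},
    {"smokes": False, "min_age": 21, "embedding": rng.normal(size=8)},
]
query = {"smokes": False, "age": 24, "embedding": rng.normal(size=8)}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hard filters first: dealbreakers eliminate candidates outright.
candidates = [p for p in profiles
              if p["smokes"] == query["smokes"] and query["age"] >= p["min_age"]]

# Cosine similarity only ranks the survivors.
ranked = sorted(candidates,
                key=lambda p: cosine(p["embedding"], query["embedding"]),
                reverse=True)
```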
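The text chunker in item 7 splits documents into overlapping segments. A simplified character-based sketch (the tutorial's version also respects sentence boundaries; the size and overlap defaults here are assumptions):

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character chunks so neighboring
    chunks share context at their boundaries."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

With a 500-character input and these defaults, this yields four chunks, and each chunk's last 50 characters reappear at the start of the next.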
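The retrieval step shared by items 1, 7, and 9 reduces to cosine similarity between a query embedding and stored document embeddings. A brute-force NumPy sketch with random stand-in vectors; this is exactly the exhaustive scan that item 9 notes real vector databases avoid via indexes like HNSW or IVF, and that FAISS accelerates in item 7:

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(100, 64))     # stand-in document embeddings
query = rng.normal(size=64)           # stand-in query embedding

# Normalize so a dot product equals cosine similarity.
docs_n = docs / np.linalg.norm(docs, axis=1, keepdims=True)
query_n = query / np.linalg.norm(query)

scores = docs_n @ query_n             # cosine similarity to every document
top_k = np.argsort(scores)[::-1][:5]  # indices of the 5 nearest documents
```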


SemanticScuttle - klotz.me: tagged with "embeddings"
