SemanticScuttle - klotz.me » klotz: sentence transformers

Document Clustering with LLM Embeddings in scikit-learn

This tutorial demonstrates how to perform document clustering using LLM embeddings with scikit-learn. It covers generating embeddings with Sentence Transformers, reducing dimensionality with PCA, and applying KMeans clustering to group similar documents.

2026-02-11 Tags: document clustering, llm embeddings, sentence transformers, scikit-learn, pca, kmeans, dimensionality reduction, natural language processing, nlp by klotz
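
As a rough illustration of the pipeline this bookmark describes (embeddings, then PCA, then KMeans), here is a minimal sketch. It is not the tutorial's code. To stay runnable offline it uses synthetic vectors as a stand-in for real embeddings. The helper name `cluster_embeddings`, the component and cluster counts, and the model name mentioned in the comment are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA


def cluster_embeddings(embeddings, n_components=2, n_clusters=2, seed=0):
    """Reduce embeddings with PCA, then group the reduced vectors with KMeans."""
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(embeddings)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(reduced)
    return reduced, labels


# Stand-in for real embeddings: two well-separated synthetic groups of 16-dim vectors.
# With the sentence-transformers package installed, the vectors would instead come from
# something like SentenceTransformer("all-MiniLM-L6-v2").encode(documents)
# (the model name is an assumption, not taken from the tutorial).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.1, size=(5, 16))
group_b = rng.normal(loc=5.0, scale=0.1, size=(5, 16))
embeddings = np.vstack([group_a, group_b])

reduced, labels = cluster_embeddings(embeddings)
print(reduced.shape, labels)
```

Reducing dimensionality before clustering is a design choice rather than a requirement; KMeans can also be run on the full-size embeddings.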

Command Line Utility | Embedding Atlas

This page details the command-line utility for the Embedding Atlas, a tool for exploring large text datasets with metadata. It covers installation, data loading (local and Hugging Face), visualization of embeddings using SentenceTransformers and UMAP, and usage instructions with available options.

2025-08-13 Tags: embedding, text, data, visualization, umap, sentence transformers, command line, hugging face, parquet, duckdb by klotz

Feature Engineering with LLM Embeddings: Enhancing Scikit-Learn Models

This article covers using Large Language Model (LLM) embeddings as features in traditional machine learning models built with scikit-learn. It explains how to generate embeddings from text data with models like Sentence Transformers, combine them with existing features to improve model performance, and integrate them into a scikit-learn pipeline for tasks like classification, with practical steps for loading data and creating the embeddings.

2025-07-18 Tags: llm, embeddings, feature engineering, scikit-learn, machine learning, sentence transformers, text data, classification, pipelines by klotz
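
The idea of combining embeddings with existing features can be sketched as below. This is a hedged example under stated assumptions, not the article's code. The "embeddings" and tabular features are synthetic, the variable names are hypothetical, and the reported score is training accuracy on toy data only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_samples = 40
labels = np.array([0, 1] * (n_samples // 2))

# Synthetic stand-in for text embeddings: class 1 rows are shifted so the toy
# problem is learnable. Real vectors would come from an embedding model.
text_embeddings = rng.normal(size=(n_samples, 8)) + labels[:, None] * 3.0

# Synthetic stand-in for existing numeric features (e.g. length, price).
tabular_features = rng.normal(size=(n_samples, 2))

# Concatenate both feature blocks column-wise into one design matrix.
X = np.hstack([text_embeddings, tabular_features])

# Scale all columns, then fit a linear classifier on the combined features.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, labels)
train_accuracy = model.score(X, labels)  # training accuracy, not a held-out estimate
print(X.shape, train_accuracy)
```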

Let’s Build a RAG-Powered Research Paper Assistant

This article details building a Retrieval-Augmented Generation (RAG) system to assist with research paper tasks, specifically question answering over a PDF document. It covers document loading, splitting, embedding with Sentence Transformers, using ChromaDB as a vector database, and implementing a query interface with LangChain.

2025-04-23 Tags: docker, rag, langchain, sentence transformers, chromadb, vector database, pdf, llm by klotz
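
The retrieval step of a RAG system like the one described can be illustrated with a small standard-library sketch. It assumes cosine similarity as the ranking measure and uses hand-made three-dimensional vectors in place of real embeddings. It does not use ChromaDB or LangChain, and the `retrieve` helper and the sample chunks are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vector, chunks, top_k=2):
    """Return the texts of the top_k chunks most similar to the query vector."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vector, chunk[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


# Toy (text, vector) pairs; a real system would embed the PDF chunks with a model.
chunks = [
    ("Methods: we fine-tune an encoder.", [0.9, 0.1, 0.0]),
    ("Results: accuracy improves.", [0.1, 0.9, 0.0]),
    ("Acknowledgements.", [0.0, 0.1, 0.9]),
]
query_vector = [0.8, 0.2, 0.0]

top_chunks = retrieve(query_vector, chunks, top_k=2)
print(top_chunks)
```

In a full pipeline the retrieved chunks would be passed to the language model as context for answering the question; a vector database performs the same ranking at scale.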

Training and Finetuning Embedding Models with Sentence Transformers v3

This article explains how to use the Sentence Transformers library to finetune and train embedding models for a variety of applications, such as retrieval augmented generation, semantic search, and semantic textual similarity. It covers the training components, dataset format, loss function, training arguments, evaluators, and trainer.

2024-05-28 Tags: sentence transformers, finetune, embedding, models, similarity, llm, huggingface by klotz
