SemanticScuttle - klotz.me » Tags: machine learning+embedding+nlp

Tags: machine learning* + embedding* + nlp*

This Space demonstrates a simple method for embedding text using an LLM (Large Language Model) via the Hugging Face Inference API. It shows how to convert text into numerical vector representations, which are useful for semantic search and similarity comparisons.

2025-03-28 Tags: llm, embedding, hugging face, inference, api, semantic search, vector representation, text embedding by klotz
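
The similarity comparison mentioned in the entry above can be sketched with cosine similarity over embedding vectors. This is a minimal illustration, not code from the Space: the three-dimensional vectors and the names `docs`, `query`, and `rank_by_similarity` are hypothetical stand-ins, and a real embedding model would return vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings, hard-coded so the example needs no API call.
docs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "stocks": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]
ranking = rank_by_similarity(query, docs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common default for comparing text embeddings.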

A Coding Implementation to Build a Document Search Agent (DocSearchAgent) with Hugging Face, ChromaDB, and Langchain

This tutorial demonstrates how to build a document search engine with semantic search capabilities, using Hugging Face embeddings, ChromaDB, and LangChain.

2025-03-21 Tags: document, search, hugging face, chromadb, langchain, vector database, embedding, agents, llm by klotz

Qodo-Embed-1-1.5B

Qodo-Embed-1-1.5B is a state-of-the-art code embedding model designed for retrieval tasks in the software development domain. It supports multiple programming languages and is optimized for natural language-to-code and code-to-code retrieval, making it highly effective for applications such as code search and retrieval-augmented generation.

2025-03-04 Tags: qodo-embed-1, code, embedding, llm, software development, huggingface by klotz

Qodo’s open code embedding model sets new enterprise standard, beating OpenAI, Salesforce

Qodo releases Qodo-Embed-1-1.5B, an open-source code embedding model that outperforms competitors from OpenAI and Salesforce, enhancing code search, retrieval, and understanding for enterprise development teams.

2025-03-04 Tags: qodo, code, embedding, llm, search, retrieval, software engineering by klotz

A Complete Introduction to Using BERT Models

This article provides a comprehensive guide on the basics of BERT (Bidirectional Encoder Representations from Transformers) models. It covers the architecture, use cases, and practical implementations, helping readers understand how to leverage BERT for natural language processing tasks.

2025-02-07 Tags: bert, natural language processing, machine learning, transformers, text classification, sentiment analysis, llm by klotz

Understanding Encoder And Decoder LLMs

An explanation of the differences between encoder- and decoder-style large language model (LLM) architectures, including their roles in tasks such as classification, text generation, and translation.

2024-12-28 Tags: encoder, decoder, llm, transformer, bert, roberta, gpt, bart, t5, seq2seq by klotz
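
One concrete way to see the encoder/decoder distinction described above is the attention mask each style uses. The sketch below is not taken from the bookmarked article; it assumes the usual convention that BERT-style encoders let every token attend to every other token, while GPT-style decoders restrict each token to itself and earlier positions. The function names are illustrative.

```python
def bidirectional_mask(n):
    """Encoder-style mask: every token may attend to every token."""
    return [[1] * n for _ in range(n)]


def causal_mask(n):
    """Decoder-style mask: token i may attend only to positions 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


tokens = ["the", "cat", "sat"]
masks = (
    ("encoder", bidirectional_mask(len(tokens))),
    ("decoder", causal_mask(len(tokens))),
)
for name, mask in masks:
    print(name)
    for token, row in zip(tokens, mask):
        # A 1 in column j means this token can see token j.
        print(f"  {token:>3} -> {row}")
```

The causal mask is what makes left-to-right text generation possible, while the full mask suits tasks such as classification, where the whole input is available at once.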

Snowflake Releases Arctic Embed L 2.0 and Arctic Embed M 2.0: A Set of Extremely Strong Yet Small Embedding Models for English and Multilingual Retrieval

Snowflake recently announced the launch of Arctic Embed L 2.0 and Arctic Embed M 2.0, two small and powerful embedding models tailored for multilingual search and retrieval. The medium model has 305 million parameters and the large model has 568 million parameters. Both models support context lengths of up to 8,192 tokens. They demonstrate high-quality retrieval across multiple languages and perform well on benchmarks such as MTEB and CLEF.

2024-12-09 Tags: snowflake, arctic embed, text, embedding, llm, multilingual, retrieval by klotz

Elasticsearch Was Great, But Vector Databases Are the Future

The article discusses the evolution of search databases and how vector databases are emerging as a powerful alternative to traditional search engines like Elasticsearch.

2024-11-19 Tags: elasticsearch, vector database, search engine, bm25, tf-idf, embedding by klotz

BEAL: A Bayesian Deep Active Learning Method for Efficient Deep Multi-Label Text Classification

BEAL is a deep active learning method that uses Bayesian deep learning with dropout to infer the model’s posterior predictive distribution and introduces an expected confidence-based acquisition function to select uncertain samples. Experiments show that BEAL outperforms other active learning methods, requiring fewer labeled samples for efficient training.

2024-11-18 Tags: beal, bayesian, deep learning, active learning, multi-label, text, classification, bert, machine learning by klotz
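
The two ingredients named in the summary above, a dropout-based estimate of the predictive distribution and a confidence-based acquisition function, can be sketched roughly as follows. This illustrates the general idea only: the confidence formula used here (the mean over labels of max(p, 1 - p)) and all names are assumptions for illustration, not the definition from the BEAL paper. The stochastic forward passes are hard-coded probability lists rather than outputs of a real network.

```python
def predictive_mean(passes):
    """Average per-label probabilities over several stochastic (dropout) passes."""
    n_passes = len(passes)
    n_labels = len(passes[0])
    return [sum(p[k] for p in passes) / n_passes for k in range(n_labels)]


def confidence(probs):
    """Assumed confidence score: mean over labels of max(p, 1 - p)."""
    return sum(max(p, 1.0 - p) for p in probs) / len(probs)


def select_uncertain(pool, budget):
    """Return the ids of the `budget` samples with the lowest confidence."""
    scored = [(confidence(predictive_mean(passes)), sample_id)
              for sample_id, passes in pool.items()]
    scored.sort()
    return [sample_id for _, sample_id in scored[:budget]]


# Hypothetical unlabeled pool: three dropout passes per document, two labels each.
pool = {
    "doc_a": [[0.95, 0.05], [0.90, 0.10], [0.97, 0.03]],
    "doc_b": [[0.55, 0.45], [0.40, 0.60], [0.50, 0.52]],
    "doc_c": [[0.80, 0.30], [0.70, 0.20], [0.75, 0.25]],
}
picked = select_uncertain(pool, budget=1)
print(picked)
```

Samples whose averaged probabilities sit near 0.5 score lowest and are sent for labeling first, which is how an active learner aims to spend a small labeling budget where the model is least sure.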

OpenAI Embeddings and Clustering for Survey Analysis — A How-To Guide

A guide on how to use OpenAI embeddings and clustering techniques to analyze survey data and extract meaningful topics and actionable insights from the responses.

The process involves transforming textual survey responses into embeddings, grouping similar responses through clustering, and then identifying key themes or topics to aid in business improvement.

2024-10-26 Tags: embedding, clustering, survey analysis, data science, visualization, k-means, tsne by klotz
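
The embed-then-cluster process described above can be sketched with a small k-means implementation. This is a minimal illustration, not the guide's code: the two-dimensional vectors stand in for real OpenAI embeddings, and the sample responses, the initial centroids, and the function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def k_means(vectors, initial_centroids, max_iters=100):
    """Lloyd's algorithm; returns (assignments, centroids)."""
    centroids = [list(c) for c in initial_centroids]
    assignments = None
    for _ in range(max_iters):
        # Assignment step: attach each vector to its nearest centroid.
        new_assignments = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(v, centroids[k]))
            for v in vectors
        ]
        if new_assignments == assignments:
            break  # converged: no vector changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignments) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = mean_vector(members)
    return assignments, centroids


# Hypothetical survey responses with made-up 2-D "embeddings".
responses = [
    "Shipping was slow",
    "Delivery took too long",
    "Great support team",
    "Helpful customer service",
]
embeddings = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
assignments, centroids = k_means(embeddings, [embeddings[0], embeddings[2]])
for text, cluster in zip(responses, assignments):
    print(cluster, text)
```

Once responses are grouped, the themes for each cluster still have to be named, for example by reading a few responses per cluster.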
