SemanticScuttle - klotz.me » Tags: machine learning+nlp

Tags: machine learning* + nlp*


This article demonstrates text summarization with the scikit-llm library, which exposes large language models through a scikit-learn-style interface. The guide walks through installing the necessary dependencies and implementing both extractive and abstractive summarization techniques on sample text data.
Key topics include:
- Introduction to the scikit-llm library
- Implementing abstractive summarization using LLMs
- Using scikit-llm for text classification and clustering tasks
- Practical code examples for integrating LLM capabilities into machine learning pipelines

2026-04-28 Tags: text summarization, scikit-llm, llm, nlp, python, machine learning by klotz

Using a Local LLM as a Zero-Shot Classifier

A practical pipeline for classifying messy free-text data into meaningful categories using a locally hosted LLM, no labeled training data required.

2026-04-24 Tags: braden riggs, localllama, llm, zero-shot, classification, text, nlp by klotz

Getting Started with Zero-Shot Text Classification

Learn how to label text without the need for task-specific training data by using zero-shot text classification. This guide explains how pretrained transformer models, such as BART, reframe classification as a reasoning task where labels are treated as natural language statements.
Key topics include:
* The core concept of zero-shot classification and its advantages for rapid prototyping.
* Using the Hugging Face transformers pipeline with the facebook/bart-large-mnli model.
* Implementing multi-label classification for texts belonging to multiple categories.
* Improving accuracy through custom hypothesis template tuning and clear label wording.
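The mechanics summarized above (labels turned into natural-language hypotheses, then scored by an NLI model) can be sketched without loading a model. This is a minimal illustration, not the guide's code: the template string, the label names, and the logit values are all hypothetical placeholders, and a real run would get the logits from a model such as facebook/bart-large-mnli.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def build_hypotheses(labels, template="This example is {}."):
    """Turn each candidate label into a natural-language hypothesis."""
    return [template.format(label) for label in labels]


def single_label_scores(entailment_logits):
    """Single-label mode: labels compete, so scores sum to 1."""
    return softmax(entailment_logits)


def multi_label_scores(logit_pairs):
    """Multi-label mode: each label is scored on its own.

    Each pair is (contradiction_logit, entailment_logit); the score is
    the entailment probability within that pair, so scores need not sum to 1.
    """
    return [softmax([contra, entail])[1] for contra, entail in logit_pairs]


# Hypothetical labels and made-up logits, for illustration only.
labels = ["sports", "politics", "technology"]
pairs = [(-1.0, 2.0), (1.5, -0.5), (-0.5, 1.0)]

hypotheses = build_hypotheses(labels)
exclusive = single_label_scores([entail for _, entail in pairs])
independent = multi_label_scores(pairs)

for label, hyp, s_one, s_multi in zip(labels, hypotheses, exclusive, independent):
    print(f"{label:<12} {hyp:<32} single={s_one:.3f} multi={s_multi:.3f}")
```

Rewording the template (for example, "This text is about {}.") changes the hypotheses the model sees, which is why template and label wording affect accuracy.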

2026-04-23 Tags: zero-shot text classification, transformer models, nlp, hugging face, bart, machine learning, text, solon by klotz

Maths, CS & AI Compendium

This is an open, unconventional textbook covering mathematics, computing, and artificial intelligence from foundational principles. It's designed for practitioners seeking a deep understanding, moving beyond exam preparation and focusing on real-world application. The author, drawing from years of experience in AI/ML, has compiled notes that prioritize intuition, context, and clear explanations, avoiding dense notation and outdated material.
The compendium covers a broad range of topics, from vectors and matrices to machine learning, computer vision, and multimodal learning, with future chapters planned for areas like data structures and AI inference.

2026-03-28 Tags: python, nlp, computer science, machine learning, statistics, reinforcement learning, computer vision, deep learning, math, algorithms, linear algebra, probability, mathematics, artificial intelligence, speech processing, multimodal-learning, jax, ai textbook by klotz

A Beginner’s Reading List for Large Language Models for 2026

A curated reading list for those starting to learn about Large Language Models (LLMs), covering foundational concepts, practical applications, and future trends, updated for 2026.

2026-02-06 Tags: llm, machine learning, deep learning, nlp, reading list, 2026 by klotz

GenAI_Agents

This repository provides tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive AI systems.

2026-01-02 Tags: agents, nlp, llm, machine learning, natural language processing by klotz

The Optimal Architecture for Small Language Models

This article details research into finding the optimal architecture for small language models (70M parameters), exploring depth-width tradeoffs, comparing different architectures, and introducing Dhara-70M, a diffusion model offering 3.8x faster throughput with improved factuality.

2025-12-27 Tags: llm, nlp, small language models, architecture, diffusion, llama, gemma, deep learning by klotz

Choosing the Right Chunking Strategy: A Comprehensive Guide to RAG Optimization

This article explores different chunking strategies for Retrieval-Augmented Generation (RAG) systems, comparing nine approaches using the agenticmemory library to improve retrieval accuracy and reduce hallucinations.
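As a point of reference for what a chunking strategy does, here is a minimal sketch of one common baseline: fixed-size character chunks with overlap. It is an assumption for illustration only. It does not use the agenticmemory library, and it is not necessarily one of the nine approaches the article compares. The function name and default sizes are hypothetical.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text,
        # so no trailing chunk merely repeats already-covered characters.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


print(chunk_text("0123456789", chunk_size=4, overlap=2))
```

Overlap trades index size for context: neighbouring chunks repeat some text so that a sentence split at a boundary still appears whole in at least one chunk.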

2025-12-22 Tags: llm, performance, rag, chunking, embedding, vector database, rag optimization by klotz

Command Line Utility | Embedding Atlas

This page details the command-line utility for the Embedding Atlas, a tool for exploring large text datasets with metadata. It covers installation, data loading (local and Hugging Face), visualization of embeddings using SentenceTransformers and UMAP, and usage instructions with available options.

2025-08-13 Tags: embedding, text, data, visualization, umap, sentence transformers, command line, hugging face, parquet, duckdb by klotz

Topic Model Labelling with LLMs

A Python tutorial on reproducible labeling of topic models with GPT-4o mini. The article details training a FASTopic model and labeling its results with GPT-4o mini, emphasizing reproducibility and control over the labeling process.

2025-07-15 Tags: llm, machine learning, nlp, python, topic modeling, fastopic, turftopic, gpt-4, classification by klotz

