SemanticScuttle - klotz.me » klotz: nlp+python

klotz: nlp* + python*

Google’s LangExtract: A Critical Review from the Trenches

This review examines Google’s LangExtract, a library designed to solve the "production nightmare" of inconsistent data extraction from large documents using standard LLM APIs.

* **Source Grounding:** Maps entities back to original text to prevent hallucinations.
* **Smart Chunking:** Splits long text at natural boundaries to preserve context.
* **Parallel Processing:** Uses `max_workers` to reduce latency.
* **Multi-pass Extraction:** Runs multiple cycles and merges results for higher accuracy.
* **Visual Interface:** Provides interactive highlighting of extracted data.

**Result:** The author successfully transformed a messy 15,000-character meeting transcript into clean, structured JSON.
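
To make the chunking and source-grounding ideas concrete, here is a minimal standard-library sketch. It does not use LangExtract or call any LLM: `chunk_text` and `ground` are hypothetical helper names, the sentence-splitting regex is a simplification, and the "extractions" are hand-written stand-ins for model output. It only illustrates splitting at sentence boundaries and mapping extracted strings back to character offsets so that unmatched values can be flagged.

```python
import re


def chunk_text(text, max_chars):
    """Greedily pack whole sentences into chunks of at most max_chars.

    Assumption: a sentence ends at ., ! or ? followed by whitespace.
    A single sentence longer than max_chars is kept whole.
    """
    sentences = re.findall(r".+?(?:[.!?]+\s+|$)", text, flags=re.S)
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks


def ground(source, extractions):
    """Map (label, value) pairs to character offsets in source.

    Values that do not appear verbatim are returned separately so they
    can be reviewed as possible hallucinations.
    """
    grounded, ungrounded = [], []
    for label, value in extractions:
        start = source.find(value)
        if start == -1:
            ungrounded.append((label, value))
        else:
            grounded.append(
                {"label": label, "text": value,
                 "start": start, "end": start + len(value)}
            )
    return grounded, ungrounded


text = ("Alice will send the report. Bob approved the budget! "
        "Next sync is Friday.")
chunks = chunk_text(text, max_chars=40)

# Hand-written stand-ins for what a model might return.
fake_extractions = [("owner", "Alice"), ("owner", "Carol")]
grounded, ungrounded = ground(text, fake_extractions)
print(chunks)
print(grounded)
print(ungrounded)
```

In this toy run "Carol" never appears in the transcript, so it lands in the ungrounded list instead of being silently accepted.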

2026-04-04 Tags: langextract, llm, python, google, named entity recognition, text processing, extraction, nlp by klotz

Maths, CS & AI Compendium

This is an open, unconventional textbook covering mathematics, computing, and artificial intelligence from foundational principles. It's designed for practitioners seeking a deep understanding, moving beyond exam preparation and focusing on real-world application. The author, drawing from years of experience in AI/ML, has compiled notes that prioritize intuition, context, and clear explanations, avoiding dense notation and outdated material.
The compendium covers a broad range of topics, from vectors and matrices to machine learning, computer vision, and multimodal learning, with future chapters planned for areas like data structures and AI inference.

2026-03-28 Tags: python, nlp, computer science, machine learning, statistics, reinforcement learning, computer vision, deep learning, math, algorithms, linear algebra, probability, mathematics, artificial intelligence, speech processing, multimodal-learning, jax, ai textbook by klotz

Atomic Language Model

An extremely lightweight universal grammar implementation with provable recursion, based on Chomsky's Minimalist Grammar theory, fitting in under 50kB with zero runtime dependencies. It includes a probabilistic language model extension and formal verification.

2025-08-22 Tags: language model, recursive grammar, minimalist grammar, formal verification, chomsky, rust, python, probabilistic, github, sky purps, nlp, enterprise neurosystems, uganda by klotz

Topic Model Labelling with LLMs

Python tutorial for reproducible labeling of cutting-edge topic models with GPT-4o mini. The article details training a FASTopic model and labeling its results using GPT-4o mini, emphasizing reproducibility and control over the labeling process.

2025-07-15 Tags: llm, machine learning, nlp, python, topic modeling, fastopic, turftopic, gpt-4, classification by klotz

Dolphin MCP

A flexible Python library and CLI tool for interacting with Model Context Protocol (MCP) servers using OpenAI, Anthropic, and Ollama models.

2025-03-13 Tags: python, model context protocol, nlp, cli, lidolphin eric hartford by klotz

txtai-text-classify.py

A GitHub Gist containing a Python script for text classification using the txtai API.

2024-07-13 Tags: gist, python, txtail, text classification, github, benchmark, llm, gpt, bert by klotz

Diving into Word Embeddings with EDA

Exploratory data analysis (EDA) is a powerful technique to understand the structure of word embeddings, the basis of large language models. In this article, we'll apply EDA to GloVe word embeddings and find some interesting insights.

2024-07-12 Tags: word, embeddings, eda, glove, pca, dimensionality reduction, nlp, text, python by klotz

A Step-by-Step Guide to Representation Finetuning LLAMA3

"The paper introduces a technique called LoReFT (Low-rank Linear Subspace ReFT). Similar to LoRA (Low Rank Adaptation), it uses low-rank approximations to intervene on hidden representations. It shows that linear subspaces contain rich semantics that can be manipulated to steer model behaviors."
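
As a rough illustration of the quoted idea, the LoReFT paper defines the intervention on a hidden state h as Φ(h) = h + Rᵀ(Wh + b − Rh), where R is a low-rank projection with orthonormal rows. The pure-Python sketch below applies that formula to a toy 3-dimensional hidden state. The function names and toy values are illustrative assumptions and are not taken from the linked guide; a real implementation would use a tensor library and learned parameters.

```python
def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def loreft(h, R, W, b):
    """Apply h + R^T (W h + b - R h).

    h: hidden state of length d
    R: r x d projection, assumed to have orthonormal rows
    W: r x d linear map, b: bias of length r
    """
    Rh = matvec(R, h)
    Wh = matvec(W, h)
    # Difference between the target and current values inside the subspace.
    delta = [wh + bi - rh for wh, bi, rh in zip(Wh, b, Rh)]
    # Project the difference back to d dimensions with R^T and add it to h.
    return [
        h[j] + sum(R[i][j] * delta[i] for i in range(len(R)))
        for j in range(len(h))
    ]


# Toy example: a rank-1 subspace aligned with the first coordinate.
h = [1.0, 2.0, 3.0]
R = [[1.0, 0.0, 0.0]]
W = [[0.0, 0.0, 0.0]]
b = [5.0]
edited = loreft(h, R, W, b)
print(edited)  # [5.0, 2.0, 3.0]
```

Only the coordinate inside the subspace spanned by R changes; the other two are left untouched, which is the sense in which the edit is confined to a low-rank linear subspace.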

2024-05-26 Tags: linear subspace, lora, representation, fine tuning, reft, stanford, nlp, python, llm by klotz

microsoft/guidance: A guidance language for controlling large language models.

2023-07-26 Tags: llm, microsoft, guidance, information schema, text extraction, nlp, python, github by klotz

Getting Started with LangChain

2023-06-29 Tags: langchain, llm, python, nlp by klotz
