This article explores extracting structured information effectively and accurately from long unstructured text with LangExtract and LLMs. It pairs Google's LangExtract framework with Gemma 3, Google's open-weight model, demonstrating how to parse an insurance policy to surface details like exclusions.
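A minimal sketch of what such an extraction might look like, using LangExtract's documented `lx.extract` interface (the policy snippet, prompt wording, and exclusion schema here are illustrative assumptions):

```python
import langextract as lx

policy_text = (
    "Section 4. Exclusions. This policy does not cover loss or damage "
    "caused by wear and tear, or by acts of war."
)

# Describe the task; LangExtract is designed to keep extractions tied to source wording.
prompt = "Extract policy exclusions from the insurance text. Use exact wording from the source."

# One few-shot example that fixes the output schema (content is illustrative).
examples = [
    lx.data.ExampleData(
        text="This policy does not cover damage caused by flooding or earthquakes.",
        extractions=[
            lx.data.Extraction(
                extraction_class="exclusion",
                extraction_text="damage caused by flooding or earthquakes",
                attributes={"category": "natural_disaster"},
            )
        ],
    )
]

result = lx.extract(
    text_or_documents=policy_text,
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",  # a local Gemma 3 model can be swapped in instead
)

for extraction in result.extractions:
    print(extraction.extraction_class, "->", extraction.extraction_text)
```

Because LangExtract grounds each extraction in the exact source span, the results remain auditable against the original policy text.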
This tutorial explores implementing the LLM Arena-as-a-Judge approach to evaluate large language model outputs through head-to-head comparisons. It pits OpenAI's GPT-4.1 against Google's Gemini 2.5 Pro, with GPT-5 acting as the judge, in a customer support scenario.
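The judging step itself is a single model call. A minimal sketch with the OpenAI Python client (the candidate responses are assumed to be collected beforehand, and the judge prompt is illustrative, not the tutorial's exact wording):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

question = "My package arrived damaged. What should I do?"
response_a = "..."  # candidate answer from GPT-4.1, collected beforehand
response_b = "..."  # candidate answer from Gemini 2.5 Pro, collected beforehand

judge_prompt = f"""You are an impartial judge. Compare the two customer-support
responses to the user question and pick the better one.

Question: {question}

Response A: {response_a}

Response B: {response_b}

Answer with exactly "A" or "B" followed by a one-sentence justification."""

verdict = client.chat.completions.create(
    model="gpt-5",  # the judge model named in the tutorial
    messages=[{"role": "user", "content": judge_prompt}],
)
print(verdict.choices[0].message.content)
```

In practice the comparison is usually run twice with the response order swapped, to control for the judge's position bias.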
An Apple study shows that large language models (LLMs) can improve performance by using a checklist-based reinforcement learning scheme, similar to a simple productivity trick of checking one's work.
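As a toy illustration of the checklist-feedback idea (not Apple's implementation), the reward can simply be the fraction of checklist items a verifier accepts for a given response:

```python
def checklist_reward(response: str, checklist: list[str], verifier) -> float:
    """Toy RL reward: fraction of checklist items the verifier says are satisfied."""
    passed = sum(1 for item in checklist if verifier(response, item))
    return passed / len(checklist)

# Deliberately crude keyword verifier, purely for illustration.
checklist = ["mentions a refund", "apologizes", "gives a next step"]
verifier = lambda resp, item: item.split()[-1] in resp.lower()

# Prints ~0.33: only the refund item matches this response.
print(checklist_reward("We apologize; a refund is on the way.", checklist, verifier))
```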
This article explains how derivatives, gradients, Jacobians, and Hessians fit together and shows examples of what they are used for, including optimization and rendering.
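For reference, the objects relate as follows: the gradient collects the first derivatives of a scalar field, the Jacobian generalizes it to vector-valued maps, and the Hessian collects the second derivatives (it is the Jacobian of the gradient):

```latex
% f : \mathbb{R}^n \to \mathbb{R} (scalar field), g : \mathbb{R}^n \to \mathbb{R}^m (vector-valued map)
\nabla f(x) = \begin{pmatrix} \partial f / \partial x_1 \\ \vdots \\ \partial f / \partial x_n \end{pmatrix},
\qquad
J_g(x) = \begin{pmatrix}
  \partial g_1 / \partial x_1 & \cdots & \partial g_1 / \partial x_n \\
  \vdots & \ddots & \vdots \\
  \partial g_m / \partial x_1 & \cdots & \partial g_m / \partial x_n
\end{pmatrix},
\qquad
\bigl( H_f(x) \bigr)_{ij} = \frac{\partial^2 f}{\partial x_i \, \partial x_j}
```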
This page details the command-line utility for Embedding Atlas, a tool for exploring large text datasets with metadata. It covers installation, data loading (local files and Hugging Face datasets), visualization of embeddings computed with SentenceTransformers and projected with UMAP, and usage instructions with the available options.
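For orientation, installation and a basic run look roughly like this (the column name is an assumption, and the exact flags should be checked against the tool's `--help` output):

```shell
pip install embedding-atlas

# Explore a local dataset; --text names the column to embed and visualize.
embedding-atlas data.parquet --text text

# Datasets can also be loaded directly from the Hugging Face Hub by name.
embedding-atlas username/dataset-name
```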
Build, enrich, and transform datasets using AI models with no code. This repository provides the source code for Hugging Face AI Sheets, an open-source tool for dataset manipulation using AI.
Google has introduced LangExtract, an open-source Python library designed to help developers extract structured information from unstructured text using large language models such as the Gemini models. The library simplifies the process of converting free-form text into structured data, offering features like controlled generation, text chunking, parallel processing, and integration with various LLMs.
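The chunking and parallel-processing features show up as knobs on the same `lx.extract` call. A hedged sketch for long documents, reusing the prompt and examples from the earlier LangExtract sketch (parameter names as given in the project README, values illustrative):

```python
import langextract as lx

# Long-document extraction: several passes over small chunks, processed in parallel.
result = lx.extract(
    text_or_documents=long_policy_text,  # full document text, assumed loaded elsewhere
    prompt_description=prompt,           # same prompt and examples as the short-text run
    examples=examples,
    model_id="gemini-2.5-flash",
    extraction_passes=3,    # repeat extraction to improve recall on long inputs
    max_workers=20,         # process chunks in parallel
    max_char_buffer=1000,   # chunk size in characters
)
```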
This article provides a gentle introduction to Q-learning, the principles behind it, and the basic characteristics of the algorithm, presented in a clear, illustrative style.
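At its core, Q-learning is a single tabular update rule, Q(s,a) ← Q(s,a) + α·(r + γ·max_a' Q(s',a') − Q(s,a)). A minimal sketch:

```python
import random
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    """One Q-learning step: move Q(s,a) toward r + gamma * max over next actions."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def choose_action(Q, s, actions, epsilon=0.1):
    """Epsilon-greedy selection: explore randomly, otherwise exploit the best Q-value."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

Q = defaultdict(float)  # Q-values default to 0 for unseen (state, action) pairs
actions = ["left", "right"]
q_update(Q, s=0, a="right", r=1.0, s_next=1, actions=actions)
print(Q[(0, "right")])  # 0.1 after one update with these defaults
```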
OpenAI releases gpt-oss-120b and gpt-oss-20b, two state-of-the-art open-weight language models that deliver strong real-world performance at low cost. They outperform similarly sized open models on reasoning tasks and are optimized for efficient deployment.
The article traces the evolution of model inference techniques from 2017 to 2025, from serving models behind simple web frameworks like Flask and FastAPI to dedicated solutions like Triton Inference Server and vLLM. It details the growing demands on inference infrastructure driven by larger, more complex models and the resulting need to optimize throughput, latency, and cost.
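As a reference point for where the stack ends up, the modern offline path through vLLM is only a few lines, using its documented `LLM`/`SamplingParams` interface (the model name is illustrative):

```python
from vllm import LLM, SamplingParams

# vLLM handles continuous batching and paged attention under the hood,
# which is where most of the throughput gains over a Flask-wrapped model come from.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # illustrative model choice
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Summarize the history of model serving."], params)
print(outputs[0].outputs[0].text)
```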