* **Structured Outputs:** Uses grammar-constrained decoding (logit biasing/masking) to enforce strict JSON schema compliance during inference. Best for deterministic data transformation.
* **Function Calling:** Utilizes instruction tuning to enable model reasoning over tool definitions. Best for agentic workflows and external state mutation.
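The logit-masking mechanism behind structured outputs can be illustrated with a toy sketch. Everything here is hypothetical: a tiny hand-written vocabulary and a finite-state "grammar" stand in for a real tokenizer and JSON-schema-compiled grammar, and the model's logits are faked. The point is only to show that masking disallowed tokens to `-inf` before sampling makes schema violations impossible, regardless of what the model prefers.

```python
import math

# Toy vocabulary and a tiny grammar (FSM) that only accepts
# the token sequence: {  "name"  :  "alice"|"bob"  }
VOCAB = ['{', '"name"', ':', '"alice"', '"bob"', '}', 'oops']
GRAMMAR = {  # decoding state -> set of allowed token ids
    0: {0},     # expect '{'
    1: {1},     # expect '"name"'
    2: {2},     # expect ':'
    3: {3, 4},  # expect a string value
    4: {5},     # expect '}'
}

def mask_logits(logits, allowed):
    """Set disallowed token logits to -inf so they can never be chosen."""
    return [l if i in allowed else -math.inf for i, l in enumerate(logits)]

def constrained_decode(step_logits):
    """Greedy decode, masking logits at every step per the grammar state."""
    out = []
    for state, logits in enumerate(step_logits):
        masked = mask_logits(logits, GRAMMAR[state])
        out.append(VOCAB[max(range(len(masked)), key=masked.__getitem__)])
    return ''.join(out)

# The fake "model" strongly prefers the invalid token 'oops' at every
# step, but masking guarantees the output still matches the schema.
steps = [[0.1] * 6 + [9.9] for _ in range(5)]
print(constrained_decode(steps))  # -> {"name":"alice"}
```

This is why the table below can claim 100% schema compliance for structured outputs: validity is enforced mechanically at decode time, not learned statistically.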
| Feature | Structured Outputs | Function Calling |
| :--- | :--- | :--- |
| **Mechanism** | Constrained decoding (Grammar/Regex) | Instruction-tuned intent detection |
| **Reliability** | 100% Schema Compliance | Probabilistic (requires retry logic) |
| **Primary Use Case** | ETL, Query Gen, Reasoning traces | API Triggers, RAG, Task Routing |
| **Latency/Cost** | Low overhead; optimized decoding | Higher overhead due to tool-definition tokens |
* **ETL & Extraction:** Use Structured Outputs to ensure downstream parsers never fail on malformed JSON.
* **Agentic Loops:** Use Function Calling for multi-turn interactions where the model must decide *which* tool to invoke based on context.
* **Hybrid Pattern (Controller/Formatter):** Deploy a "Function Calling" agent as the **Controller** to select tools, then pipe results through a "Structured Output" layer as the **Formatter** to ensure clean data ingestion into databases or UIs.
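The controller/formatter split above can be sketched in a few lines. This is a minimal, hypothetical mock: the `controller` function stands in for a function-calling LLM (a real one would emit a tool-call message), the `TOOLS` registry and its entries are invented, and `formatter` stands in for a structured-output call validated against a minimal "schema" (a required-keys set). Only the shape of the pipeline is the point.

```python
import json

# Hypothetical tool registry the "controller" agent can choose from.
TOOLS = {
    "get_weather": lambda city: {"city": city, "temp_c": 21},
    "get_stock":   lambda ticker: {"ticker": ticker, "price": 101.5},
}

def controller(user_msg):
    """Stand-in for a function-calling LLM: decide *which* tool to invoke."""
    if "weather" in user_msg:
        return "get_weather", user_msg.split()[-1]
    return "get_stock", user_msg.split()[-1]

REQUIRED_KEYS = {"tool", "result"}  # minimal stand-in for a JSON schema

def formatter(tool_name, raw_result):
    """Stand-in for a structured-output layer: guarantee the final
    payload matches the schema before it reaches a database or UI."""
    payload = {"tool": tool_name, "result": raw_result}
    assert set(payload) == REQUIRED_KEYS  # schema-compliance gate
    return json.dumps(payload, sort_keys=True)

def handle(user_msg):
    name, arg = controller(user_msg)
    return formatter(name, TOOLS[name](arg))

print(handle("weather in Paris"))
```

The design choice mirrors the table: the probabilistic decision (which tool, which arguments) lives in the controller, while the deterministic guarantee (valid, parseable JSON) lives in the formatter, so retry logic only ever needs to wrap the controller.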
This article explains the concept of 'skills' in the context of language models, detailing how to create and use them to enhance model capabilities. It covers the file structure, YAML configuration, and integration of scripts for task automation, providing a practical guide for developers.
We introduce the Ministral 3 series, a family of parameter-efficient dense language models designed for compute- and memory-constrained applications, available in three model sizes: 3B, 8B, and 14B parameters. For each model size, we release three variants: a pretrained base model for general-purpose use, an instruction-finetuned model, and a reasoning model for complex problem-solving.
Ollama has partnered with NVIDIA to optimize performance on the new NVIDIA DGX Spark, powered by the GB10 Grace Blackwell Superchip, enabling fast prototyping and running of local language models.
This Perspective outlines ways in which generative artificial intelligence aligns with and supports the core ideas of generative linguistics, and how generative linguistics can provide criteria to evaluate and improve neural language models.
This paper surveys recent replication studies of DeepSeek-R1, focusing on Supervised Fine-Tuning (SFT) and Reinforcement Learning from Verifiable Rewards (RLVR). It details data construction, method design, and training procedures, offering insights and anticipating future research directions for reasoning language models.
An introduction to evaluating language models with easy-to-understand metrics.
Understand the LLM sampling hyperparameters temperature, top-k, top-p, frequency penalty, and presence penalty once and for all, with visual examples.