This tutorial provides a comprehensive coding walkthrough for building an advanced AI pipeline using Microsoft's Phi-4-mini language model. The guide demonstrates how to leverage this compact model for high-performance tasks within resource-constrained environments like Google Colab.
Key topics covered include:
- Setting up 4-bit quantized inference to optimize GPU memory usage.
- Implementing streaming chat and multi-step chain-of-thought reasoning.
- Executing native function (tool) calling for agentic interactions.
- Building a retrieval-augmented generation (RAG) pipeline using FAISS and sentence transformers.
- Performing lightweight LoRA fine-tuning to inject new knowledge into the model.
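The retrieval step of the RAG pipeline above can be sketched without any model downloads. In the snippet below, a bag-of-words count vector and brute-force cosine similarity stand in for sentence-transformers embeddings and a FAISS index; the documents and question are invented for illustration, so swap in the real libraries for actual use.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a sparse bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query
    (brute force, where FAISS would do an indexed search)."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "Phi-4-mini supports 4-bit quantized inference.",
    "FAISS provides fast vector similarity search.",
    "LoRA enables lightweight fine-tuning.",
]
question = "How do I run 4-bit quantized inference?"
context = retrieve(question, docs, k=1)[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
```

The assembled `prompt` is what would then be fed to the quantized Phi-4-mini for grounded generation.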
This article details research into finding the optimal architecture for small language models (70M parameters), exploring depth-width tradeoffs, comparing different architectures, and introducing Dhara-70M, a diffusion model offering 3.8x faster throughput with improved factuality.
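The depth-width tradeoff at a fixed ~70M budget can be made concrete with a back-of-the-envelope parameter count. The sketch below assumes a standard GPT-style decoder (4·d² attention plus 8·d² MLP per block, tied embeddings, a 32k vocabulary); the two configurations are hypothetical, not Dhara-70M's actual shape.

```python
def transformer_params(n_layers: int, d_model: int,
                       vocab: int = 32000, tied: bool = True) -> int:
    """Approximate parameter count of a GPT-style decoder:
    per block, 4*d^2 for attention (Q, K, V, O) plus 8*d^2 for a
    4x-expansion MLP; LayerNorms and biases are ignored as negligible."""
    per_block = 12 * d_model ** 2
    embeddings = vocab * d_model * (1 if tied else 2)
    return n_layers * per_block + embeddings

# Two hypothetical configurations at roughly the same ~70M budget:
deep_narrow = transformer_params(n_layers=32, d_model=384)   # ~68.9M
shallow_wide = transformer_params(n_layers=10, d_model=640)  # ~69.6M
```

A deep-narrow stack spends more of the fixed budget on blocks and less on embeddings, which is exactly the axis a depth-width study varies.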
The article presents rStar-Math, a method demonstrating that small language models (SLMs) can rival or even surpass the math reasoning capability of much larger models such as OpenAI's o1, without distillation from a superior model. rStar-Math performs 'deep thinking' through Monte Carlo Tree Search (MCTS), in which a math policy SLM searches for solutions guided by an SLM-based process reward model. It introduces three innovations: a code-augmented chain-of-thought (CoT) data-synthesis method for training the policy SLM; a process reward model training method that avoids step-level score annotation; and a self-evolution recipe in which both the policy SLM and the process preference model are iteratively improved. Through self-evolution with millions of synthesized solutions for 747k math problems, rStar-Math achieves state-of-the-art math reasoning, significantly improving performance on benchmarks such as MATH and AIME.
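The 'deep thinking' loop can be illustrated with a toy search: a stub policy proposes candidate next steps and a stub process reward model scores each partial trajectory. Beam search is used here as a simplified stand-in for full MCTS, and the arithmetic 'problem' (reach a target from 0 using +1, +3, or *2) replaces real math reasoning; none of this is rStar-Math's actual code.

```python
def propose_steps(value: int) -> list[tuple[str, int]]:
    """Stub policy model: candidate next steps from a partial solution."""
    return [("+1", value + 1), ("+3", value + 3), ("*2", value * 2)]

def process_reward(value: int, target: int) -> float:
    """Stub process reward model: score a partial trajectory
    (here, closeness to the target; the real PRM is a trained SLM)."""
    return -abs(target - value)

def search(start: int, target: int, depth: int = 4, beam: int = 3):
    """Keep the `beam` highest-reward partial trajectories at each depth."""
    frontier = [([], start)]  # (steps taken so far, current value)
    for _ in range(depth):
        candidates = [
            (steps + [op], new_value)
            for steps, value in frontier
            for op, new_value in propose_steps(value)
        ]
        frontier = sorted(candidates,
                          key=lambda c: process_reward(c[1], target),
                          reverse=True)[:beam]
        if process_reward(frontier[0][1], target) == 0:  # solved
            break
    return frontier[0]

steps, value = search(start=0, target=10)
```

The PRM's step-level scores are what let the search prune bad partial solutions early instead of only judging finished answers.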
The article discusses small language models (SLMs) designed for high-quality machine intelligence on resource-constrained devices like smartphones and wearables. It highlights innovations in architectural designs, datasets, and training algorithms that enhance SLMs' efficiency and performance, making AI more accessible.
This article explores NuExtract, a family of Small Language Models (SLMs) for extracting structured data from text. The author, Fabio Matricardi, discusses using NuExtract to process candidate CVs for a database and highlights its benefits for privacy protection and running on less powerful computers.
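NuExtract's interface is template-in, JSON-out: you supply an empty JSON template and raw text, and the model fills in the fields. The toy version below mimics only that interface, with regexes standing in for the SLM; the template keys and the CV text are invented for the example.

```python
import json
import re

TEMPLATE = {"name": "", "email": "", "years_experience": ""}

def extract(text: str, template: dict) -> dict:
    """Toy extractor: fill each template field with a regex match.
    (NuExtract itself infers the fields from the template with an SLM.)"""
    patterns = {
        "name": r"Name:\s*(.+)",
        "email": r"[\w.+-]+@[\w-]+\.[\w.]+",
        "years_experience": r"(\d+)\s+years",
    }
    result = dict(template)
    for field in template:
        match = re.search(patterns[field], text)
        if match:
            result[field] = match.group(1) if match.groups() else match.group(0)
    return result

cv = "Name: Ada Lovelace\nContact: ada@example.com\n12 years of Python experience."
print(json.dumps(extract(cv, TEMPLATE)))
```

Because the output shape is fixed by the template, results can be inserted straight into a candidate database, which is the workflow the article describes.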
This article explores recent trends in LLM research, including multi-modal LLMs, open-source LLMs, domain-specific LLMs, LLM agents, smaller LLMs, and non-Transformer LLMs, citing examples such as OpenAI's Sora, LLM360, BioGPT, StarCoder, and Mamba.