Developers are replacing bloated MCP servers with Markdown skill files, cutting token costs by up to 100x. This article explores a two-layer architecture emerging in production AI systems that separates knowledge from execution: skills (Markdown files) encode stable knowledge, while MCP servers handle runtime API interactions. The piece advocates this layered approach to optimize context window usage, reduce costs, and improve agent reasoning by keeping knowledge in a version-controlled, easily accessible format.
This article details how to use Ollama to run large language models locally, protecting sensitive data by keeping it on your machine. It covers installation, usage with Python, LangChain, and LangGraph, and provides a practical example with FinanceGPT, while also discussing the tradeoffs of using local LLMs.
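To make the local-first workflow concrete, here is a minimal sketch of calling Ollama's REST API (served by default at `http://localhost:11434`) from Python using only the standard library. The model name and prompt are illustrative, not taken from the article:

```python
import json
import urllib.request

# Build a non-streaming chat request for Ollama's /api/chat endpoint.
# "llama3" is an example model tag; use whatever you have pulled locally.
payload = {
    "model": "llama3",
    "messages": [{"role": "user", "content": "Summarize this quarterly report."}],
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# resp = urllib.request.urlopen(req)  # uncomment with a running Ollama instance
```

Because the request never leaves localhost, the prompt and any documents it contains stay on your machine.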
Google is announcing the public preview of the Developer Knowledge API and its associated Model Context Protocol (MCP) server. These tools provide a machine-readable gateway to Google’s official developer documentation, enabling AI assistants to access accurate and up-to-date information for building with Google technologies like Firebase, Android, and Google Cloud.
A collection of prompts designed to be used with AI coding assistants to build various use cases, ranging from personal CRM and knowledge bases to content pipelines and social media research.
This post breaks down why MCP servers fail, six best practices for building ones that work, and how Skills and MCP complement each other. It emphasizes designing MCP servers as user interfaces for AI agents, focusing on outcomes, flattened arguments, clear instructions, curation, discoverable naming, and pagination.
* **Focus on Outcomes, Not Operations:** Instead of exposing granular API endpoints as tools, create high-level tools that deliver the *result* the agent needs.
* **Flatten Arguments:** Use simple, typed arguments instead of complex nested structures.
* **Instructions are Context:** Leverage docstrings and error messages to provide clear guidance to the agent.
* **Curate Ruthlessly:** Limit the number of tools exposed and focus on essential functionality.
* **Name Tools for Discovery:** Use a consistent naming convention (`service_action_resource`) to improve discoverability.
* **Paginate Large Results:** Avoid overwhelming the agent with large datasets; use pagination with metadata.
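A minimal sketch of what several of these practices might look like in a single tool definition. The tool name, `Page` structure, and cursor scheme are illustrative assumptions, not code from the article:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Page:
    items: list
    next_cursor: Optional[str]  # metadata telling the agent more results exist
    total: int

# Named service_action_resource for discoverability; arguments are flat,
# typed scalars rather than nested objects.
def crm_search_contacts(query: str, limit: int = 20, cursor: Optional[str] = None) -> Page:
    """Search CRM contacts by name or email substring.

    Returns up to `limit` matches; pass `next_cursor` back in `cursor`
    to fetch the following page. (A docstring like this doubles as
    context the agent reads when deciding how to call the tool.)
    """
    contacts = _fetch_all_matching(query)  # hypothetical backend call
    start = int(cursor) if cursor else 0
    chunk = contacts[start:start + limit]
    nxt = str(start + limit) if start + limit < len(contacts) else None
    return Page(items=chunk, next_cursor=nxt, total=len(contacts))

def _fetch_all_matching(query: str) -> list:
    # Stand-in for a real data source.
    data = [{"name": f"user{i}", "email": f"user{i}@example.com"} for i in range(45)]
    return [c for c in data if query in c["name"]]
```

Note that the tool returns the *outcome* (matching contacts plus pagination metadata) rather than mirroring a chain of low-level CRM endpoints.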
This article provides a comprehensive guide on implementing the Model Context Protocol (MCP) with Ollama and Llama 3, covering practical implementation steps and use cases.
A guide to setting up local LLMs on Linux using LLaMA.cpp, llama-server, llama-swap, and QwenCode for various workflows like chat, coding, and data analysis.
This section details how to load and use multiple models with the llama.cpp server. It covers configuring the server to handle multiple models, the model path format, and considerations for memory usage.
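As a rough illustration, a llama-swap configuration for serving two models behind one endpoint might look like the sketch below. The model names and paths are made up, and key names should be checked against the llama-swap README before use:

```yaml
# Hypothetical llama-swap config: each entry wraps a llama-server command.
# llama-swap loads a model on demand and swaps it out when another is
# requested, so only one model occupies VRAM at a time.
models:
  "qwen-coder":
    cmd: llama-server --port ${PORT} -m /models/qwen2.5-coder-7b.gguf
    ttl: 300   # unload after 5 idle minutes to free memory
  "llama-chat":
    cmd: llama-server --port ${PORT} -m /models/llama-3-8b-instruct.gguf
    ttl: 300
```

Clients then select a model via the standard `model` field in their OpenAI-compatible requests, and llama-swap handles loading and unloading.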
This article compares Model Context Protocol (MCP), Function Calling, and OpenAPI Tools for integrating tools and resources with language models, outlining their strengths, limits, security considerations, and ideal use cases.
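For contrast with MCP's server-side tools, a function-calling integration typically describes each tool inline with the request as a JSON schema. The sketch below uses the OpenAI-style format with an invented `get_weather` function:

```python
import json

# Illustrative tool definition in the OpenAI-style function-calling format.
# The function name and parameters are examples, not from the article.
tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}
print(json.dumps(tool, indent=2))
```

The model returns a name-plus-arguments payload; executing the call, validating arguments, and handling errors remain the application's responsibility, which is where MCP's standardized client-server contract differs.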
The Model Context Protocol (MCP) is a new open protocol that allows AI models to interact with external systems in a standardized, extensible way. In this tutorial, you’ll install MCP, explore its client-server architecture, and work with its core concepts: prompts, resources, and tools.
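At the wire level, MCP messages are JSON-RPC 2.0. A `tools/list` exchange, by which a client discovers a server's tools, might look like the sketch below; the tool entry is illustrative, not from a real server:

```python
# Illustrative JSON-RPC 2.0 messages for MCP's tools/list method.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "search_docs",
                "description": "Search the documentation index.",
                "inputSchema": {
                    "type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"],
                },
            }
        ]
    },
}
```

Prompts and resources are exposed through analogous methods (`prompts/list`, `resources/list`), so a client can enumerate everything a server offers before invoking anything.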