Developers are replacing bloated MCP servers with Markdown skill files, cutting token costs by up to 100x. This article explores a two-layer architecture emerging in production AI systems that separates knowledge from execution: skills (Markdown files) encode stable knowledge, while MCP servers handle runtime API interactions. The piece advocates this layered approach to optimize context window usage, reduce costs, and improve agent reasoning by keeping knowledge in a version-controlled, easily accessible format.
This article details how to use Ollama to run large language models locally, protecting sensitive data by keeping it on your machine. It covers installation, usage with Python, LangChain, and LangGraph, and provides a practical example with FinanceGPT, while also discussing the tradeoffs of using local LLMs.
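To make the local-first workflow concrete, here is a minimal sketch of calling Ollama's REST API (served by default at `http://localhost:11434`) from Python using only the standard library. The model name and prompt are illustrative, not taken from the article:

```python
import json
import urllib.request

# Build a non-streaming chat request for Ollama's /api/chat endpoint.
# "llama3" is an example model tag; use whatever you have pulled locally.
payload = {
    "model": "llama3",
    "messages": [{"role": "user", "content": "Summarize this quarterly report."}],
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# resp = urllib.request.urlopen(req)  # uncomment with a running Ollama instance
```

Because the request never leaves localhost, the prompt and any documents it contains stay on your machine.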
Google is announcing the public preview of the Developer Knowledge API and its associated Model Context Protocol (MCP) server. These tools provide a machine-readable gateway to Google’s official developer documentation, enabling AI assistants to access accurate and up-to-date information for building with Google technologies like Firebase, Android, and Google Cloud.
A collection of prompts designed to be used with AI coding assistants to build various use cases, ranging from personal CRM and knowledge bases to content pipelines and social media research.
This post breaks down why MCP servers fail, six best practices for building ones that work, and how Skills and MCP complement each other. It emphasizes designing MCP servers as user interfaces for AI agents, focusing on outcomes, flattened arguments, clear instructions, curation, discoverable naming, and pagination.
* **Focus on Outcomes, Not Operations:** Instead of exposing granular API endpoints as tools, create high-level tools that deliver the *result* the agent needs.
* **Flatten Arguments:** Use simple, typed arguments instead of complex nested structures.
* **Instructions are Context:** Leverage docstrings and error messages to provide clear guidance to the agent.
* **Curate Ruthlessly:** Limit the number of tools exposed and focus on essential functionality.
* **Name Tools for Discovery:** Use a consistent naming convention (`service_action_resource`) to improve discoverability.
* **Paginate Large Results:** Avoid overwhelming the agent with large datasets; use pagination with metadata.
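A minimal sketch of what several of these practices might look like in a single tool definition. The tool name, `Page` structure, and cursor scheme are illustrative assumptions, not code from the article:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Page:
    items: list
    next_cursor: Optional[str]  # metadata telling the agent more results exist
    total: int

# Named service_action_resource for discoverability; arguments are flat,
# typed scalars rather than nested objects.
def crm_search_contacts(query: str, limit: int = 20, cursor: Optional[str] = None) -> Page:
    """Search CRM contacts by name or email substring.

    Returns up to `limit` matches; pass `next_cursor` back in `cursor`
    to fetch the following page. (A docstring like this doubles as
    context the agent reads when deciding how to call the tool.)
    """
    contacts = _fetch_all_matching(query)  # hypothetical backend call
    start = int(cursor) if cursor else 0
    chunk = contacts[start:start + limit]
    nxt = str(start + limit) if start + limit < len(contacts) else None
    return Page(items=chunk, next_cursor=nxt, total=len(contacts))

def _fetch_all_matching(query: str) -> list:
    # Stand-in for a real data source.
    data = [{"name": f"user{i}", "email": f"user{i}@example.com"} for i in range(45)]
    return [c for c in data if query in c["name"]]
```

Note that the tool returns the *outcome* (matching contacts plus pagination metadata) rather than mirroring a chain of low-level CRM endpoints.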
This article provides a comprehensive guide on implementing the Model Context Protocol (MCP) with Ollama and Llama 3, covering practical implementation steps and use cases.
A guide to setting up local LLMs on Linux using LLaMA.cpp, llama-server, llama-swap, and QwenCode for various workflows like chat, coding, and data analysis.
This section details how to load and use multiple models with the llama.cpp server. It covers configuring the server to handle multiple models, the model path format, and considerations for memory usage.
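As a rough illustration, a llama-swap configuration for serving two models behind one endpoint might look like the sketch below. The model names and paths are made up, and key names should be checked against the llama-swap README before use:

```yaml
# Hypothetical llama-swap config: each entry wraps a llama-server command.
# llama-swap loads a model on demand and swaps it out when another is
# requested, so only one model occupies VRAM at a time.
models:
  "qwen-coder":
    cmd: llama-server --port ${PORT} -m /models/qwen2.5-coder-7b.gguf
    ttl: 300   # unload after 5 idle minutes to free memory
  "llama-chat":
    cmd: llama-server --port ${PORT} -m /models/llama-3-8b-instruct.gguf
    ttl: 300
```

Clients then select a model via the standard `model` field in their OpenAI-compatible requests, and llama-swap handles loading and unloading.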
This article compares Model Context Protocol (MCP), Function Calling, and OpenAPI Tools for integrating tools and resources with language models, outlining their strengths, limits, security considerations, and ideal use cases.
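For contrast with MCP's server-side tools, a function-calling integration typically describes each tool inline with the request as a JSON schema. The sketch below uses the OpenAI-style format with an invented `get_weather` function:

```python
import json

# Illustrative tool definition in the OpenAI-style function-calling format.
# The function name and parameters are examples, not from the article.
tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}
print(json.dumps(tool, indent=2))
```

The model returns a name-plus-arguments payload; executing the call, validating arguments, and handling errors remain the application's responsibility, which is where MCP's standardized client-server contract differs.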
The Model Context Protocol (MCP) is a new open protocol that allows AI models to interact with external systems in a standardized, extensible way. In this tutorial, you’ll install MCP, explore its client-server architecture, and work with its core concepts: prompts, resources, and tools.
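At the wire level, MCP messages are JSON-RPC 2.0. A `tools/list` exchange, by which a client discovers a server's tools, might look like the sketch below; the tool entry is illustrative, not from a real server:

```python
# Illustrative JSON-RPC 2.0 messages for MCP's tools/list method.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "search_docs",
                "description": "Search the documentation index.",
                "inputSchema": {
                    "type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"],
                },
            }
        ]
    },
}
```

Prompts and resources are exposed through analogous methods (`prompts/list`, `resources/list`), so a client can enumerate everything a server offers before invoking anything.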