This article provides a comprehensive guide to implementing the Model Context Protocol (MCP) with Ollama and Llama 3, covering practical setup steps and use cases.
The Ollama 0.14-rc2 release introduces experimental functionality that lets LLMs use tools such as a bash shell and web search on your system, with safeguards including interactive approval and command allow/denylists.
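For a sense of what that tool-use loop looks like in code, here is a minimal sketch using the Ollama Python library's tool-calling API (0.4+, which accepts plain Python functions as tools). The `run_bash` tool and the `ALLOWED_COMMANDS` allowlist are illustrative stand-ins for the release's interactive-approval and allow/denylist safeguards, not its actual implementation.

```python
# Minimal sketch of LLM tool use with the Ollama Python library.
# run_bash and ALLOWED_COMMANDS are illustrative stand-ins for the CLI's safeguards.
import subprocess
import ollama

ALLOWED_COMMANDS = {"ls", "date", "uname"}  # hypothetical allowlist

def run_bash(command: str) -> str:
    """Run a shell command, but only if its first word is on the allowlist."""
    if command.split()[0] not in ALLOWED_COMMANDS:
        return f"Refused: '{command}' is not on the allowlist."
    return subprocess.run(command, shell=True, capture_output=True, text=True).stdout

response = ollama.chat(
    model="llama3.2",  # any locally pulled, tool-capable model
    messages=[{"role": "user", "content": "What operating system am I running?"}],
    tools=[run_bash],  # the client derives the tool schema from the function signature
)

# Execute whatever tool calls the model requested, gated by the allowlist above.
for call in response.message.tool_calls or []:
    if call.function.name == "run_bash":
        print(run_bash(**call.function.arguments))
```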
This article details how to build a 100% local MCP (Model Context Protocol) client using LlamaIndex, Ollama, and Lightning AI. It walks through the code step by step, including setting up an SQLite MCP server and a locally served LLM.
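A hedged sketch of how such a client might be wired together, assuming the `llama-index-llms-ollama` and `llama-index-tools-mcp` packages are installed; the `uvx mcp-server-sqlite` command and the model name are illustrative assumptions, not the article's exact code:

```python
# Hedged sketch: a fully local MCP client built with LlamaIndex + Ollama.
import asyncio
from llama_index.llms.ollama import Ollama
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec
from llama_index.core.agent.workflow import FunctionAgent

async def main():
    # Connect to a local MCP server over stdio (here: a hypothetical SQLite server).
    mcp_client = BasicMCPClient("uvx", args=["mcp-server-sqlite", "--db-path", "demo.db"])
    tools = await McpToolSpec(client=mcp_client).to_tool_list_async()

    # A locally served model via Ollama acts as the agent's LLM.
    agent = FunctionAgent(
        tools=tools,
        llm=Ollama(model="llama3.2", request_timeout=120.0),
        system_prompt="Answer questions by querying the SQLite database through the MCP tools.",
    )
    print(await agent.run("How many tables does the database contain?"))

asyncio.run(main())
```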
A tutorial on building a private, offline Retrieval-Augmented Generation (RAG) system that uses Ollama for embeddings and language generation and FAISS for vector storage, keeping your data private and fully under your control. The pipeline breaks down into six components (a minimal end-to-end sketch follows the list):
1. **Document Loader:** Extracts text from various file formats (PDF, Markdown, HTML) while preserving metadata like source and page numbers for accurate citations.
2. **Text Chunker:** Splits documents into smaller text segments (chunks) to stay within token limits and improve retrieval accuracy. It uses overlapping chunks and sentence-boundary detection to maintain context.
3. **Embedder:** Converts text chunks into numerical vectors (embeddings) using the `nomic-embed-text` model via Ollama, which runs locally without internet access.
4. **Vector Database:** Stores the embeddings using FAISS (Facebook AI Similarity Search) for fast similarity search. It uses cosine similarity for accurate retrieval and saves the database to disk for quick loading in future sessions.
5. **Large Language Model (LLM):** Generates answers using the `llama3.2` model via Ollama, also running locally. It takes the retrieved context and the user's question to produce a response with citations.
6. **RAG System Orchestrator:** Coordinates the entire workflow, managing the ingestion of documents (loading, chunking, embedding, storing) and the querying process (retrieving relevant chunks, generating answers).
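The sketch below illustrates steps 3-5 under stated assumptions: it embeds a couple of hard-coded chunks with `nomic-embed-text` via the Ollama Python client, indexes them in FAISS with cosine similarity, and asks `llama3.2` to answer from the retrieved context. It shows the general pattern, not the article's implementation.

```python
# Minimal RAG sketch: embed chunks with Ollama, index in FAISS (cosine similarity),
# retrieve the best match, and answer with llama3.2. Loading/chunking (steps 1-2)
# are assumed to have produced the `chunks` list already.
import faiss
import numpy as np
import ollama

chunks = [
    "FAISS is a library for efficient similarity search over dense vectors.",
    "nomic-embed-text is an open embedding model that can run locally via Ollama.",
]

def embed(texts):
    """Embed texts with nomic-embed-text and L2-normalize so inner product = cosine similarity."""
    vecs = np.array(
        [ollama.embeddings(model="nomic-embed-text", prompt=t)["embedding"] for t in texts],
        dtype="float32",
    )
    faiss.normalize_L2(vecs)
    return vecs

# Step 4: inner-product index over normalized vectors, i.e. cosine similarity.
vectors = embed(chunks)
index = faiss.IndexFlatIP(int(vectors.shape[1]))
index.add(vectors)
faiss.write_index(index, "kb.faiss")  # persist to disk for quick loading next session

# Query: retrieve the top chunk and hand it to the LLM as context.
question = "Which embedding model runs locally?"
_, ids = index.search(embed([question]), 1)
context = chunks[ids[0][0]]

answer = ollama.chat(
    model="llama3.2",
    messages=[{
        "role": "user",
        "content": f"Context:\n{context}\n\nQuestion: {question}\nCite the context you used.",
    }],
)
print(answer["message"]["content"])
```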
This article details how the author successfully ran OpenAI's Codex CLI against a gpt-oss:120b model hosted on an NVIDIA DGX Spark, accessed through a Tailscale network. It covers the Tailscale setup, the Ollama configuration, and the process of running the Codex CLI against the remote model, including building a Space Invaders game.
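As a rough illustration of the client side, the Ollama Python library can point at a remote host like this; the Tailscale MagicDNS hostname is hypothetical, and the remote server needs to be started with `OLLAMA_HOST=0.0.0.0` so it listens beyond localhost:

```python
# Hedged sketch: querying an Ollama instance on a remote machine (e.g. a DGX Spark
# reachable over Tailscale). The hostname below is an illustrative placeholder.
from ollama import Client

client = Client(host="http://spark.tailnet-name.ts.net:11434")  # hypothetical MagicDNS name
response = client.chat(
    model="gpt-oss:120b",
    messages=[{"role": "user", "content": "Write a one-line plan for a Space Invaders clone."}],
)
print(response["message"]["content"])
```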
Learn to deploy your own local LLM service using Docker containers for maximum security and control, whether you're running on a CPU, an NVIDIA GPU, or an AMD GPU.
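A minimal sketch of the NVIDIA GPU case using the Docker SDK for Python (the programmatic equivalent of a `docker run`); the container name, volume, and port mirror Ollama's defaults, and dropping `device_requests` gives the CPU-only variant. This is an assumption-laden illustration, not the article's exact command:

```python
# Hedged sketch: launch the official ollama/ollama container with GPU access
# via the Docker SDK for Python (pip install docker).
import docker

client = docker.from_env()
container = client.containers.run(
    "ollama/ollama",
    name="ollama",
    detach=True,
    ports={"11434/tcp": 11434},  # Ollama's default API port
    volumes={"ollama": {"bind": "/root/.ollama", "mode": "rw"}},  # persist downloaded models
    device_requests=[docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])],  # all NVIDIA GPUs
)
print(container.name, container.status)
```

For AMD GPUs, the ROCm build of the image (`ollama/ollama:rocm`) is the usual starting point, though the article's exact setup may differ.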
This article details how to set up an email triage system using Home Assistant and a local Large Language Model (LLM) to summarize and categorize incoming emails, reducing inbox clutter and improving email management. It covers the setup of a REST command to interface with Ollama, the automation process, and the benefits of using a local LLM for privacy.
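The kind of request such a `rest_command` ends up sending to Ollama's `/api/generate` endpoint looks roughly like this; the categories and prompt wording are illustrative, not the article's configuration:

```python
# Sketch of the HTTP call a Home Assistant rest_command would make to Ollama
# to summarize and classify an incoming email. Categories are illustrative.
import json
import requests

email_body = "Your parcel is out for delivery and will arrive today between 2 and 4 pm."

payload = {
    "model": "llama3.2",
    "prompt": (
        "Summarize this email in one sentence and classify it as one of "
        "[urgent, personal, newsletter, notification, spam]:\n\n" + email_body
    ),
    "stream": False,   # return a single JSON object instead of a token stream
    "format": "json",  # constrain the reply to JSON for easy templating in automations
}

resp = requests.post("http://localhost:11434/api/generate", json=payload, timeout=120)
print(json.loads(resp.json()["response"]))
```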
This article details how to set up a weather report on a Home Assistant dashboard using a local LLM (Ollama) for more user-friendly summaries and clothing suggestions, avoiding cloud-based services for privacy reasons. It covers the setup process, prompt engineering, and hardware considerations.
This article details 7 lessons the author learned while self-hosting Large Language Models (LLMs), covering topics like the importance of memory bandwidth, quantization, electricity costs, hardware choices beyond NVIDIA, prompt engineering, Mixture of Experts models, and starting with simpler tools like LM Studio.
Learn how to run and fine-tune Mistral Devstral 1.1, including Small-2507 and 2505. This guide covers official recommended settings, tutorials for running Devstral in Ollama and llama.cpp, experimental vision support, and fine-tuning with Unsloth.
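For orientation, the general Unsloth pattern looks like the sketch below; the model id `unsloth/Devstral-Small-2507` and the LoRA hyperparameters are assumptions rather than the guide's official recommended settings:

```python
# Hedged sketch of the Unsloth fine-tuning pattern: load a 4-bit Devstral checkpoint
# and attach LoRA adapters. Model id and hyperparameters are illustrative assumptions.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Devstral-Small-2507",  # assumed Hugging Face repo id
    max_seq_length=4096,
    load_in_4bit=True,                         # QLoRA-style 4-bit base weights
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,                                      # LoRA rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)
# From here, a standard TRL SFTTrainer run over your dataset completes the fine-tune.
```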