This article guides readers through building an OCR application with the Llama 3.2-Vision model served by Ollama, using Python. It covers setting up the environment, installing the necessary tools, and writing the OCR script.
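The core of such a script is small. A minimal sketch, assuming the `ollama` Python package is installed and an Ollama server with `llama3.2-vision` pulled is running locally (the image filename and prompt wording are illustrative):

```python
def build_ocr_messages(image_path: str,
                       prompt: str = "Transcribe all text in this image.") -> list:
    """Build the chat payload; vision models accept image paths via the `images` field."""
    return [{"role": "user", "content": prompt, "images": [image_path]}]

# Live call, commented out because it needs a running Ollama server:
# import ollama
# response = ollama.chat(model="llama3.2-vision",
#                        messages=build_ocr_messages("receipt.png"))
# print(response["message"]["content"])
```

The model returns the transcription as ordinary chat text, so no OCR-specific parsing is needed beyond reading `message.content`.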
A tutorial to set up a local, open-source virtual assistant and a code assist feature similar to Copilot using Ollama, Llama3, Continue, and Open WebUI.
A comparison of frameworks, models, and costs for deploying Llama models locally and privately.
- Four tools were analyzed: HuggingFace, vLLM, Ollama, and llama.cpp.
- HuggingFace has a wide range of models but struggles with quantized models.
- vLLM is experimental and lacks full support for quantized models.
- Ollama is user-friendly but has some customization limitations.
- llama.cpp is preferred for its performance and customization options.
- The analysis focused on llama.cpp and Ollama, comparing speed and power consumption across different quantizations.
Ollama now supports HuggingFace GGUF models, making it easier to run AI models locally without an internet connection. The GGUF format makes it practical to run quantized models on modest consumer hardware.
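With this support, a GGUF repository on Hugging Face can be run directly by name; a quick sketch (the repository below is illustrative, substitute any GGUF repo):

```shell
# Pull and run a GGUF model straight from Hugging Face:
ollama run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
```

Ollama downloads the GGUF file once and caches it locally, after which no network access is needed.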
NuExtract is a 3.8B parameter information extraction model fine-tuned from phi-3, designed to extract structured data from text using a JSON template.
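NuExtract's prompting convention wraps a JSON template and the input text between special markers; a minimal sketch of building such a prompt, following the format described on the model card (the template fields and text are made up):

```python
import json

def build_nuextract_prompt(template: dict, text: str) -> str:
    """Assemble a NuExtract-style prompt: template and text between
    <|input|> and <|output|> markers, per the model card's format."""
    return (
        "<|input|>\n### Template:\n"
        + json.dumps(template, indent=4)
        + "\n### Text:\n"
        + text
        + "\n<|output|>\n"
    )

# Example: extract a person's name and employer from free text.
prompt = build_nuextract_prompt({"name": "", "company": ""},
                                "Alice joined Acme Corp in 2019.")
```

The string would then be sent to the model (e.g. via Ollama), which completes it with the filled-in JSON after `<|output|>`.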
A step-by-step guide to run Llama3 locally with Python. Discusses the benefits of running local LLMs, including data privacy, cost-effectiveness, customization, offline functionality, and unrestricted use.
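The usual way to talk to a local Llama 3 from Python is Ollama's REST API on port 11434; a minimal sketch using only the standard library (endpoint and model name are Ollama defaults, the prompt is illustrative):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

def build_request(prompt: str, model: str = "llama3") -> dict:
    # stream=False asks for one JSON object instead of newline-delimited chunks
    return {"model": model, "prompt": prompt, "stream": False}

# Live call, commented out because it needs `ollama serve` + `ollama pull llama3`:
# req = urllib.request.Request(
#     OLLAMA_URL,
#     data=json.dumps(build_request("Why is the sky blue?")).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(json.loads(urllib.request.urlopen(req).read())["response"])
```

Since everything stays on localhost, the privacy and offline benefits the article describes come for free.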
This article explains how to install Ollama, an open-source project for running large language models (LLMs) on a local machine, on Ubuntu Linux. It also covers the system requirements, installation process, and usage of various available LLMs.
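On Ubuntu the installation itself is a one-liner using Ollama's official install script, followed by pulling a model (model choice is illustrative):

```shell
# Download and run the official installer (sets up the ollama service):
curl -fsSL https://ollama.com/install.sh | sh

# Verify the install and fetch a model:
ollama --version
ollama run llama3
```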
This article guides you through the process of building a local RAG (Retrieval-Augmented Generation) system using Llama 3, Ollama for model management, and LlamaIndex as the RAG framework. The tutorial demonstrates how to get a basic local RAG system up and running with just a few lines of code.
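Under the hood, the retrieval half of any RAG system is "embed the query, rank the documents, keep the top k" — LlamaIndex wraps exactly this plus the generation call. A framework-free sketch of that mechanic, using a deliberately toy character-frequency embedding (a real setup would use Ollama's embeddings endpoint instead; all names here are hypothetical):

```python
import math

def embed(text: str) -> list:
    # Toy embedding: 26-dim character-frequency vector, for illustration only.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - 97] += 1.0
    return vec

def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list, k: int = 1) -> list:
    # Rank documents by similarity to the query and keep the top k.
    qv = embed(query)
    return sorted(docs, key=lambda d: cosine(qv, embed(d)), reverse=True)[:k]

# The retrieved chunks are then prepended to the prompt sent to Llama 3:
# prompt = f"Context:\n{chunks}\n\nQuestion: {question}"
```

LlamaIndex's `VectorStoreIndex` plus `as_query_engine()` collapses these steps into the "few lines of code" the tutorial promises.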
Retrochat is a chat application that supports Llama.cpp, Kobold.cpp, and Ollama. It highlights new features, commands for configuration, chat management, and models, and provides a download link for the release.
This is a GitHub repository for a Discord bot named discord-llm-chatbot, which lets you chat with Large Language Models (LLMs) directly in your Discord server. It supports various LLMs, including the OpenAI, Mistral, and Anthropic APIs, as well as local backends such as Ollama, oobabooga, Jan, and LM Studio. The bot offers a reply-based chat system, a customizable system prompt, and seamless threading of conversations. It also supports image and text file attachments and streamed responses.