This article guides readers through building an OCR application using the Llama 3.2-Vision model from Ollama, using Python as the programming language. It includes steps for setting up the environment, installing necessary tools, and writing the OCR script.
A tutorial to set up a local, open-source virtual assistant and a code assist feature similar to Copilot using Ollama, Llama3, Continue, and Open WebUI.
A comparison of frameworks, models, and costs for deploying Llama models locally and privately.
- Four tools were analyzed: HuggingFace, vLLM, Ollama, and llama.cpp.
- HuggingFace has a wide range of models but struggles with quantized models.
- vLLM is experimental and lacks full support for quantized models.
- Ollama is user-friendly but has some customization limitations.
- llama.cpp is preferred for its performance and customization options.
- The analysis focused on llama.cpp and Ollama, comparing speed and power consumption across different quantizations.
Ollama now supports HuggingFace GGUF models, making it easier for users to run AI models locally without internet. The GGUF format allows for the use of AI models on modest-sized consumer hardware.
NuExtract is a 3.8B parameter information extraction model fine-tuned from phi-3, designed to extract structured data from text using a JSON template.
A step-by-step guide to run Llama3 locally with Python. Discusses the benefits of running local LLMs, including data privacy, cost-effectiveness, customization, offline functionality, and unrestricted use.
This article explains how to install Ollama, an open-source project for running large language models (LLMs) on a local machine, on Ubuntu Linux. It also covers the system requirements, installation process, and usage of various available LLMs.
pgai brings AI workflows to your PostgreSQL database. It simplifies the process of building search and Retrieval Augmented Generation (RAG) AI applications with PostgreSQL by bringing embedding and generation AI models closer to the database.
This article guides you through the process of building a local RAG (Retrieval-Augmented Generation) system using Llama 3, Ollama for model management, and LlamaIndex as the RAG framework. The tutorial demonstrates how to get a basic local RAG system up and running with just a few lines of code.
Retrochat is chat application that supports Llama.cpp, Kobold.cpp, and Ollama. It highlights new features, commands for configuration, chat management, and models, and provides a download link for the release.