This article guides you through building a local RAG (Retrieval-Augmented Generation) system using Llama 3, with Ollama for model management and LlamaIndex as the RAG framework. The tutorial shows how to get a basic local RAG system up and running in just a few lines of code.
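A minimal sketch of what those few lines can look like, assuming an Ollama server is running locally with the `llama3` and `nomic-embed-text` models pulled, the `llama-index`, `llama-index-llms-ollama`, and `llama-index-embeddings-ollama` packages installed, and a placeholder `data/` directory of documents:

```python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama

# Point LlamaIndex at models served by the local Ollama instance
Settings.llm = Ollama(model="llama3", request_timeout=120.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

# Load documents, build a vector index, and ask a question over it
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
print(query_engine.query("Summarize the documents in one paragraph."))
```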
A CLI tool for interacting with local or remote LLMs to retrieve information about files, execute queries, and perform other tasks using Retrieval-Augmented Generation (RAG).
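Purely as an illustration of the idea — the tool's actual interface isn't shown here, so the script, flags, and defaults below are all hypothetical — a RAG-style CLI along these lines can be assembled from the same LlamaIndex and Ollama pieces:

```python
import argparse

from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama


def main() -> None:
    # Hypothetical flags and defaults, not the actual tool's interface
    parser = argparse.ArgumentParser(description="Ask questions about local files")
    parser.add_argument("query", help="question to ask about the indexed files")
    parser.add_argument("--path", default=".", help="directory of files to index")
    parser.add_argument("--model", default="llama3", help="Ollama model to use")
    args = parser.parse_args()

    # Use local Ollama models for both generation and embeddings
    Settings.llm = Ollama(model=args.model, request_timeout=120.0)
    Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

    # Index the target directory, then answer the query RAG-style
    documents = SimpleDirectoryReader(args.path).load_data()
    index = VectorStoreIndex.from_documents(documents)
    print(index.as_query_engine().query(args.query))


if __name__ == "__main__":
    main()
```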
LlamaIndex comes with built-in indexing that lets developers index large datasets efficiently. This makes it easier to search and retrieve information from those datasets, improving the overall performance of LLM-based applications.
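For example — assuming the default local vector store and a `./storage` directory of our choosing — an index can be built once, persisted to disk, and reloaded later without re-parsing or re-embedding the source documents:

```python
from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

# Build the index once and persist it to disk
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
index.storage_context.persist(persist_dir="./storage")

# Later, reload the index without redoing the embedding work
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
```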
A step-by-step guide to deploying LlamaIndex RAGs on AWS ECS Fargate.
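What typically gets shipped to Fargate is the RAG pipeline wrapped in a containerized HTTP service. A minimal sketch of such a service — FastAPI is our choice here, the persisted `./storage` index and the Ollama sidecar container in the same ECS task are assumptions, not details from the guide:

```python
from fastapi import FastAPI
from llama_index.core import Settings, StorageContext, load_index_from_storage
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama

# Assumption: an Ollama container runs as a sidecar in the same ECS task
Settings.llm = Ollama(model="llama3", base_url="http://localhost:11434")
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

# Load an index that was built and persisted at image-build time
storage_context = StorageContext.from_defaults(persist_dir="./storage")
query_engine = load_index_from_storage(storage_context).as_query_engine()

app = FastAPI()


@app.get("/query")
def query(q: str) -> dict:
    # e.g. GET /query?q=What+is+in+these+docs
    return {"answer": str(query_engine.query(q))}
```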
A deep dive into model quantization with GGUF and llama.cpp, and model evaluation with LlamaIndex.
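To give a flavor of the evaluation side: once llama.cpp's tooling has produced a quantized GGUF file, the model can be loaded through LlamaIndex and scored with its built-in evaluators. A sketch, assuming the `llama-index-llms-llama-cpp` package is installed and a Q4_K_M GGUF exists at the placeholder path; the query, response, and context strings are illustrative only:

```python
from llama_index.core.evaluation import FaithfulnessEvaluator
from llama_index.llms.llama_cpp import LlamaCPP

# Load a GGUF model produced by llama.cpp quantization (path is a placeholder)
llm = LlamaCPP(
    model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf",
    context_window=8192,
    max_new_tokens=256,
)

# Check whether a generated answer is supported by the retrieved context,
# using the quantized model itself as the judge
evaluator = FaithfulnessEvaluator(llm=llm)
result = evaluator.evaluate(
    query="What is GGUF?",
    response="GGUF is the file format llama.cpp uses for quantized model weights.",
    contexts=["llama.cpp stores quantized model weights in the GGUF file format."],
)
print(result.passing, result.score)
```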