This tutorial provides a step-by-step guide on building an LLM router to balance the use of high-quality closed LLMs like GPT-4 and cost-effective open-source LLMs, achieving high response quality while minimizing costs. The approach includes preparing labeled data, finetuning a causal LLM classifier, and offline evaluation using the RouteLLM framework.
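For a sense of what routing looks like in practice, here is a minimal sketch along the lines of the RouteLLM README: a `Controller` exposes an OpenAI-compatible interface, and the model string selects a router plus a cost threshold. The model identifiers and threshold below are illustrative, not the tutorial's exact settings.

```python
from routellm.controller import Controller

# Route between a strong closed model and a cheap open model.
# Model names and the 0.11593 threshold are illustrative values.
client = Controller(
    routers=["mf"],                      # matrix-factorization router
    strong_model="gpt-4-1106-preview",
    weak_model="groq/llama3-8b-8192",
)

response = client.chat.completions.create(
    model="router-mf-0.11593",           # router name + cost threshold
    messages=[{"role": "user", "content": "Summarize RAG in one sentence."}],
)
print(response.choices[0].message.content)
```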
Llama-agents is an async-first framework for building, iterating on, and productionizing multi-agent systems, including multi-agent communication, distributed tool execution, human-in-the-loop workflows, and more. A rough sketch of the moving parts follows.
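The sketch below is adapted from the project's early README and may not match later releases exactly (class names and parameters have been changing): agents are wrapped as services, communicate over a shared message queue, and are coordinated by a control plane.

```python
from llama_agents import (
    AgentService,
    AgentOrchestrator,
    ControlPlaneServer,
    SimpleMessageQueue,
    LocalLauncher,
)
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

def get_the_secret_fact() -> str:
    """Returns the secret fact."""
    return "The secret fact is: A baby llama is called a 'Cria'."

tool = FunctionTool.from_defaults(fn=get_the_secret_fact)
agent = ReActAgent.from_tools([tool], llm=OpenAI())

# Agents talk over a shared message queue, coordinated by a control plane
message_queue = SimpleMessageQueue()
control_plane = ControlPlaneServer(
    message_queue=message_queue,
    orchestrator=AgentOrchestrator(llm=OpenAI()),
)
agent_service = AgentService(
    agent=agent,
    message_queue=message_queue,
    description="Useful for getting the secret fact.",
    service_name="secret_fact_agent",
)

# Run everything in-process for local iteration
launcher = LocalLauncher([agent_service], control_plane, message_queue)
print(launcher.launch_single("What is the secret fact?"))
```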
Automates conversion of various file types and GitHub repositories into LLM-ready Markdown documents.
A mini Python-based tool designed to convert various file types and GitHub repositories into LLM-ready Markdown documents with metadata, a table of contents, and consistent heading styles. It supports multiple file types, handles zip files, and integrates with GitHub.
The llmsherpa project provides APIs to accelerate Large Language Model (LLM) projects. It includes features like LayoutPDFReader for PDF text parsing, smart chunking for vector search and Retrieval Augmented Generation, and table analysis. It is open source under the Apache 2.0 license.
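A short sketch of the LayoutPDFReader flow, adapted from the project's README; the hosted API URL and sample PDF are placeholders you would swap for your own (a self-hosted parser endpoint works too).

```python
from llmsherpa.readers import LayoutPDFReader

# Hosted parsing endpoint from the llmsherpa README; swap in your own if self-hosting
llmsherpa_api_url = "https://readers.llmsherpa.com/api/document/developer/parseDocument?renderFormat=all"
pdf_url = "https://arxiv.org/pdf/1910.13461.pdf"  # any URL or local file path

pdf_reader = LayoutPDFReader(llmsherpa_api_url)
doc = pdf_reader.read_pdf(pdf_url)

# Layout-aware "smart chunks" suitable for embedding and RAG
for chunk in doc.chunks():
    print(chunk.to_context_text())
```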
Unblocked can not only ingest your code repositories, but also related material — your website, your product documentation, your conversations in GitHub issues and Slack — in order to provide a service that I call context assembly. I picked up that term from Jack Ozzie, back when he was working with his brother Ray on Groove, a peer-to-peer successor to Ray’s greatest hit, Lotus Notes, which pioneered what became known as knowledge management. Like Notes, Groove brought information work into shared spaces where you could search your mail, calendars, documents, and data all at once.
txtai is an open-source embeddings database for various applications such as semantic search, LLM orchestration, language model workflows, and more. It allows users to perform vector search with SQL, create embeddings for text, audio, images, and video, and run pipelines powered by language models for question-answering, transcription, translation, and more.
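For a flavor of the API, here is a small example in the spirit of the txtai documentation; the indexed strings are made up, and `content=True` is what enables SQL queries over the stored text.

```python
from txtai import Embeddings

# content=True stores the original text so SQL queries can return it
embeddings = Embeddings(content=True)
embeddings.index([
    "US tops 5 million confirmed virus cases",
    "Beijing mobilises invasion craft along coast",
    "Maine man wins $1M from $25 lottery ticket",
])

# Plain vector search
print(embeddings.search("public health story", 1))

# Vector search combined with SQL
print(embeddings.search(
    "SELECT id, text, score FROM txtai WHERE similar('public health story') LIMIT 1"
))
```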
This article explains Retrieval Augmented Generation (RAG), a method to reduce the risk of hallucinations in Large Language Models (LLMs) by limiting the context in which they generate answers. RAG is demonstrated using txtai, an open-source embeddings database for semantic search, LLM orchestration, and language model workflows.
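The core idea reduces to retrieve-then-prompt. Below is a minimal sketch with txtai, assuming a locally runnable chat model; the model name and documents are placeholders, not the article's own example.

```python
from txtai import Embeddings, LLM

# Small illustrative corpus
embeddings = Embeddings(content=True)
embeddings.index([
    "txtai is an open-source embeddings database.",
    "RAG limits an LLM to answering from retrieved context.",
])

# Placeholder model; any chat model supported by the LLM pipeline works
llm = LLM("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

question = "What does RAG do?"

# Retrieve context, then constrain the answer to it
context = "\n".join(row["text"] for row in embeddings.search(question, 3))
prompt = f"""Answer the question using only the context below.
Context: {context}
Question: {question}"""

print(llm(prompt))
```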
This post highlights how the GitHub Copilot Chat VS Code Extension was vulnerable to data exfiltration via prompt injection when analyzing untrusted source code.
Retrochat is a chat application that supports Llama.cpp, Kobold.cpp, and Ollama. The release notes highlight new features, commands for configuration, chat management, and models, and provide a download link.