High-performance deployment of the vLLM serving engine, optimized for serving large language models at scale.
This article discusses how to overcome limitations of retrieval-augmented generation (RAG) models by creating an AI assistant using advanced SQL vector queries. The author uses tools such as MyScaleDB, OpenAI, LangChain, Hugging Face and the HackerNews API to develop an application that enhances the accuracy and efficiency of data retrieval process.
A tutorial showing you how how to bring real-time data to LLMs through function calling, using OpenAI's latest LLM GTP-4o.