This article discusses the integration of large language models (LLMs) into Vespa, a full-featured search engine and vector database. It explores the benefits of using LLMs for retrieval-augmented generation (RAG), in which retrieved documents are added to the model's prompt, and demonstrates how Vespa can efficiently retrieve the most relevant data so that generated responses are enriched with up-to-date information.
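
To make the retrieve-then-generate idea concrete, here is a minimal sketch of the RAG flow in plain Python. It does not use Vespa or any LLM API: the in-memory `DOCUMENTS` list, the token-overlap scoring in `retrieve`, and the prompt wording in `build_prompt` are all hypothetical stand-ins chosen for illustration. In a real deployment the retrieval step would be a query against the search engine, and the prompt would be sent to an LLM.

```python
from collections import Counter

# Hypothetical in-memory corpus standing in for documents held in a search engine.
DOCUMENTS = [
    "Vespa is a search engine and vector database.",
    "Retrieval-augmented generation adds retrieved text to the LLM prompt.",
    "Bananas are rich in potassium.",
]


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,?!").lower() for word in text.split()]


def retrieve(query, documents, k=2):
    """Return up to k documents ranked by token overlap with the query.

    This toy scoring is an assumption for illustration, not how a search
    engine ranks results.
    """
    query_terms = Counter(tokenize(query))
    scored = []
    for doc in documents:
        overlap = sum((Counter(tokenize(doc)) & query_terms).values())
        scored.append((overlap, doc))
    # Highest overlap first; the sort is stable, so ties keep corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, context_docs):
    """Combine the retrieved documents and the question into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    question = "What is Vespa?"
    print(build_prompt(question, retrieve(question, DOCUMENTS)))
```

The point of the sketch is the division of labor: retrieval narrows the corpus to the few passages relevant to the question, and only those passages reach the model, which is how responses can reflect information the model was never trained on.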