This post shows how to scale a simple RAG pipeline from short notes to full books. Handling larger files requires one extra step in the pipeline: chunking, which splits each document into smaller pieces before they are embedded and stored.
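To make the chunking step concrete, here is a minimal sketch of fixed-size chunking with overlap. The function name `chunk_text` and the `chunk_size` and `overlap` parameters are illustrative assumptions, not taken from the post; real pipelines often split on sentences or tokens rather than raw characters.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows of at most chunk_size.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk.
    (Hypothetical helper for illustration only.)
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then be embedded and stored as its own record, so retrieval returns the relevant passage of a book rather than the whole file.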
In this notebook, we explore a typical RAG solution built on an open-source model and the Chroma DB vector database. On top of it, we add a semantic cache that stores user queries and decides whether the enriched prompt is built with information from the vector database or from the cache.
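The cache decision can be sketched as follows. This is a simplified illustration under stated assumptions, not the notebook's implementation: `SemanticCache`, `get_context`, the `embed` and `retrieve` callables, and the 0.9 similarity threshold are all hypothetical names and values. In practice `embed` would be a real embedding model and `retrieve` would query the vector database.

```python
import math
from typing import Callable, Optional

Vector = list[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical in-memory cache keyed by query embeddings."""

    def __init__(self, embed: Callable[[str], Vector], threshold: float = 0.9):
        self.embed = embed          # assumed embedding function
        self.threshold = threshold  # assumed similarity cutoff
        self.entries: list[tuple[Vector, str]] = []

    def lookup(self, query: str) -> Optional[str]:
        """Return the cached context of the most similar past query, if close enough."""
        query_vec = self.embed(query)
        best_score, best_context = 0.0, None
        for vec, context in self.entries:
            score = cosine_similarity(query_vec, vec)
            if score > best_score:
                best_score, best_context = score, context
        return best_context if best_score >= self.threshold else None

    def store(self, query: str, context: str) -> None:
        self.entries.append((self.embed(query), context))


def get_context(query: str, cache: SemanticCache,
                retrieve: Callable[[str], str]) -> tuple[str, str]:
    """Return (context, source), where source is 'cache' or 'vector_db'."""
    cached = cache.lookup(query)
    if cached is not None:
        return cached, "cache"
    context = retrieve(query)  # stand-in for a vector database query
    cache.store(query, context)
    return context, "vector_db"
```

The threshold controls the trade-off: a higher value avoids returning stale or mismatched context, while a lower value saves more vector database lookups.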