This pull request adds initial support for reranking to libllama, llama-embeddings, and llama-server using two models: BAAI/bge-reranker-v2-m3 and jinaai/jina-reranker-v1-tiny-en. The reranking is implemented as a classification head added to the model graph. Testing and benchmarking were performed with server integration.
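As a rough illustration of what a reranker's classification head produces, the sketch below turns one raw score per query-document pair into an ordered result list. The logits are made up, and both the sigmoid squashing and the `index`/`relevance_score` result shape are assumptions for illustration, not something the pull request summary confirms.

```python
import math
from typing import Optional


def sigmoid(x: float) -> float:
    """Map a raw classification logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def rank(logits: list[float], top_n: Optional[int] = None) -> list[dict]:
    """Sort documents by relevance, highest first.

    `logits[i]` is the (hypothetical) classification-head output for
    document i paired with the query.
    """
    scored = [
        {"index": i, "relevance_score": sigmoid(logit)}
        for i, logit in enumerate(logits)
    ]
    scored.sort(key=lambda r: r["relevance_score"], reverse=True)
    return scored[:top_n]  # top_n=None keeps every document


# Made-up logits for three candidate documents.
results = rank([-1.0, 2.5, 0.3], top_n=2)
```

The caller keeps the original documents and uses each `index` to look them up in the new order.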
This page provides documentation for the rerank API, including endpoints, request parameters, and response formats.
Jina Reranker is aimed at improving search relevance and RAG accuracy. Features include multilingual retrieval, code search, and a 6x speedup over the previous version.
A web search extension for Oobabooga's text-generation-webui (now with nougat) that lets the model draw on web search results.
A mapping of Vespa terminology to equivalent concepts in Elasticsearch, OpenSearch, and Solr.
SerpApi provides a web scraping API to access Google Search and other search engine results. Get structured data for SEO, market research, and more.
This blog post discusses strategies for staying up-to-date on the rapidly evolving field of AI, covering resources, tools, and techniques for tracking news, research, and developments.
A post discussing new techniques for parsing and searching PDFs, focused on turning them into a hierarchical structure for RAG search. Chunks are generated dynamically for each search, and the relevant chunks are sent to the language model together with their headers and sub-headers.
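The idea of sending headers along with chunks can be sketched as follows. This is a minimal illustration, not the post's implementation: the `Section` tree, the function names, and the keyword-overlap matching are all hypothetical stand-ins for a real PDF parser and retriever.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Section:
    """One node of the parsed document tree (hypothetical structure)."""
    title: str
    text: str = ""
    children: list["Section"] = field(default_factory=list)


def iter_chunks(
    section: Section, path: tuple[str, ...] = ()
) -> Iterator[tuple[tuple[str, ...], str]]:
    """Yield (header_path, text) for every section that has body text."""
    here = path + (section.title,)
    if section.text:
        yield here, section.text
    for child in section.children:
        yield from iter_chunks(child, here)


def build_context(root: Section, query: str) -> str:
    """Select chunks sharing a word with the query; prefix each with its headers."""
    words = set(query.lower().split())
    parts = []
    for header_path, text in iter_chunks(root):
        if words & set(text.lower().split()):
            parts.append(" > ".join(header_path) + "\n" + text)
    return "\n\n".join(parts)


doc = Section("Manual", children=[
    Section("Installation", "Run the installer and reboot."),
    Section("Usage", children=[
        Section("Search", "Type a query to search the index."),
    ]),
])
context = build_context(doc, "search query")
```

Here only the "Search" section matches, and it reaches the model prefixed with its full header path, so the chunk keeps its place in the document's hierarchy.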
LangChain's ElasticsearchRetriever enables full flexibility in defining retrieval strategies, allowing users to experiment with different approaches.
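A retrieval strategy in this style is essentially a function that maps the user's search string to an Elasticsearch query body. The sketch below shows two such functions in plain Python. The field name `text` and the function names are assumptions for illustration. To the best of my knowledge, a function like these is what `ElasticsearchRetriever` accepts as its `body_func` argument, but check the current LangChain documentation before relying on that.

```python
from typing import Any

TEXT_FIELD = "text"  # hypothetical name of the indexed text field


def bm25_query(search_query: str) -> dict[str, Any]:
    """Plain full-text match, scored by Elasticsearch's default BM25."""
    return {"query": {"match": {TEXT_FIELD: search_query}}}


def fuzzy_query(search_query: str) -> dict[str, Any]:
    """Same match query, but tolerant of small typos in the search string."""
    return {
        "query": {
            "match": {
                TEXT_FIELD: {"query": search_query, "fuzziness": "AUTO"},
            }
        }
    }


# Swapping strategies means swapping which function builds the request body.
body = fuzzy_query("rerankng models")
```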
This article provides a step-by-step guide on building a generative search engine for local files using Qdrant, NVIDIA NIM API, or Llama 3. It includes system design, indexing local files, and creating a user interface.