This pull request adds initial support for reranking to libllama, llama-embeddings, and llama-server using two models: BAAI/bge-reranker-v2-m3 and jinaai/jina-reranker-v1-tiny-en. The reranking is implemented as a classification head added to the model graph. Testing and benchmarking were performed with server integration.
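As a rough illustration of what a classification head for reranking does, the sketch below scores each query-document pair from a pooled vector and sorts documents by that score. It is not the code from the pull request: the function names, the sigmoid, and the toy weights are assumptions made for illustration, and the real head runs inside the model graph.

```python
import math

def rerank_score(pooled, weights, bias):
    """Hypothetical head: linear layer plus sigmoid over one pooled vector."""
    logit = sum(p * w for p, w in zip(pooled, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

def rerank(pooled_vectors, weights, bias):
    """Return document indices ordered from most to least relevant."""
    scored = [(rerank_score(v, weights, bias), i)
              for i, v in enumerate(pooled_vectors)]
    return [i for _, i in sorted(scored, reverse=True)]

# Toy pooled vectors, one per (query, document) pair; values are made up.
pooled_vectors = [[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0]]
order = rerank(pooled_vectors, weights=[1.0, 1.0], bias=0.0)
print(order)
```

In a cross-encoder reranker the query and document are encoded together, so each pooled vector already reflects the pair rather than the document alone.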
This page provides documentation for the rerank API, including endpoints, request parameters, and response formats.
Jina Reranker is aimed at improving search relevance and RAG accuracy. Listed features include multilingual retrieval, code search, and a claimed 6x speedup over the previous version.
This blog post demonstrates how to create a reusable retrieval evaluation dataset by using an LLM to judge query-document pairs. The process includes building a small labeled dataset, aligning the LLM's judgments with human preferences, and then using the LLM to judge a large set of queries and documents.
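The alignment step can be sketched as an agreement check between human labels and judge labels on the small labeled set. The example below is an assumption-laden sketch, not the blog post's code: `toy_judge` is a hypothetical stand-in for an LLM call, the sample pairs are invented, and Cohen's kappa is one possible agreement measure chosen here for illustration.

```python
from collections import Counter

def cohen_kappa(human, judge):
    """Cohen's kappa between two equal-length lists of labels."""
    if len(human) != len(judge) or not human:
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(human)
    observed = sum(h == j for h, j in zip(human, judge)) / n
    h_counts, j_counts = Counter(human), Counter(judge)
    labels = set(human) | set(judge)
    # Agreement expected by chance, from each rater's label frequencies.
    expected = sum(h_counts[c] * j_counts[c] for c in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label
    return (observed - expected) / (1.0 - expected)

def toy_judge(query, document):
    """Hypothetical stand-in for an LLM judge: 1 if any word is shared."""
    shared = set(query.lower().split()) & set(document.lower().split())
    return int(bool(shared))

# Invented (query, document, human_label) triples for illustration only.
labeled = [
    ("reranker speed", "The reranker speed improved", 1),
    ("reranker speed", "Cooking pasta at home", 0),
    ("search leak", "A leak of search documentation", 1),
    ("search leak", "Benchmarking embedding models", 0),
]
human = [label for _, _, label in labeled]
judge = [toy_judge(q, d) for q, d, _ in labeled]
kappa = cohen_kappa(human, judge)
print(f"agreement (kappa): {kappa:.2f}")
```

Once agreement on the labeled set is acceptable, the same judge function would be applied to the larger unlabeled set of query-document pairs.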
"On May 5th, I received an email from an anonymous source claiming to have access to a massive leak of API documentation from inside Google’s Search division. The documents were confirmed as authentic by ex-Google employees and contained extraordinary claims about Google’s search operations."