Tags: document* + github*

  1. Rensa is a high-performance MinHash suite written in Rust with Python bindings, designed for efficient similarity estimation and deduplication of large datasets. It offers R-MinHash, C-MinHash, and OptDensMinHash variants, which are significantly faster than datasketch while maintaining comparable accuracy.
  2. Docling simplifies document processing, parsing diverse formats — including advanced PDF understanding — and providing seamless integrations with the gen AI ecosystem.
    2025-05-25 by klotz
  3. Docling is a tool that parses documents and exports them to desired formats like Markdown and JSON. It supports various document formats and provides advanced PDF understanding, metadata extraction, and integration with LlamaIndex and LangChain for RAG / QA applications.
    2024-11-01 by klotz
  4. An open-source project offering a functional RAG UI for document QA, suitable for both end-users and developers. It supports various LLM providers, is customizable, and offers multi-modal QA, citations, and complex reasoning methods.
    2024-10-13 by klotz
  5. 2018-08-12 by klotz

SemanticScuttle - klotz.me: tagged with "document+github"