Tags: python* + document* + github* + deduplication*


  1. Rensa is a high-performance MinHash suite written in Rust with Python bindings. It is designed for efficient similarity estimation and deduplication of large datasets. It offers R-MinHash, C-MinHash, and OptDensMinHash variants, and it is significantly faster than datasketch while maintaining comparable accuracy.
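
To illustrate the idea behind MinHash-based similarity estimation mentioned in the entry above, here is a minimal pure-Python sketch. It does not use Rensa or datasketch and does not reflect their APIs or any of the R-MinHash, C-MinHash, or OptDensMinHash variants; the function names `minhash_signature` and `estimate_jaccard`, the `num_perm` parameter, and the choice of salted BLAKE2b as the hash family are all assumptions made for illustration.

```python
import hashlib


def minhash_signature(tokens, num_perm=64):
    """Classic MinHash: for each of num_perm salted hashes, keep the minimum
    hash value seen over the distinct tokens. (Illustrative, not Rensa's API.)"""
    unique = set(tokens)
    if not unique:
        raise ValueError("tokens must not be empty")
    signature = []
    for seed in range(num_perm):
        # Each seed acts as a different hash function via the BLAKE2b salt.
        salt = seed.to_bytes(8, "little")
        signature.append(
            min(
                int.from_bytes(
                    hashlib.blake2b(
                        tok.encode("utf-8"), digest_size=8, salt=salt
                    ).digest(),
                    "little",
                )
                for tok in unique
            )
        )
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


doc_a = "the quick brown fox jumps over the lazy dog".split()
doc_b = "the quick brown fox leaps over the lazy cat".split()

sig_a = minhash_signature(doc_a)
sig_b = minhash_signature(doc_b)

# The exact Jaccard similarity of the two token sets is 6 / 10 = 0.6;
# the MinHash estimate approximates it, with error shrinking as num_perm grows.
print(estimate_jaccard(sig_a, sig_b))
```

For deduplication, documents whose estimated similarity exceeds a chosen threshold would be treated as near-duplicates. A library such as Rensa performs the same kind of estimation with far cheaper hashing than this sketch.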


