klotz: pdf‑parser*


  1. LiteParse is a lightweight, open-source PDF parsing tool for fast spatial text extraction with bounding boxes. Built on PDF.js and Tesseract.js, it runs entirely locally with no cloud dependencies and supports PDF, Office, and image formats via automatic conversion. Users can parse documents through a CLI or as a library, generate high-resolution screenshots, and plug in custom OCR servers through a simple API. Aimed at production pipelines, LiteParse produces JSON or plain-text output and runs on Linux, macOS, and Windows.
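
To illustrate what downstream code might do with spatial output of this kind, here is a minimal sketch that groups text items into reading-order lines using their bounding boxes. The JSON schema (a flat array of objects with `text`, `x`, `y`, `width`, `height`), the top-left coordinate origin, and the names `TextItem`, `parse_items`, and `group_into_lines` are all assumptions made for illustration. They are not taken from LiteParse's documentation, and its actual output format may differ.

```python
import json
from dataclasses import dataclass


@dataclass
class TextItem:
    # Hypothetical shape of one extracted text fragment with its bounding box.
    text: str
    x: float       # left edge
    y: float       # top edge (assumes origin at top-left, y grows downward)
    width: float
    height: float


def parse_items(raw_json: str) -> list[TextItem]:
    """Load text items from a JSON array using the assumed schema."""
    return [TextItem(**obj) for obj in json.loads(raw_json)]


def group_into_lines(items: list[TextItem], y_tolerance: float = 3.0) -> list[str]:
    """Group items whose top edges lie within y_tolerance of the first item
    on the current line, then join each line left to right."""
    lines: list[list[TextItem]] = []
    for item in sorted(items, key=lambda it: (it.y, it.x)):
        if lines and abs(lines[-1][0].y - item.y) <= y_tolerance:
            lines[-1].append(item)
        else:
            lines.append([item])
    return [
        " ".join(it.text for it in sorted(line, key=lambda it: it.x))
        for line in lines
    ]


# Made-up sample data in the assumed schema, deliberately out of order.
sample = json.dumps([
    {"text": "world", "x": 60, "y": 10.5, "width": 40, "height": 12},
    {"text": "Second", "x": 10, "y": 30, "width": 50, "height": 12},
    {"text": "Hello", "x": 10, "y": 10, "width": 40, "height": 12},
    {"text": "line", "x": 70, "y": 30.2, "width": 30, "height": 12},
])

lines = group_into_lines(parse_items(sample))
```

The tolerance-based grouping is a simple heuristic. It works for single-column text, while multi-column layouts or rotated text would need a more careful approach.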
