Docling parses documents and exports them to formats such as Markdown and JSON. It supports input formats including PDF, DOCX, PPTX, images, HTML, AsciiDoc, and Markdown.
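As a rough illustration of that parse-then-export flow, here is a minimal Python sketch. It assumes Docling is installed (`pip install docling`) and uses its `DocumentConverter` class; the helper names `markdown_path_for` and `convert_to_markdown` are hypothetical and only exist for this example.

```python
from pathlib import Path


def markdown_path_for(source: str) -> Path:
    """Return the Markdown path to write next to `source` (hypothetical helper)."""
    return Path(source).with_suffix(".md")


def convert_to_markdown(source: str) -> Path:
    """Parse `source` with Docling and save the result as Markdown."""
    # Imported lazily so the path helper above works without Docling installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)

    target = markdown_path_for(source)
    target.write_text(result.document.export_to_markdown(), encoding="utf-8")
    return target
```

Calling `convert_to_markdown("report.pdf")` would write `report.md` beside the input, assuming the file exists and Docling can parse it.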
To install the French language data for Tesseract OCR on Debian or Ubuntu:
apt install tesseract-ocr-fra
pdfocr adds an OCR text layer to scanned PDF files so that they can be searched. It requires Ruby 1.8.7 or later and uses ocropus, cuneiform, or tesseract to perform the OCR.
To use, run:
pdfocr -i input.pdf -o output.pdf
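To process a whole folder of scans, the same command can be driven from a small script. The sketch below assumes `pdfocr` is on the PATH and uses only the `-i` and `-o` options shown above; the function names and the directory layout are hypothetical.

```python
import subprocess
from pathlib import Path


def pdfocr_command(input_pdf: Path, output_pdf: Path) -> list[str]:
    """Build the pdfocr invocation for one file, using only -i and -o."""
    return ["pdfocr", "-i", str(input_pdf), "-o", str(output_pdf)]


def ocr_directory(scans: Path, out_dir: Path) -> list[Path]:
    """Run pdfocr on every PDF in `scans`, writing results into `out_dir`."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf in sorted(scans.glob("*.pdf")):
        target = out_dir / pdf.name
        # check=True raises CalledProcessError if pdfocr exits with an error.
        subprocess.run(pdfocr_command(pdf, target), check=True)
        written.append(target)
    return written
```

For example, `ocr_directory(Path("scans"), Path("searchable"))` would leave the originals untouched and write the OCR'd copies under `searchable/`.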