0 bookmark(s) - Sort by: Date ↓ / Title /
This article details a method for converting PDFs to Markdown using a local LLM (Gemma 3 via Ollama), focusing on privacy and efficiency. It involves rendering PDF pages as images and then using the LLM for content extraction, even from scanned PDFs.
Microsoft has open-sourced MarkItDown, a state-of-the-art application designed to convert various file types into Markdown format for seamless integration, collaboration, and accessibility. The tool supports multiple file formats, including PDFs, PowerPoint presentations, Word documents, Excel spreadsheets, images, audio, HTML, text-based formats, and ZIP files, making it a versatile utility for users across different domains.
Docling is a tool that parses documents and exports them to desired formats like Markdown and JSON. It supports various document formats including PDF, DOCX, PPTX, Images, HTML, AsciiDoc, and Markdown.
First / Previous / Next / Last
/ Page 1 of 0