- wkhtmltopdf is a set of open source command line tools for converting HTML pages into PDFs or images.
- It uses the Qt WebKit rendering engine and runs headlessly, without requiring a display.
- A C library is available too.
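
Because the tools are driven from the command line, they can be scripted from other languages. The following is a minimal sketch in Python, assuming the `wkhtmltopdf` binary is installed and on `PATH`. The helper names `build_command` and `html_to_pdf` are hypothetical and chosen only for illustration; `--quiet` and `--page-size` are options of the command line tool.

```python
import shutil
import subprocess


def build_command(source, output, page_size="A4"):
    """Return the argument list for one wkhtmltopdf invocation (hypothetical helper)."""
    # Options come first, then the input page (file or URL), then the output file.
    return ["wkhtmltopdf", "--quiet", "--page-size", page_size, source, output]


def html_to_pdf(source, output, page_size="A4"):
    """Convert an HTML file or URL to a PDF by calling the wkhtmltopdf binary."""
    # Assumption: the binary is installed separately; fail early if it is missing.
    if shutil.which("wkhtmltopdf") is None:
        raise FileNotFoundError("wkhtmltopdf was not found on PATH")
    # check=True raises CalledProcessError if the tool exits with a non-zero status.
    subprocess.run(build_command(source, output, page_size), check=True)
    return output
```

Calling `html_to_pdf("page.html", "page.pdf")` would then write the PDF next to the script. Passing the arguments as a list rather than a single shell string avoids quoting problems with URLs and file names.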
PDFwhisper lets you have a conversation with your PDF documents, making it easier to find information in your PDF files.
We introduce LayoutLM, a well-known model developed by Microsoft for extracting information from documents. To tailor a solution to our specific needs, we label our documents using Label Studio, an open-source labeling tool, connected to our remote storage on AWS S3.