SemanticScuttle - klotz.me » klotz: vision language models

klotz: vision language models

IBM Granite-Docling: End-to-end document understanding

IBM is releasing Granite-Docling-258M, an ultra-compact, cutting-edge open-source vision-language model (VLM) that converts documents to machine-readable formats while preserving layout, tables, equations, and more. It is designed for accurate, efficient document conversion and goes beyond simple text extraction.

2025-10-14 Tags: vision language models, docling, ibm llm document, conversion, granite-docling, ocr, rag, foss by klotz

Using Vision Language Models to Process Millions of Documents

This article discusses how to apply vision language models (VLMs) to document understanding. It covers application areas such as agentic use cases, question answering, classification, and information extraction, as well as limitations such as cost and the difficulty of processing long documents.

2025-09-27 Tags: vision language models, vlm, document understanding, question answering, classification, information extraction by klotz

SGLang - Home

SGLang is a fast serving framework for large language models and vision language models. It focuses on efficient serving and controllable interaction through a co-designed backend runtime and frontend language.

2025-04-30 Tags: llm, vision language models, inference engineering, quantization, sglang by klotz

Scaling ColPali to billions of PDFs with Vespa

This blog post explores scaling ColPali for efficient document retrieval across large collections of PDFs using Vespa's phased retrieval and ranking pipeline, including the use of a Hamming-based MaxSim similarity function.

2024-09-23 Tags: colpali, document retrieval, vespa, maxsim, hamming distance, vlm, binary quantization, pdf, vision language models, llm by klotz
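
As a rough illustration of the Hamming-based MaxSim idea mentioned in the ColPali entry, here is a minimal sketch in plain Python. It is not Vespa's implementation: the function names, the bit-packed integer representation, and the choice of "number of matching bits" as the per-pair similarity are all assumptions made for this example. For each query vector it takes the best match among a page's vectors and sums those maxima.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two bit-packed vectors differ."""
    return bin(a ^ b).count("1")


def hamming_maxsim(query_vecs, doc_vecs, num_bits):
    """MaxSim over binary vectors (illustrative, hypothetical helper).

    Assumption: similarity of a pair is the number of matching bits,
    i.e. num_bits minus the Hamming distance. For every query vector,
    keep the best similarity against any document vector, then sum.
    """
    score = 0
    for q in query_vecs:
        score += max(num_bits - hamming_distance(q, d) for d in doc_vecs)
    return score


# Toy 8-bit "embeddings"; real binarized ColPali vectors are much longer.
query = [0b10110010, 0b01001101]
page_a = [0b10110011, 0b11110000, 0b01001101]
page_b = [0b00000000, 0b11111111]

# Rank pages by descending score.
ranked = sorted(
    {"page_a": page_a, "page_b": page_b}.items(),
    key=lambda item: hamming_maxsim(query, item[1], 8),
    reverse=True,
)
print([name for name, _ in ranked])
```

Because the score uses only XOR and bit counts, it is cheap to evaluate over many candidates, which is the property that makes it attractive as an early ranking phase before a more precise re-ranking step.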
