This article discusses how to apply vision language models (VLMs) to document understanding, covering application areas like agentic use cases, question answering, classification, and information extraction, as well as limitations like cost and processing long documents.