Tags: vision-language model* + human*


  1. This study introduces a domain-specific large Vision-Language Model, the Human-Scene Vision-Language Model (HumanVLM), designed as a foundation for human-scene vision-language tasks. The authors create a large-scale human-scene multimodal image-text dataset (HumanCaption-10M), develop a captioning approach for human-centered images, and use both to train HumanVLM.


