SmolVLM2 is a family of efficient video understanding models designed to run on devices ranging from phones to servers. The release includes three model sizes (2.2B, 500M, and 256M parameters) with Python and Swift API support. The models provide video understanding with reduced memory consumption and come with a suite of demo applications for practical use.
The LLM 0.17 release adds multi-modal input, letting users send images, audio, and video files to large language models such as GPT-4o, Llama, and Gemini. It includes a Python API, and running such prompts is described as cost-effective.