klotz: vision*

  1. This study reveals a role for the superior colliculus in higher-order cognition, independent of its role in spatial orienting. Researchers found that the superior colliculus exhibits robust encoding of learned visual categories and its inactivation markedly impaired category decisions in rhesus macaques.
  2. AI-On-The-Edge-Cam is an ESP32-S3 board with a 2MP camera, microSD card slot, WiFi, BLE, and PoE Ethernet connectivity, designed to digitize legacy utility meters (water, gas, or electricity) that otherwise require manual onsite reading. It supports TensorFlow Lite, has a web interface, and integrates with Home Assistant.
  3. Mistral Small 3.2 is a minor update to the Mistral Small 3.1 model that improves instruction following and function calling and reduces repetition errors. The article details the author's experience running the model locally using Ollama and GGUF quantizations, including generating an SVG image and having the model describe it.
    2025-06-21, by klotz
  4. A summary of a workshop presented at PyCon US on building software with LLMs, covering setup, prompting, building tools (text-to-SQL, structured data extraction, semantic search/RAG), tool usage, and security considerations like prompt injection. It also discusses the current LLM landscape, including models from OpenAI, Gemini, Anthropic, and open-weight alternatives.
  5. This article details a new plugin, llm-video-frames, that allows users to feed video files into long context vision LLMs (like GPT-4.1) by converting them into a sequence of JPEG frames. It showcases how to install and use the plugin, provides examples with the Cleo video, and discusses the cost and technical details of the process. It also covers the development of the plugin using an LLM and highlights other features in LLM 0.25.
    2025-05-06, by klotz
  6. A review of the Qwen2.5-VL-32B vision-language model, noting its performance, capabilities, and how it runs on a 64GB Mac. Includes a demonstration with a map image and performance statistics.
    2025-03-26, by klotz
  7. SenseCraft AI is a free, web-based platform designed for beginners. It takes a no-code, application-oriented approach to simplify and accelerate the creation of AI applications.
  8. This article provides a step-by-step guide on how to configure model output using MQTT for the XIAO ESP32S3 Sense board on the SenseCraft AI platform.
  9. Learn how to run Llama 3.2-Vision in a chat-like mode and explore its multimodal skills in a Colab notebook.
  10. Google DeepMind introduced PaliGemma 2, a new family of Vision-Language Models with parameter sizes ranging from 3 billion to 28 billion, designed to address challenges in generalizing across different tasks and adapting to various input data types, including diverse image resolutions.
