Report on investigations of a computing/video setting for unstructured collaborative work among people separated by space and time.
Consider Phlebas, a new show about a war between a civilization ruled by AI and one dominated by religion, is in development at Prime Video.
SmolVLM2 represents a shift in video understanding technology by introducing efficient models that can run on various devices, from phones to servers. The release includes models of three sizes (2.2B, 500M, and 256M) with Python and Swift API support. These models offer video understanding capabilities with reduced memory consumption, supported by a suite of demo applications for practical use.
Qwen2.5-VL-3B-Instruct is the latest addition to the Qwen family of vision-language models from Alibaba Cloud's Qwen team, hosted on Hugging Face, featuring enhanced capabilities in understanding visual content and generating structured outputs. It is designed to interact directly with tools and to operate computer and phone functions as a visual agent. Qwen2.5-VL can comprehend videos over an hour long and localize objects within images using bounding boxes or points. It is available in three sizes: 3, 7, and 72 billion parameters.
The LLM 0.17 release adds multi-modal input, allowing users to send images, audio, and video files to large language models such as GPT-4o, Llama, and Gemini, with a Python API and cost-effective pricing.
The author records a screen capture of their Gmail account and uses Google Gemini to extract numeric values from the video.
A tool to transcribe and summarize videos from multiple sources using AI models in Google Colab or locally.
This video features an interview with Professor Hal Abelson, a pioneer in computer science education. He reflects on his career, starting with his early work on the Logo programming language and its use in education. He emphasizes the importance of computer education for everyone, particularly for children, who can use technology to make a real-world impact.
Abelson also discusses the risks associated with artificial intelligence and MIT's decision to make educational materials freely available online, which led to MIT OpenCourseWare. He believes computer scientists should not only focus on technical advancements but also consider the ethical implications of their work, asking, "What, in fact, is worth making?" The video also highlights resources like Logo, Scratch, and MIT App Inventor, encouraging viewers to explore these tools.
Game designer Will Wright and musician Brian Eno discuss the generative systems used in their respective creative works. This clip features original music by Brian Eno.