SemanticScuttle - klotz.me » Tags: snowflake

Snowflake manager explains the 'Spider-Man' theory of AI agent data access

Snowflake is focusing on data interoperability and governance to overcome the bottlenecks hindering AI agent development. By leveraging open standards like the Apache Iceberg table format, the company aims to provide a unified layer that ensures data is clean, accessible, and secure for various AI engines. This approach allows for a "multi-reader, multi-writer" environment where different compute engines can access the same data stored in cloud object storage without compromising governance.
Key points:
* Emphasis on data quality and accessibility as the primary bottleneck for AI agents.
* Use of Apache Iceberg and Iceberg REST to enable interoperable data stacks.
* The Spider-Man analogy regarding the responsibility that comes with direct data access.
* Support for multi-engine access, including third-party tools like Apache Spark.
* Roadmap includes Iceberg v3 support and Snowflake-managed storage for Iceberg tables.

2026-04-11 Tags: snowflake, llm, agents, apache iceberg, data governance, interoperability, data analytics by klotz

Snowflake Releases Arctic Embed L 2.0 and Arctic Embed M 2.0: A Set of Extremely Strong Yet Small Embedding Models for English and Multilingual Retrieval

Snowflake recently announced Arctic Embed L 2.0 and Arctic Embed M 2.0, two small but powerful embedding models tailored for multilingual search and retrieval. The medium model has 305 million parameters and the large model has 568 million; both support context lengths of up to 8,192 tokens. They deliver high-quality retrieval across multiple languages and score well on benchmarks such as MTEB and CLEF.

2024-12-09 Tags: snowflake, arctic embed, text, embedding, llm, multilingual, retrieval by klotz

How to Generate Unique IDs in Distributed Systems

This article explores 7 popular approaches to generating unique IDs in distributed systems, including UUIDs, database auto-increment, Snowflake IDs, Redis-based generation, NanoID, hash-based IDs, and ULIDs.
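
Of the approaches listed, Snowflake IDs are the one this tag page is named for, so a small illustration may help. The sketch below is not taken from the article: it assumes the widely cited 64-bit layout (41 bits of millisecond timestamp, 10 bits of worker ID, 12 bits of per-millisecond sequence) and the Twitter epoch value. The names `SnowflakeGenerator`, `decode`, and the injectable `clock` parameter are hypothetical, chosen only for this example.

```python
import threading
import time

# Assumed layout (commonly attributed to Twitter's Snowflake, not confirmed by the article):
# 41 bits timestamp | 10 bits worker ID | 12 bits sequence
EPOCH_MS = 1288834974657          # assumed custom epoch in milliseconds
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    """Hypothetical generator of time-ordered 64-bit IDs for one worker."""

    def __init__(self, worker_id, clock=None):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        # The clock returns milliseconds since the Unix epoch; injectable for testing.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Refuse to issue IDs that could collide with earlier ones.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )


def decode(snowflake_id):
    """Split an ID back into (timestamp_ms, worker_id, sequence)."""
    sequence = snowflake_id & MAX_SEQUENCE
    worker_id = (snowflake_id >> SEQUENCE_BITS) & MAX_WORKER_ID
    timestamp_ms = (snowflake_id >> (WORKER_BITS + SEQUENCE_BITS)) + EPOCH_MS
    return timestamp_ms, worker_id, sequence
```

Because the timestamp occupies the high bits, IDs from a single worker sort by creation time, and the worker ID field keeps different workers from colliding without any coordination at generation time. A real deployment would still need a way to assign unique worker IDs and a policy for clock regressions.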

2024-11-15 Tags: unique id, uuid, snowflake, distributed systems by klotz

Building a Panel Dashboard with Snowpark for Python | by Sophia Yang | Apr, 2022 | Towards Data Science

2022-04-25 Tags: snowflake, snowpark, python, data science by klotz

Snowpark Python

2022-04-01 Tags: snowflake, python by klotz

The modern data pattern. Replayable data processing and ingestion… | by Luca Bigon | Jan, 2022 | Towards Data Science

2022-01-31 Tags: lambda architecture, repeatability, snowflake, dbt, pulumi, data engineering, machine learning, data warehouse by klotz

Modern Data Stack: Which Place for Spark? | by Furcy Pin | Jan, 2022 | Towards Data Science

2022-01-31 Tags: spark, bigquery, snowflake, data engineering by klotz

Ticket Servers: Distributed Unique Primary Keys on the Cheap | code.flickr.com

2019-05-22 Tags: flickr, snowflake, unique id, distributed systems by klotz

Announcing Snowflake