klotz: data engineering*


  1. The article discusses the rise of Apache Iceberg as the dominant open table format, backed by major endorsements, and outlines key developments expected for 2025 such as Role-Based Access Control (RBAC) catalogs, Change Data Capture (CDC) capabilities, and materialized views.
  2. This article explains how to quickly detect data quality issues and identify their causes using Python for ETL pipelines. It discusses strategies to minimize the time required to fix data quality problems.
  3. How to ensure data quality and integrity using open-source tools for observability in data pipelines.
  4. Data pipelines are essential for connecting data across systems and platforms. This article provides a deep dive into how data pipelines are implemented, their use cases, and how they're evolving with generative AI.
  5. A guide to tracking in MLOps, covering code, data, and machine learning model tracking.
  6. Airbyte is an open-source data integration engine that helps you consolidate your data in your data warehouses, lakes and databases.
  7. This article provides Python tricks and techniques for data ingestion, validation, processing, and testing in data engineering projects. It offers practical solutions for streamlining the code, including tips for data validation, handling errors, and testing.
    2024-06-13, by klotz
  8. An exploration of the benefits of switching from the popular Python library Pandas to the newer Polars for data manipulation tasks, highlighting improvements in performance, concurrency, and ease of use.
  9. An in-process analytics database, DuckDB can work with surprisingly large data sets without having to maintain a distributed multiserver system. Best of all? You can analyze data directly from your Python app.
  10. An article discussing a simple and free way to automate data workflows using Python and GitHub Actions, written by Shaw Talebi.
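
Several of the bookmarks above (items 2 and 7 in particular) deal with validating data inside Python ETL pipelines. The sketch below is one minimal way that idea can look. It is not taken from any of the linked articles. The `Rule` class, the `validate_rows` helper, and the `id` and `amount` fields are all hypothetical names chosen for illustration, and it uses only the standard library.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Rule:
    """A named check applied to one row (hypothetical helper, not a library API)."""
    name: str
    check: Callable[[dict], bool]


def validate_rows(rows: Iterable[dict], rules: list[Rule]):
    """Split rows into (valid, rejected).

    Each rejected entry is a (row, failed_rule_names) pair, so the cause of a
    data quality problem is recorded next to the offending row.
    """
    valid, rejected = [], []
    for row in rows:
        failed = [rule.name for rule in rules if not rule.check(row)]
        if failed:
            rejected.append((row, failed))
        else:
            valid.append(row)
    return valid, rejected


# Illustrative rules for an assumed schema with "id" and "amount" fields.
RULES = [
    Rule("id_present", lambda r: r.get("id") is not None),
    Rule(
        "amount_non_negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]

rows = [
    {"id": 1, "amount": 10.5},
    {"id": None, "amount": 3},
    {"id": 3, "amount": -2},
]

valid, rejected = validate_rows(rows, RULES)
for row, failed in rejected:
    print(f"rejected {row}: failed {failed}")
```

Keeping the failed rule names with each rejected row is a design choice aimed at the "identify their causes" part of item 2: a pipeline could route `rejected` to a quarantine table or a log instead of failing the whole run.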
