This article explains how to quickly detect data quality issues and identify their causes using Python for ETL pipelines. It discusses strategies to minimize the time required to fix data quality problems.
This article provides Python tricks and techniques for data ingestion, validation, processing, and testing in data engineering projects. It offers practical ways to streamline pipeline code, with tips on validating data, handling errors, and writing tests.
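As a rough illustration of the validation and error-handling ideas this kind of article covers, here is a minimal sketch using only the standard library. It is not taken from the article: the `validate_rows` helper, the `ValidationReport` class, and the `amount` field are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationReport:
    """Collects rows that passed validation and a list of (row_index, message) errors."""
    valid_rows: list = field(default_factory=list)
    errors: list = field(default_factory=list)


def validate_rows(rows, required_fields):
    """Check each row for required fields and a numeric 'amount' (hypothetical field)."""
    report = ValidationReport()
    for index, row in enumerate(rows):
        # Treat None and empty strings as missing values.
        missing = [name for name in required_fields if row.get(name) in (None, "")]
        if missing:
            report.errors.append((index, f"missing fields: {', '.join(missing)}"))
            continue
        try:
            amount = float(row["amount"])
        except (TypeError, ValueError):
            report.errors.append((index, f"amount is not numeric: {row['amount']!r}"))
            continue
        # Keep the row, with 'amount' converted to a float.
        report.valid_rows.append({**row, "amount": amount})
    return report


rows = [
    {"id": "1", "amount": "19.99"},
    {"id": "2", "amount": "abc"},
    {"id": "", "amount": "5"},
]
report = validate_rows(rows, required_fields=("id", "amount"))
for index, message in report.errors:
    print(f"row {index}: {message}")
```

Recording the failing row index with each message, rather than raising on the first bad row, is one way to shorten the time spent tracing a data quality problem back to its source.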
An exploration of the benefits of switching from the popular Python library Pandas to the newer Polars for data manipulation tasks, highlighting improvements in performance, concurrency, and ease of use.
DuckDB is an in-process analytics database that can work with surprisingly large data sets without requiring you to maintain a distributed, multiserver system. Best of all? You can analyze data directly from your Python app.
An article discussing a simple and free way to automate data workflows using Python and GitHub Actions, written by Shaw Talebi.
Intro to Streamlit
- Simple and complex Streamlit examples
- Data and state management in Streamlit apps
- Data widgets for Streamlit apps
- Deploying Streamlit apps