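Pandas 3.0 does not replace NumPy, which remains the default backend. It introduces a dedicated string data type that is backed by PyArrow when PyArrow is installed, making string operations faster and more memory-efficient. PyArrow-backed columns and PyArrow-based readers for columnar data remain opt-in.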
A deep dive into the structure and performance benefits of Parquet files, including columnar storage, partitioning strategies, and row groups.
PyStore is a simple (yet powerful) datastore for Pandas dataframes, designed with time-series data in mind. It leverages Pandas, NumPy, Dask, and Parquet (via PyArrow) for efficient data handling.