Tags: parquet*


  1. PyStore is a simple yet powerful datastore for pandas DataFrames, designed for storing time-series data. It builds on pandas, NumPy, Dask, and Parquet (via pyarrow) for efficient data handling.
  2. usersDF.write.format("orc")
    .option("orc.bloom.filter.columns", "favorite_color")
    .option("orc.dictionary.key.threshold", "1.0")
    .option("orc.column.encoding.direct", "name")
    .save("users_with_options.orc")
    Find full example code at "examples/src/main/scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala" in the Spark repo.
    2021-12-01, by klotz
  3. // set Parquet file block size and page size values
    int blockSize = 256 * 1024 * 1024; // 256 MiB per block (row group)
    int pageSize = 64 * 1024;          // 64 KiB per page
    2017-02-15, by klotz
  4. 2017-02-13, by klotz
  5. 2014-02-19, by klotz
  6. 2014-02-19, by klotz
  7. 2013-09-26, by klotz
  8. 2013-08-27, by klotz
  9. 2013-07-01, by klotz
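
As a rough illustration of the block and page sizes in item 3, the sketch below works out how many data pages would fit in one block. It is only an estimate under stated assumptions: the helper name `pages_per_block` is hypothetical, a single column is assumed, and real Parquet writers treat both sizes as targets rather than exact limits.

```python
# Sizes copied from item 3 of the list above.
BLOCK_SIZE = 256 * 1024 * 1024  # bytes; a Parquet "block" is a row group
PAGE_SIZE = 64 * 1024           # bytes; target size of one data page


def pages_per_block(block_size: int, page_size: int) -> int:
    """Estimate how many pages fit in one block (hypothetical helper).

    Assumes a single column and pages filled exactly to page_size,
    which real writers only approximate.
    """
    if block_size <= 0 or page_size <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: a partially filled last page still counts as a page.
    return -(-block_size // page_size)


if __name__ == "__main__":
    print(f"block size: {BLOCK_SIZE} bytes ({BLOCK_SIZE // 1024**2} MiB)")
    print(f"page size:  {PAGE_SIZE} bytes ({PAGE_SIZE // 1024} KiB)")
    print(f"pages per block (estimate): {pages_per_block(BLOCK_SIZE, PAGE_SIZE)}")
```

With several columns, each column gets its own chunk inside the row group, so the per-column page count would be correspondingly smaller than this single-column estimate.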
