klotz: hadoop*

The open-source, distributed, parallel computation framework developed by Doug Cutting and Mike Cafarella, based on the functional programming operations Map and Reduce as described in the Google MapReduce paper.
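
As a rough illustration of the Map and Reduce operations mentioned above, here is a minimal single-machine sketch in plain Scala. It does not use the Hadoop API; the names `WordCountSketch`, `mapPhase`, `shuffle`, and `reducePhase` are hypothetical and only mirror the phases a MapReduce job goes through.

```scala
object WordCountSketch {
  // Map phase: emit a (word, 1) pair for every word in one input line.
  def mapPhase(line: String): Seq[(String, Int)] =
    line.toLowerCase.split("\\W+").filter(_.nonEmpty).map(w => (w, 1)).toSeq

  // Shuffle: group the emitted values by key, as a framework would do between phases.
  def shuffle(pairs: Seq[(String, Int)]): Map[String, Seq[Int]] =
    pairs.groupBy(_._1).map { case (word, kvs) => (word, kvs.map(_._2)) }

  // Reduce phase: fold all values for one key into a single count.
  def reducePhase(word: String, counts: Seq[Int]): (String, Int) =
    (word, counts.sum)

  // Full pipeline: map every line, shuffle by word, reduce each group.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    shuffle(lines.flatMap(mapPhase)).map { case (w, cs) => reducePhase(w, cs) }

  def main(args: Array[String]): Unit = {
    val counts = wordCount(Seq("the quick brown fox", "the lazy dog"))
    counts.toSeq.sortBy(_._1).foreach { case (w, n) => println(s"$w\t$n") }
  }
}
```

In a real cluster the map and reduce functions run on different machines and the shuffle moves data over the network; here all three steps are ordinary collection operations in one process.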


  1. Apache Iceberg is emerging as a cornerstone for data lakes and lakehouses in the modern data stack, drawing parallels to the rise of Hadoop a decade ago. This article explores these similarities, highlighting both the opportunities and challenges that Iceberg presents for data engineering.

  2. 2022-05-16 by klotz
  3. The reload4j project offers a clear and easy migration path for the thousands of users who have an urgent need to fix vulnerabilities in log4j 1.2.17.

    2022-01-25 by klotz
  4. usersDF.write.format("orc")
       .option("orc.bloom.filter.columns", "favorite_color")
       .option("orc.dictionary.key.threshold", "1.0")
       .option("orc.column.encoding.direct", "name")
       .save("users_with_options.orc")

     Find full example code at "examples/src/main/scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala" in the Spark repo.

    2021-12-01 by klotz
  5. 2021-01-28 by klotz
  6. sparkSession.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

  7. QUOTA REMAINING_QUOTA SPACE_QUOTA REMAINING_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE FILE_NAME

    2019-05-31 by klotz
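
The column header in the last bookmark (QUOTA through FILE_NAME) matches the eight-column layout of a quota listing such as the one `hdfs dfs -count -q` prints. The following is a minimal sketch, in plain Scala, of turning one such row into a typed record. The names `CountQuotaSketch`, `QuotaRow`, and `parse` are hypothetical, the sample row in `main` is invented for illustration, and the sketch assumes that unset quotas appear as non-numeric tokens such as `none` or `inf`.

```scala
import scala.util.Try

object CountQuotaSketch {
  // One row of the eight-column listing; quota fields are None when unset.
  final case class QuotaRow(
    quota: Option[Long],
    remainingQuota: Option[Long],
    spaceQuota: Option[Long],
    remainingSpaceQuota: Option[Long],
    dirCount: Long,
    fileCount: Long,
    contentSize: Long,
    fileName: String
  )

  // Assumption: any non-numeric token (e.g. "none", "inf") means "no quota set".
  private def quotaField(s: String): Option[Long] = Try(s.toLong).toOption

  // Returns None for header lines, blank lines, or rows with too few columns.
  def parse(line: String): Option[QuotaRow] = {
    // Limit of 8 keeps any spaces inside the path in the last field.
    val f = line.trim.split("\\s+", 8)
    if (f.length != 8) None
    else
      for {
        dirs  <- Try(f(4).toLong).toOption
        files <- Try(f(5).toLong).toOption
        size  <- Try(f(6).toLong).toOption
      } yield QuotaRow(
        quotaField(f(0)), quotaField(f(1)), quotaField(f(2)), quotaField(f(3)),
        dirs, files, size, f(7)
      )
  }

  def main(args: Array[String]): Unit = {
    // Invented sample row, not real cluster output.
    println(parse("none inf none inf 3 12 4096 /user/klotz"))
  }
}
```

Because the header line has non-numeric text in the count columns, `parse` returns `None` for it, so the same function can be mapped over the full command output without special-casing the first line.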
