Apache Iceberg is emerging as a cornerstone for data lakes and lakehouses in the modern data stack, drawing parallels to the rise of Hadoop a decade ago. This article explores these similarities, highlighting both the opportunities and challenges that Iceberg presents for data engineering.
   
    
 
 
  
   
   The reload4j project offers a clear and easy migration path for the thousands of users who have an urgent need to fix vulnerabilities in log4j 1.2.17.
   
    
 
 
  
   
   usersDF.write.format("orc")
  .option("orc.bloom.filter.columns", "favorite_color")
  .option("orc.dictionary.key.threshold", "1.0")
  .option("orc.column.encoding.direct", "name")
  .save("users_with_options.orc")
Find full example code at "examples/src/main/scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala" in the Spark repo
   
    
 
 
  
   
   sparkSession.conf
 .set("spark.sql.sources.partitionOverwriteMode", "dynamic")
   
    
 
 
  
   
   QUOTA  REMAINING_QUOTA     SPACE_QUOTA  REMAINING_SPACE_QUOTA    DIR_COUNT  FILE_COUNT      CONTENT_SIZE FILE_NAME