klotz: scala* + spark*


  1. val newDf = df.withColumn("D", when($"B".isNull or $"B" === "", 0).otherwise(1))
    2018-04-16 by klotz
  2. Unfortunately, text8 has had its periods stripped out, so you can't just split on them. You can find the raw version here, along with the Perl script used to process it, and it isn't hard to edit the script so that it keeps the periods.
    2017-04-21 by klotz
  3. Example Bloom Filter use in Spark 2.0
    2017-04-04 by klotz
  4. import org.apache.spark.sql.functions.{udf, explode}

    val zip_udf = udf((xs: Seq[Long], ys: Seq[Long]) => xs.zip(ys))

    df.withColumn("vars", explode(zip_udf($"varA", $"varB"))).select(
    $"userId", $"someString",
    $"vars._1".alias("varA"), $"vars._2".alias("varB")).show
    2017-03-09 by klotz
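
To make the per-row effect of item 4 easier to see, here is a minimal sketch in plain Scala collections, with no Spark involved. The names Record, Exploded, and zipExplode are hypothetical and exist only for this illustration; the fields mirror the column names used in the snippet. It assumes zip pairs elements by position and that flatMap stands in for explode.

```scala
// Hypothetical stand-in for one input row; fields mirror the columns in item 4.
final case class Record(userId: Long, someString: String, varA: Seq[Long], varB: Seq[Long])

// Hypothetical output shape: one row per zipped (varA, varB) pair.
final case class Exploded(userId: Long, someString: String, varA: Long, varB: Long)

object ZipExplodeSketch {
  // zip pairs elements by position and truncates to the shorter sequence;
  // flatMap then plays the role of explode, emitting one record per pair.
  def zipExplode(rows: Seq[Record]): Seq[Exploded] =
    rows.flatMap { r =>
      r.varA.zip(r.varB).map { case (a, b) =>
        Exploded(r.userId, r.someString, a, b)
      }
    }

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      Record(1L, "x", Seq(1L, 2L), Seq(10L, 20L)),
      Record(2L, "y", Seq(4L), Seq(40L, 50L)) // unequal lengths: the extra 50 is dropped
    )
    zipExplode(rows).foreach(println)
  }
}
```

One thing the sketch makes visible: because zip truncates to the shorter sequence, unpaired trailing elements disappear silently, so it may be worth checking that both array columns have equal lengths before relying on this pattern.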
