Tags: scala* + spark*


  1. val newDf = df.withColumn("D", when($"B".isNull or $"B" === "", 0).otherwise(1))
    2018-04-16, by klotz
  2. Unfortunately text8 has had periods stripped out so you can't just split on them, but you can find the raw version here as well as the perl script used to process it, and it isn't hard to edit the script to not remove periods.
    2017-04-21, by klotz
  3. Example Bloom Filter use in Spark 2.0
    2017-04-04, by klotz
  4. import org.apache.spark.sql.functions.{udf, explode}

    val zip_udf = udf((xs: Seq[Long], ys: Seq[Long]) => xs.zip(ys))

    df.withColumn("vars", explode(zip_udf($"varA", $"varB"))).select(
    $"userId", $"someString",
    $"vars._1".alias("varA"), $"vars._2".alias("varB")).show
    2017-03-09, by klotz
