df.withColumn("rank", rank().over(Window.partitionBy("Dept_id").orderBy($"salary".desc)))
  .filter($"rank" <= 3)
  .drop("rank")
I confirm that when I do window functions on df2 partitioned by userid there is no shuffle! Thanks @user8371915!
val w = Window.partitionBy($"id")
val df2 = df.withColumn("maxCharge", max("charge").over(w))
  .filter($"maxCharge" === $"charge")
  .drop("charge")
  .withColumnRenamed("maxCharge", "charge")