df.withColumn("rank", rank().over(Window.partitionBy("Dept_id").orderBy($"salary".desc)))
.filter($"rank" <= 3)
.drop("rank")
I confirm that when I do window functions on df2 partitioned by userid there is no shuffle! Thanks @user8371915!
val w = Window.partitionBy($"id")
val df2 = df.withColumn("maxCharge", max("charge").over(w))
.filter($"maxCharge" === $"charge")
.drop("charge")
.withColumnRenamed("maxCharge", "charge")