
Tag: apache-spark

How to sort by value efficiently in PySpark?

I want to sort my (K, V) tuples by V, i.e. by the value. I know that takeOrdered is good for this if you know how many elements you need. The two options I see are using takeOrdered and sorting with a lambda as the key. I've checked out the question here, which suggests the latter. I find it hard to believe that takeOrdered is so succinct and yet it requires the s…
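To make the two options concrete, here is a rough plain-Python sketch of the difference between a top-n selection (the idea behind takeOrdered) and a full sort by value. The sample data, the two-way partition split, and the helper names by_value and take_ordered are all made up for illustration; it runs without Spark and is not the question's original code. The PySpark calls it is meant to mirror are noted in the comments.

```python
import heapq

# Hypothetical sample data standing in for an RDD of (key, value) pairs.
pairs = [("a", 3), ("b", 1), ("c", 7), ("d", 2), ("e", 5)]


def by_value(kv):
    """Key function: order a (key, value) tuple by its value."""
    return kv[1]


# Assumption for illustration: the data lives in two partitions.
partitions = [pairs[:3], pairs[3:]]


def take_ordered(parts, n, key):
    """Top-n sketch: keep the n smallest per partition, then merge.

    Only n elements per partition are ever carried forward, so no
    full sort of the whole dataset is needed.
    """
    per_partition = [heapq.nsmallest(n, part, key=key) for part in parts]
    merged = [kv for part in per_partition for kv in part]
    return heapq.nsmallest(n, merged, key=key)


# Roughly what rdd.takeOrdered(2, key=lambda kv: kv[1]) would return.
top2 = take_ordered(partitions, 2, by_value)

# Roughly what rdd.sortBy(lambda kv: kv[1]).collect() would return.
full = sorted(pairs, key=by_value)

print(top2)
print(full)
```

If only the first few elements are needed, the top-n route avoids ordering everything; if the whole dataset must come out ordered by value, a full sort with the value as the key is the natural fit.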