How do I transpose columns in PySpark? I want columns to become rows, and rows to become columns.
Here is the input:
+-----+-----+-----+-----+
|idx  |vin  |cur  |mean |
+-----+-----+-----+-----+
|Type1|D    |5.0  |6.0  |
|Type2|C    |null |7.0  |
+-----+-----+-----+-----+
Expected Outcome:
+-----+------+-----+
|idx  |Type1 |Type2|
+-----+------+-----+
|vin  |D     |C    |
|cur  |5.0   |null |
|mean |6.0   |7.0  |
+-----+------+-----+
Answer
You can use the stack function to unpivot the vin, cur and mean columns, then pivot the idx column:
from pyspark.sql import functions as F
df1 = (df.selectExpr("idx", "stack(3, 'vin',vin, 'cur',cur, 'mean',mean)")
       .select("idx", "col0", "col1")
       .groupBy("col0")
       .pivot("idx").agg(F.first("col1"))
       .withColumnRenamed("col0", "idx"))
df1.show(truncate=False)
#+----+-----+-----+
#|idx |Type1|Type2|
#+----+-----+-----+
#|vin |D |C |
#|mean|6.0 |7.0 |
#|cur |5.0 |null |
#+----+-----+-----+
You can apply the transformations one by one to see how it works and what each part does.
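To see the underlying logic without a Spark session, here is a plain-Python sketch of the same two steps, using the rows from the question. The `stack` step turns each cell into an `(idx, column_name, value)` triple, and the pivot step regroups those triples by column name. The dictionary shapes here are just for illustration; they are not Spark API.

```python
# Input rows, mirroring the question's DataFrame.
rows = [
    {"idx": "Type1", "vin": "D", "cur": 5.0, "mean": 6.0},
    {"idx": "Type2", "vin": "C", "cur": None, "mean": 7.0},
]

# Step 1 (stack / unpivot): one (idx, column_name, value) triple per cell.
stacked = [
    (row["idx"], col, row[col])
    for row in rows
    for col in ("vin", "cur", "mean")
]

# Step 2 (pivot): group by former column name, spread idx values into columns.
transposed = {}
for idx, col, value in stacked:
    transposed.setdefault(col, {})[idx] = value

print(transposed)
# {'vin': {'Type1': 'D', 'Type2': 'C'},
#  'cur': {'Type1': 5.0, 'Type2': None},
#  'mean': {'Type1': 6.0, 'Type2': 7.0}}
```

Each key of `transposed` corresponds to one row of the expected output, with the old idx values (`Type1`, `Type2`) as the new columns, matching what `groupBy("col0").pivot("idx")` produces in Spark.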