SAS PROC TRANSPOSE to PySpark

I am trying to convert a SAS PROC TRANSPOSE statement to PySpark in Databricks, with the following data as a sample:

JavaScript

I would expect the result to look like this:

I tried the pandas pivot_table() function with the following code, but I ran into performance issues because of the size of the data:

JavaScript

Is there a way to translate the PROC TRANSPOSE logic to PySpark instead of using pandas?

I am trying something like this, but I am getting an error:

JavaScript

If you could help me out, I would really appreciate it! Thank you so much.

Answer

I don’t know how you create df from data, but here is what I did:

JavaScript

Then your pandas code worked.

To use the PySpark method instead, here is what I did:

JavaScript
User contributions licensed under: CC BY-SA