Skip to content
Advertisement

Tag: dataframe

Optimal way to acquire percentiles of DataFrame rows

Problem I have a pandas DataFrame df: My desired output, i.e. new_df, contains the 9 different percentiles including the median, and should have the following format: Attempt The following was my initial attempt: However, instead of returning the percentiles of all columns, it calculated these percentiles for each val column and therefore returned 1000 columns. As it calculated the percentiles

Pivotting DataFrame with fixed column names

Let’s say I have below dataframe: and by design each user has 3 rows. I want to turn my DataFrame into: I was trying to groupBy(col(‘user’)) and then pivot by ticker but it returns as many columns as different tickers there are so instead I wish I could have fixed number of columns. Is there any other Spark operator I

Advertisement