Pandas: Pivot a DataFrame, columns to rows

Question

I have a DataFrame defined like this: The DataFrame is now this: I want to pivot the DataFrame so that it then looks like this: I think I want to do this via pivoting, but I have not yet worked out how to do it with the pivot() or pivot_table() functions. How can I do this, with or without using a pivot?

Accepted Answer

You can use melt, but first rename the day columns with a dict. Each solution below starts from the original DataFrame:

    d = {'first_day': 1, 'second_day': 2, 'third_day': 3}
    df = pd.melt(df.rename(columns=d), id_vars=['variable', 'year', 'month'], var_name='day')
    df = df.sort_values(['variable', 'year', 'month', 'day']).reset_index(drop=True)
    print(df)

       variable  year  month day  value
    0      PRCP  1900      1   1      5
    1      PRCP  1900      1   2      5
    2      PRCP  1900      1   3      1
    3      PRCP  1900      2   1      8
    4      PRCP  1900      2   2      8
    5      PRCP  1900      2   3      7
    6      PRCP  1901      1   1      9
    7      PRCP  1901      1   2      9
    8      PRCP  1901      1   3      3
    9      PRCP  1901      2   1      2
    10     PRCP  1901      2   2      2
    11     PRCP  1901      2   3      5
    12     TAVG  1900      1   1      7
    13     TAVG  1900      1   2      7
    14     TAVG  1900      1   3      5
    15     TAVG  1900      2   1      3
    16     TAVG  1900      2   2      3
    17     TAVG  1900      2   3      7
    18     TAVG  1901      1   1      4
    19     TAVG  1901      1   2      5
    20     TAVG  1901      1   3      8
    21     TAVG  1901      2   1      1
    22     TAVG  1901      2   2      8
    23     TAVG  1901      2   3      9

Or melt first and then map the day column with the dict:

    d = {'first_day': 1, 'second_day': 2, 'third_day': 3}
    df = pd.melt(df, id_vars=['variable', 'year', 'month'], var_name='day')
    df.day = df.day.map(d)
    df = df.sort_values(['variable', 'year', 'month', 'day']).reset_index(drop=True)
    print(df)

       variable  year  month  day  value
    0      PRCP  1900      1    1      5
    1      PRCP  1900      1    2      5
    2      PRCP  1900      1    3      1
    3      PRCP  1900      2    1      8
    4      PRCP  1900      2    2      8
    5      PRCP  1900      2    3      7
    6      PRCP  1901      1    1      9
    7      PRCP  1901      1    2      9
    8      PRCP  1901      1    3      3
    9      PRCP  1901      2    1      2
    10     PRCP  1901      2    2      2
    11     PRCP  1901      2    3      5
    12     TAVG  1900      1    1      7
    13     TAVG  1900      1    2      7
    14     TAVG  1900      1    3      5
    15     TAVG  1900      2    1      3
    16     TAVG  1900      2    2      3
    17     TAVG  1900      2    3      7
    18     TAVG  1901      1    1      4
    19     TAVG  1901      1    2      5
    20     TAVG  1901      1    3      8
    21     TAVG  1901      2    1      1
    22     TAVG  1901      2    2      8
    23     TAVG  1901      2    3      9

Another solution uses stack. The chained calls are wrapped in parentheses so that the expression can span several lines:

    d = {'first_day': 1, 'second_day': 2, 'third_day': 3}
    df = (df.rename(columns=d)
            .set_index(['variable', 'year', 'month'])
            .stack()
            .reset_index(name='value')
            .rename(columns={'level_3': 'day'}))
    print(df)

       variable  year  month  day  value
    0      PRCP  1900      1    1      5
    1      PRCP  1900      1    2      5
    2      PRCP  1900      1    3      1
    3      PRCP  1900      2    1      8
    4      PRCP  1900      2    2      8
    5      PRCP  1900      2    3      7
    6      TAVG  1900      1    1      7
    7      TAVG  1900      1    2      7
    8      TAVG  1900      1    3      5
    9      TAVG  1900      2    1      3
    10     TAVG  1900      2    2      3
    11     TAVG  1900      2    3      7
    12     PRCP  1901      1    1      9
    13     PRCP  1901      1    2      9
    14     PRCP  1901      1    3      3
    15     PRCP  1901      2    1      2
    16     PRCP  1901      2    2      2
    17     PRCP  1901      2    3      5
    18     TAVG  1901      1    1      4
    19     TAVG  1901      1    2      5
    20     TAVG  1901      1    3      8
    21     TAVG  1901      2    1      1
    22     TAVG  1901      2    2      8
    23     TAVG  1901      2    3      9

Unlike the melt solutions, the stack result is not sorted afterwards, so the rows keep the order of the original DataFrame.
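For anyone who wants to try the melt and stack approaches side by side, here is a minimal self-contained sketch. The input frame `wide` is an assumption: the question's data is not shown here, so the rows below are reconstructed from the 1900 rows of the printed results. The names `wide`, `day_numbers`, `keys`, `melted` and `stacked` are illustrative only, and `rename_axis(columns="day")` is a substitute for the answer's later rename of `level_3`.

```python
import pandas as pd

# Hypothetical input: reconstructed from the 1900 rows of the printed
# results, since the question's original data is not available.
wide = pd.DataFrame({
    "variable": ["PRCP", "PRCP", "TAVG", "TAVG"],
    "year": [1900, 1900, 1900, 1900],
    "month": [1, 2, 1, 2],
    "first_day": [5, 8, 7, 3],
    "second_day": [5, 8, 7, 3],
    "third_day": [1, 7, 5, 7],
})

day_numbers = {"first_day": 1, "second_day": 2, "third_day": 3}
keys = ["variable", "year", "month"]

# melt: rename the day columns to integers, then unpivot them into rows.
melted = (
    wide.rename(columns=day_numbers)
        .melt(id_vars=keys, var_name="day", value_name="value")
        .sort_values(keys + ["day"])
        .reset_index(drop=True)
)

# stack: move the key columns into the index and stack the day columns.
# Naming the column axis "day" up front means no 'level_3' rename is needed.
stacked = (
    wide.rename(columns=day_numbers)
        .set_index(keys)
        .rename_axis(columns="day")
        .stack()
        .reset_index(name="value")
        .sort_values(keys + ["day"])
        .reset_index(drop=True)
)

print(melted)
print(stacked)
```

After sorting on the same keys, both frames should list the same rows in the same order. The dtype of the day column may differ between the two, so compare values rather than relying on `DataFrame.equals`.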

Answer