Skip to content
Advertisement

Python: Rows to Column in Pandas Dataframe

I have following dataframe:

data.head()

Out:

     metric_name            metric_date     warehouse   value          week  year   day
0   Crossdock Transfer Out  2022-05-10      WR1        1.370313e+06     19  2022    2
21  New Vendor Freight      2022-04-19      WR1        1.583337e+06     16  2022    2
59  Crossdock Transfer Out  2021-12-26      WR1        3.805000e+03     51  2021    0
80  New Vendor Freight      2021-12-30      WR1        2.832327e+06     52  2021    4
90  Crossdock Transfer In   2022-05-22      WR1        0.000000e+00     20  2022    0
...     ...     ...     ...     ...     ...     ...     ...
127699  LF Forecasted New Crossdock Transfer Out    2021-11-01  WR1     2.595843e+06    44  2021    1

There are different metrics at column “metric_name” with its own value. To have a better dataset for time series forecasting I want to convert my Dataframe. I want every single metric in “metric_name” as new column in the dataset.

See here all different metrics in “metric_name”:

data.metric_name.value_counts()

Out:

LF Forecasted End Vendor Freight + End Transfer In Backlog    364
LF Forecasted New Crossdock Transfer Out                      364
LF Forecasted New Vendor Freight                              364
LF Forecasted New Crossdock Transfer In                       364
Forecasted New Crossdock Transfer Out                         359
Forecasted End Vendor Freight + End Transfer In Backlog       359
Forecasted New Crossdock Transfer In                          359
Forecasted New Vendor Freight                                 359
Crossdock Transfer Out                                        345
New Vendor Freight                                            345
Crossdock Transfer In                                         345
End Vendor Freight + End Transfer In Backlog                  345
Name: metric_name, dtype: int64

What can I do to solve this?

I tried this, but it is not giving me the values, just the counts:

bhx4.groupby('metric_date')['metric_name'].value_counts().unstack()

metric_name     Crossdock Transfer In   Crossdock Transfer Out  End Vendor Freight + End Transfer In Backlog    Forecasted End Vendor Freight + End Transfer In Backlog     Forecasted New Crossdock Transfer In    Forecasted New Crossdock Transfer Out   Forecasted New Vendor Freight   LF Forecasted End Vendor Freight + End Transfer In Backlog  LF Forecasted New Crossdock Transfer In     LF Forecasted New Crossdock Transfer Out    LF Forecasted New Vendor Freight    New Vendor Freight
 metric_date                                                
 2021-07-25     1.0     1.0     1.0     1.0     1.0     1.0     1.0         1.0     1.0     1.0     1.0     1.0
 2021-07-26     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0
 2021-07-27     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0
 2021-07-28     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0
 2021-07-29     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0

I want the DF to look something like this: enter image description here Thanks!

Advertisement

Answer

If I got you right:

data.groupby(['metric_date', 'metric_name'])['value'].sum().unstack().fillna(0)
User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement