I have following dataframe:
JavaScript
x
2
1
data.head()
2
Out:
JavaScript
1
9
1
metric_name metric_date warehouse value week year day
2
0 Crossdock Transfer Out 2022-05-10 WR1 1.370313e+06 19 2022 2
3
21 New Vendor Freight 2022-04-19 WR1 1.583337e+06 16 2022 2
4
59 Crossdock Transfer Out 2021-12-26 WR1 3.805000e+03 51 2021 0
5
80 New Vendor Freight 2021-12-30 WR1 2.832327e+06 52 2021 4
6
90 Crossdock Transfer In 2022-05-22 WR1 0.000000e+00 20 2022 0
7
8
127699 LF Forecasted New Crossdock Transfer Out 2021-11-01 WR1 2.595843e+06 44 2021 1
9
There are different metrics at column “metric_name” with its own value. To have a better dataset for time series forecasting I want to convert my Dataframe. I want every single metric in “metric_name” as new column in the dataset.
See here all different metrics in “metric_name”:
JavaScript
1
2
1
data.metric_name.value_counts()
2
Out:
JavaScript
1
14
14
1
LF Forecasted End Vendor Freight + End Transfer In Backlog 364
2
LF Forecasted New Crossdock Transfer Out 364
3
LF Forecasted New Vendor Freight 364
4
LF Forecasted New Crossdock Transfer In 364
5
Forecasted New Crossdock Transfer Out 359
6
Forecasted End Vendor Freight + End Transfer In Backlog 359
7
Forecasted New Crossdock Transfer In 359
8
Forecasted New Vendor Freight 359
9
Crossdock Transfer Out 345
10
New Vendor Freight 345
11
Crossdock Transfer In 345
12
End Vendor Freight + End Transfer In Backlog 345
13
Name: metric_name, dtype: int64
14
What can I do to solve this?
I tried this, but it is not giving me the values, just the counts:
JavaScript
1
10
10
1
bhx4.groupby('metric_date')['metric_name'].value_counts().unstack()
2
3
metric_name Crossdock Transfer In Crossdock Transfer Out End Vendor Freight + End Transfer In Backlog Forecasted End Vendor Freight + End Transfer In Backlog Forecasted New Crossdock Transfer In Forecasted New Crossdock Transfer Out Forecasted New Vendor Freight LF Forecasted End Vendor Freight + End Transfer In Backlog LF Forecasted New Crossdock Transfer In LF Forecasted New Crossdock Transfer Out LF Forecasted New Vendor Freight New Vendor Freight
4
metric_date
5
2021-07-25 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
6
2021-07-26 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
7
2021-07-27 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
8
2021-07-28 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
9
2021-07-29 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
10
I want the DF to look something like this:
Thanks!
Advertisement
Answer
If I got you right:
JavaScript
1
2
1
data.groupby(['metric_date', 'metric_name'])['value'].sum().unstack().fillna(0)
2