I’m trying to get from a to b. I have a pandas DataFrame similar to a below.
import pandas as pd

data = {'col1': ['N1', 'N1', 'N2', 'N2', 'N2', 'N3'],
        'col2': ['DE', 'NO', 'DE', 'NO', 'IT', 'DE'],
        'col3': [7, 5, 4, 1, 2, 8],
        'col3_sum': [12, 12, 7, 7, 7, 8],
        'col4': [0.6, 0.2, 0.7, 0.1, 0.2, 0.6],
        'col4_sum': [0.8, 0.8, 1.0, 1.0, 1.0, 0.6],
        'col5': [1, 2, 3, 4, 5, 6]}
a = pd.DataFrame(data)
print(a)
  col1 col2  col3  col3_sum  col4  col4_sum  col5
0   N1   DE     7        12   0.6       0.8     1
1   N1   NO     5        12   0.2       0.8     2
2   N2   DE     4         7   0.7       1.0     3
3   N2   NO     1         7   0.1       1.0     4
4   N2   IT     2         7   0.2       1.0     5
5   N3   DE     8         8   0.6       0.6     6
I realize I’ve backed myself into a corner by computing sums in a flat file. I’m new to Python. I guess I should create the sums when I’m done pivoting?
Where I’m stuck is these failed attempts at b:
b = a.pivot_table(index=['col1'],
                  values=['col3', 'col3_sum', 'col4', 'col4_sum'],
                  columns='col2')
# or
b = pd.pivot_table(a, index=['col1', 'col2'], columns=['col3', 'col4'],
                   aggfunc=len).reset_index()
# this makes sense to me, but not to Python
a.pivot(index=['col1', 'col2'], columns='col2', values=['col3', 'col4'])
# print(b) # where I'm stuck at ...
I would like to get to something like this b:
print(b) # my goal
col1  var   var_sum  DE   NO   IT
N1    col3  12       7    5
N1    col4  0.8      0.6  0.2
N2    col3  7        4    1    2
N2    col4  1.0      0.7  0.1  0.2
N3    col3  8        8
N3    col4  0.6      0.6
I’m not sure what to search for (some of the possibly relevant questions are far too complex for me to extract what I need, at least at the moment). I’ve looked a lot at this answer; maybe I should find a way using .groupby().
Answer
Maybe you can compute the sum afterwards:
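# Reshape to long form, then pivot so each value in col2 becomes a column.
# index/columns are passed by keyword, which newer pandas versions require.
out = pd.melt(a, ["col1", "col2"], ["col3", "col4"]).pivot(
    index=["col1", "variable"], columns="col2"
)
# Row-wise sum over the pivoted columns (NaN is skipped by default)
out["var_sum"] = out.sum(axis=1)

out = out.reset_index()
out.index.name, out.columns.name = None, None
# Flatten the two-level column labels, e.g. ("value", "DE") -> "DE"
out.columns = [
    f"{top}_{sub}".replace("value", "").strip("_") for top, sub in out.columns
]

print(out)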
Prints:
  col1 variable   DE   IT   NO  var_sum
0   N1     col3  7.0  NaN  5.0     12.0
1   N1     col4  0.6  NaN  0.2      0.8
2   N2     col3  4.0  2.0  1.0      7.0
3   N2     col4  0.7  0.2  0.1      1.0
4   N3     col3  8.0  NaN  NaN      8.0
5   N3     col4  0.6  NaN  NaN      0.6
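If you also want the column order from the question (var_sum first, then DE, NO, IT), something like the following might work. It is only a sketch, assuming pandas 1.1 or later (needed for a list-valued index in pivot); the names long_df and b and the label var are my own choices, not anything required by pandas.

```python
import pandas as pd

a = pd.DataFrame({
    "col1": ["N1", "N1", "N2", "N2", "N2", "N3"],
    "col2": ["DE", "NO", "DE", "NO", "IT", "DE"],
    "col3": [7, 5, 4, 1, 2, 8],
    "col4": [0.6, 0.2, 0.7, 0.1, 0.2, 0.6],
})

# Long form: one row per (col1, col2, var) with the number in "value".
# "var" is an assumed label chosen to match the goal table.
long_df = a.melt(id_vars=["col1", "col2"], value_vars=["col3", "col4"],
                 var_name="var")

# Wide form: one column per value of col2; missing combinations become NaN.
b = long_df.pivot(index=["col1", "var"], columns="col2", values="value")

# Put the col2 columns in the order used in the question.
b = b[["DE", "NO", "IT"]]

# Compute the sum after pivoting and place it first (NaN is skipped).
b.insert(0, "var_sum", b.sum(axis=1))

b = b.reset_index()
b.columns.name = None  # drop the leftover "col2" label on the columns
print(b)
```

Passing values="value" keeps the columns single-level, so no flattening step is needed. The empty cells stay as NaN; if blanks are preferred for display, b.fillna("") is one option, at the cost of turning those columns into object dtype.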