I have the following pandas dataframe X
in long format:
JavaScript
x
15
15
1
alt.var group
2
1 1
3
2 1
4
3 1
5
4 1
6
1 2
7
2 2
8
1 3
9
2 3
10
1 4
11
1 5
12
2 5
13
3 5
14
1 6
15
And I would like to change to the following wide format according the number of alternatives in the group:
JavaScript
1
8
1
group 1 2 3 4
2
1 4 3 2 1
3
2 4 3 NA NA
4
3 4 3 NA NA
5
4 4 NA NA NA
6
5 4 3 2 NA
7
6 4 NA NA NA
8
i.e. I want to create 4 columns (indexed by 1,2,3,4, the names as in the alt.var
column) (the maximum number of alternatives in a group) and column i
get assigned 5-i
if element i
exists in group i
and NA
or null value if element i
does not exists in group i
.
I have tried to do a little digging in stackoverflow but it doesn’t seem to match any result.
Advertisement
Answer
You can create a val
column with 5
– value in alt.var
then pivot
JavaScript
1
3
1
out = (df.assign(val=5-df['alt.var'])
2
.pivot(index='group', columns='alt.var', values='val'))
3
JavaScript
1
11
11
1
print(out)
2
3
alt.var 1 2 3 4
4
group
5
1 4.0 3.0 2.0 1.0
6
2 4.0 3.0 NaN NaN
7
3 4.0 3.0 NaN NaN
8
4 4.0 NaN NaN NaN
9
5 4.0 3.0 2.0 NaN
10
6 4.0 NaN NaN NaN
11