from a database I get the following table into a python dataframe df
:
FunctionID | FunctionText | FunctionModule | UserGroup |
---|---|---|---|
1 | Fct1 | ModX | GroupA |
2 | Fct2 | ModX | GroupA |
2 | Fct2 | ModX | GroupB |
3 | Fct3 | ModY | GroupB |
3 | Fct3 | ModY | GroupC |
. | … | … | … |
3000 | Fct3000 | ModZ | GroupF |
My goal is to get a pivot-like table that looks like this one:
FunctionID | FunctionText | FunctionModule | GroupA | GroupB | GroupC | … | GroupF |
---|---|---|---|---|---|---|---|
1 | Fct1 | ModX | X | … | |||
2 | Fct2 | ModX | X | X | … | ||
3 | Fct3 | ModY | X | X | … | ||
. | … | … | … | … | … | … | … |
3000 | Fct3000 | ModZ | … | X |
So, the first 3 columns shall stay as is whereas the entries of UserGroup column shall be the added columns of the pivot table.
I tried some approaches like
JavaScript
x
2
1
df2 = df.pivot_table(index=df.iloc[:3],columns='UserGroup',values='UserGroup')
2
but without success. Maybe I have to use pivot()
or stack()
or unstack()
? But none of those led me to the desired target table.
Advertisement
Answer
IIUC, you could add an extra column and pivot_table
with aggfunc='first'
:
JavaScript
1
10
10
1
(df
2
.assign(value='x') # or value=df['UserGroup'] if you want the names as value
3
.pivot_table(index=list(df.columns[:3]),
4
columns='UserGroup',
5
values='value',
6
aggfunc='first',
7
fill_value='',
8
)
9
)
10
output:
JavaScript
1
6
1
UserGroup GroupA GroupB GroupC
2
FunctionID FunctionText FunctionModule
3
1 Fct1 ModX x
4
2 Fct2 ModX x x
5
3 Fct3 ModY x x
6
NB. note that index=df.iloc[:3]
will also select ‘UserGroup’, I used index=list(df.columns[:3])
here