from a database I get the following table into a python dataframe df
:
FunctionID | FunctionText | FunctionModule | UserGroup |
---|---|---|---|
1 | Fct1 | ModX | GroupA |
2 | Fct2 | ModX | GroupA |
2 | Fct2 | ModX | GroupB |
3 | Fct3 | ModY | GroupB |
3 | Fct3 | ModY | GroupC |
. | … | … | … |
3000 | Fct3000 | ModZ | GroupF |
My goal is to get a pivot-like table that looks like this one:
FunctionID | FunctionText | FunctionModule | GroupA | GroupB | GroupC | … | GroupF |
---|---|---|---|---|---|---|---|
1 | Fct1 | ModX | X | … | |||
2 | Fct2 | ModX | X | X | … | ||
3 | Fct3 | ModY | X | X | … | ||
. | … | … | … | … | … | … | … |
3000 | Fct3000 | ModZ | … | X |
So, the first 3 columns shall stay as is whereas the entries of UserGroup column shall be the added columns of the pivot table.
I tried some approaches like
df2 = df.pivot_table(index=df.iloc[:3],columns='UserGroup',values='UserGroup')
but without success. Maybe I have to use pivot()
or stack()
or unstack()
? But none of those led me to the desired target table.
Advertisement
Answer
IIUC, you could add an extra column and pivot_table
with aggfunc='first'
:
(df .assign(value='x') # or value=df['UserGroup'] if you want the names as value .pivot_table(index=list(df.columns[:3]), columns='UserGroup', values='value', aggfunc='first', fill_value='', ) )
output:
UserGroup GroupA GroupB GroupC FunctionID FunctionText FunctionModule 1 Fct1 ModX x 2 Fct2 ModX x x 3 Fct3 ModY x x
NB. note that index=df.iloc[:3]
will also select ‘UserGroup’, I used index=list(df.columns[:3])
here