I have a dataframe which has duplicate column names:
    Accepted  Accepted  Accepted  Reject  Accepted  Reject
    ABC       IJK       JKL       XYJ     LMN       UIO
    BCD       PQR       EFG       YVG     GHIJ      PLK
…and I want to split it into two dataframes: one with only the "Accepted" columns and the other with only the "Reject" columns:
df1:
    Accepted  Accepted  Accepted  Accepted
    ABC       IJK       JKL       LMN
    BCD       PQR       EFG       GHIJ
df2:
    Reject  Reject
    XYJ     UIO
    YVG     PLK
I tried:
    df1 = df["Accepted"]
    df2 = df["Reject"]
… but this only gives the first column matching this name.
Answer
If the DataFrame really has duplicated column names, selecting one name returns all columns with that name as a DataFrame:
    df1 = df['Accepted']
    df2 = df['Reject']
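To check this behavior on the question's data, here is a minimal sketch. It assumes the frame is built directly with `pd.DataFrame` (the question does not show how `df` was created), so the column labels really are duplicated.

```python
import pandas as pd

# Sample data from the question. Building the frame by hand is an assumption;
# the original post does not show how df was created.
df = pd.DataFrame(
    [["ABC", "IJK", "JKL", "XYJ", "LMN", "UIO"],
     ["BCD", "PQR", "EFG", "YVG", "GHIJ", "PLK"]],
    columns=["Accepted", "Accepted", "Accepted", "Reject", "Accepted", "Reject"],
)

# With truly duplicated labels, label selection returns every matching column.
df1 = df["Accepted"]
df2 = df["Reject"]

print(type(df1).__name__)  # DataFrame
print(df1.shape)           # (2, 4)
print(df2.shape)           # (2, 2)
```

If `df["Accepted"]` returns a Series instead, the labels are probably not exact duplicates.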
The column names can then be deduplicated:
    df1.columns = [f'{x}_{i}' for i, x in enumerate(df1.columns, 1)]
    df2.columns = [f'{x}_{i}' for i, x in enumerate(df2.columns, 1)]
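As a small illustration of what the renaming produces, the sketch below applies the same list comprehension to a hand-built frame holding the four "Accepted" columns from the question (the construction is an assumption made only for the example).

```python
import pandas as pd

# Hand-built stand-in for df1 with four identically named columns.
df1 = pd.DataFrame(
    [["ABC", "IJK", "JKL", "LMN"],
     ["BCD", "PQR", "EFG", "GHIJ"]],
    columns=["Accepted"] * 4,
)

# Append a 1-based position to each label so every name becomes unique.
df1.columns = [f"{x}_{i}" for i, x in enumerate(df1.columns, 1)]

print(df1.columns.tolist())
# ['Accepted_1', 'Accepted_2', 'Accepted_3', 'Accepted_4']
```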
EDIT: If you get only the first column, the column names are not actually duplicated (they may only look alike, e.g. Accepted and Accepted.1), so you can use DataFrame.filter:
    df1 = df.filter(like='Accepted')
    df2 = df.filter(like='Reject')
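One common way to end up in this situation, though the question does not confirm it, is loading the data with `pandas.read_csv`, which renames repeated headers by appending `.1`, `.2`, and so on. The sketch below assumes that scenario; `csv_text` is a hypothetical stand-in for the real file.

```python
import io

import pandas as pd

# Hypothetical CSV content with repeated header names.
csv_text = (
    "Accepted,Accepted,Accepted,Reject,Accepted,Reject\n"
    "ABC,IJK,JKL,XYJ,LMN,UIO\n"
    "BCD,PQR,EFG,YVG,GHIJ,PLK\n"
)

df = pd.read_csv(io.StringIO(csv_text))
print(df.columns.tolist())
# ['Accepted', 'Accepted.1', 'Accepted.2', 'Reject', 'Accepted.3', 'Reject.1']

# filter(like=...) keeps every column whose label contains the substring.
df1 = df.filter(like="Accepted")
df2 = df.filter(like="Reject")

print(df1.shape)  # (2, 4)
print(df2.shape)  # (2, 2)
```

Note that `like` does substring matching, so it would also pick up unrelated columns whose names happen to contain "Accepted" or "Reject".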