I have a dataframe which has duplicate column names:
Accepted Accepted Accepted Reject Accepted Reject ABC IJK JKL XYJ LMN UIO BCD PQR EFG YVG GHIJ PLK
…and want to convert it into two dataframes; one only of “Accepted” columns and other for “Reject” Columns:
df1:
Accepted Accepted Accepted Accepted ABC IJK JKL LMN BCD PQR EFG GHIJ
df2:
Reject Reject XYJ UIO YVG PLK
Tried:
df1=df["Accepted"] df2=df["Reject"]
… but this only gives the first column matching this name.
Advertisement
Answer
If select one column with same name are selected all columns with same name in DataFrame:
df1 = df['Accepted'] df2 = df['Reject']
Then is possible deduplicate columns:
df1.columns = [f'{x}_{i}' for i, x in enumerate(df1.columns, 1)] df2.columns = [f'{x}_{i}' for i, x in enumerate(df2.columns, 1)]
EDIT: If get only first column name it means there are not duplicated columns names, so is possible use DataFrame.filter
:
df1 = df.filter(like='Accepted') df2 = df.filter(like='Reject')