what would be the most efficient way to do this in pandas

Question

I'm trying to figure out the most efficient way to join two dataframes such as below. I've tried pd.merge and maybe using the rank function but cannot seem to figure a way. Thanks in advance df1 What I'm trying to achieve is this df2 Answer You might want to use groupby with unstack as advised in this answer:

Accepted Answer

You might want to use groupby with unstack as advised in this answer:import pandas as pdfrom string import ascii_uppercase# Reproduce the datadf = pd.DataFrame()df['A'] = ['TBK1','TBK1', 'TBK3']df['B'] = ['2022-01-01' , '2022-02-02', '2022-05-07']df['C'] = ['2022-04-04', '2021-01-09', '2023-02-04']# Count how many rows exists per unique entrys = df.groupby(['A']).cumcount() # Unstackdf1 = df.set_index(['A', s]).unstack().sort_index(level=1, axis=1)# Rename columnsdf1.columns = [l for l in ascii_uppercase[1:len(df1.columns)+1]]# Flatten columns names (aesthetics)df1 = df1.reset_index()print(df1)      A           B           C           D           E0  TBK1  2022-01-01  2022-04-04  2022-02-02  2021-01-091  TBK3  2022-05-07  2023-02-04         NaN         NaN

Advertisement

Answer