Most efficient way to place a Pandas data frame into a list of dictionaries with a certain format

Question

I have a Pandas data frame that contains one column and an index of timestamps. The code for the data frame looks something like this: I want to create a list of dictionaries from the rows of df in a certain way. For each row of the data frame, I want to create a dictionary with the keys "Timestamp" and

Accepted Answer

You need to rename your column, give your index a name, and turn that index into a column. Then use DataFrame.to_dict with the 'records' orientation (older pandas versions also accepted the abbreviation 'r').

    df = df.rename(columns={'colname': 'Value'}).rename_axis(index='Timestamp').reset_index()
    dict_list = df.to_dict('records')
    #[{'Timestamp': Timestamp('2020-12-12 06:00:00'), 'Value': 0.2},
    # {'Timestamp': Timestamp('2020-12-12 06:01:00'), 'Value': 0.4},
    # {'Timestamp': Timestamp('2020-12-12 06:02:00'), 'Value': 0.6}]

For larger DataFrames this is somewhat faster than a simple loop, but it still slows down as the DataFrame grows.

    import perfplot
    import pandas as pd
    import numpy as np

    def loop(df):
        # Build one dictionary per row by positional lookup.
        dict_list = []
        for i in range(df.shape[0]):
            new_dict = {'Timestamp': df.index[i], 'Value': df.iloc[i, 0]}
            dict_list.append(new_dict)
        return dict_list

    def df_to_dict(df):
        # Rename the column, name the index, move it into a column, then export.
        df = df.rename(columns={'colname': 'Value'}).rename_axis(index='Timestamp').reset_index()
        return df.to_dict('records')

    perfplot.show(
        # 'min' is the one-minute frequency alias; newer pandas deprecates 'T'.
        setup=lambda n: pd.DataFrame({'colname': np.random.normal(0, 1, n)},
                                     index=pd.date_range('12-12-2020', freq='min', periods=n)),
        kernels=[
            lambda df: loop(df),
            lambda df: df_to_dict(df),
        ],
        labels=['Loop', 'df.to_dict'],
        n_range=[2 ** k for k in range(20)],
        equality_check=None,
        xlabel='len(df)')
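Since the question's own construction code is not shown here, the following is a minimal, self-contained sketch of the rename / rename_axis / reset_index / to_dict approach on a small stand-in frame. The sample values and the column name 'colname' are assumptions made for illustration, not the asker's actual data.

```python
import pandas as pd

# Hypothetical stand-in data: a single column (assumed to be named 'colname')
# indexed by three one-minute timestamps.
df = pd.DataFrame(
    {"colname": [0.2, 0.4, 0.6]},
    index=pd.date_range("2020-12-12 06:00:00", periods=3, freq="min"),
)

records = (
    df.rename(columns={"colname": "Value"})  # value column -> 'Value'
      .rename_axis(index="Timestamp")        # give the index a name
      .reset_index()                         # turn the index into a column
      .to_dict(orient="records")             # one dictionary per row
)

for row in records:
    print(row)
```

Each dictionary keeps the timestamp as a pandas Timestamp object rather than a string, and the key order follows the column order after reset_index, so 'Timestamp' comes first.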

Answer