I have 3 pandas dataframes. I would like to append one row from each in each iteration to an existing dataframe.
Example shown below:
DF1 = col1 col2 col3 a a a d d d g g g
DF2= col1 col2 col3 b b b e e e h h h
DF3= col1 col2 col3 c b b f f f i i i
clean_DF = col1 col2 col3 a a a b b b c c c d d d e e e f f f g g g h h h i i i
Dummy code:
for i,j in df1.itterows(): for a,b in df2.itterows(): for c,d in df2.itterrows(): clean_df.append(i,j,a,b,c,d)
Please could someone point me in the right direction?
Advertisement
Answer
Concatenate them, using the keys
argument to associate an index with rows from each original dataframe, then swap the index levels and sort the dataframe by this index.
df1 = pd.DataFrame([["a", "a", "a"], ["d", "d", "d"], ["g", "g", "g"]], columns=["col1", "col2", "col3"]) df2 = pd.DataFrame([["b", "b", "b"], ["e", "e", "e"], ["h", "h", "h"]], columns=["col1", "col2", "col3"]) df3 = pd.DataFrame([["c", "c", "c"], ["f", "f", "f"], ["i", "i", "i"]], columns=["col1", "col2", "col3"]) clean_df = pd.concat([df1, df2, df3], keys=range(3)).swaplevel().sort_index()
This assumes that each dataframe currently has a single index and is sorted by that index. If you have dataframes that may not be sorted by index, and you want to preserve their current sort orders, then you could reset their indices before concatenating them.
dfs = [df.reset_index() for df in [df1, df2, df3]] clean_df = pd.concat(dfs, keys=range(len(dfs))).swaplevel().sort_index()