Skip to content
Advertisement

Python – Iterate through multiple dataframes and append data to a new dataframe

I have 3 pandas dataframes. I would like to append one row from each in each iteration to an existing dataframe.

Example shown below:

DF1 =

col1 col2 col3
a     a    a
d     d    d
g     g    g

DF2=

col1 col2 col3
b     b    b
e     e    e
h     h    h

DF3=

col1 col2 col3
c     b    b
f     f    f
i     i    i

clean_DF =

col1 col2 col3
a     a    a
b     b    b 
c     c    c
d     d    d
e     e    e
f     f    f
g     g    g
h     h    h 
i     i    i

Dummy code:

for i,j in df1.itterows():
   for a,b in df2.itterows():
       for c,d in df2.itterrows():
clean_df.append(i,j,a,b,c,d)

Please could someone point me in the right direction?

Advertisement

Answer

Concatenate them, using the keys argument to associate an index with rows from each original dataframe, then swap the index levels and sort the dataframe by this index.

df1 = pd.DataFrame([["a", "a", "a"], ["d", "d", "d"], ["g", "g", "g"]], columns=["col1", "col2", "col3"])
df2 = pd.DataFrame([["b", "b", "b"], ["e", "e", "e"], ["h", "h", "h"]], columns=["col1", "col2", "col3"])
df3 = pd.DataFrame([["c", "c", "c"], ["f", "f", "f"], ["i", "i", "i"]], columns=["col1", "col2", "col3"])
clean_df = pd.concat([df1, df2, df3], keys=range(3)).swaplevel().sort_index()

This assumes that each dataframe currently has a single index and is sorted by that index. If you have dataframes that may not be sorted by index, and you want to preserve their current sort orders, then you could reset their indices before concatenating them.

dfs = [df.reset_index() for df in [df1, df2, df3]]
clean_df = pd.concat(dfs, keys=range(len(dfs))).swaplevel().sort_index()
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement