I have two DataFrames of 20 rows and 4 columns. The names and value types of the columns are the same.
One of the columns is the title; the other three are values.
df1

    title    col1  col2  col3
    apple    a     d     g
    pear     b     e     h
    grape    c     f     i

df2

    title    col1  col2  col3
    carrot   q     t     w
    pumpkin  r     u     x
    sprouts  s     v     y
Now I would like to create 3 separate tables/lists, each subtracting every value of a df2 column from every value of the matching df1 column: df1.col1 - df2.col1, df1.col2 - df2.col2, and df1.col3 - df2.col3. For df1.col1 - df2.col1, I expect an output that looks something along the lines of:

    df1.title  df2.title  score
    apple      carrot     (a - q)
    apple      pumpkin    (a - r)
    apple      sprouts    (a - s)
    pear       carrot     (b - q)
    pear       pumpkin    (b - r)
    pear       sprouts    (b - s)
    grape      carrot     (c - q)
    grape      pumpkin    (c - r)
    grape      sprouts    (c - s)

I tried to create a for loop using the following code:

    for i in df1.iterrows():
        score_col1 = df1.col1[[i]] - df2.col2[[j]]
        score_col2 = df1.col2[[i]] - df2.col2[[j]]
        score_col3 = df1.col3[[i]] - df2.col3[[j]]
        score_total = score_col1 + score_col2 + score_col3
        i = i + 1

In return, I received an output for score_col1 looking like this:

    df1.title  df2.title  score
    apple      carrot     (a - q)
    pear       carrot     (b - q)
    grape      carrot     (c - q)

Can someone help me to obtain the expected output?
Answer

    import pandas as pd

    a1 = ['apple', 'pear', 'banana']
    b1 = [56, 32, 23]
    c1 = [12, 34, 90]
    d1 = [87, 65, 23]
    a2 = ['carrot', 'pumpkin', 'sprouts']
    b2 = [16, 12, 93]
    c2 = [12, 32, 70]
    d2 = [81, 55, 21]

    df1 = pd.DataFrame({'title': a1, 'col1': b1, 'col2': c1, 'col3': d1})
    df2 = pd.DataFrame({'title': a2, 'col1': b2, 'col2': c2, 'col3': d2})

    cols = ['col1', 'col2', 'col3']
    for c in cols:
        # Build one table per value column. Rows are collected in a list
        # because DataFrame.append was removed in pandas 2.0.
        rows = []
        for _, r1 in df1.iterrows():      # every row of df1 ...
            for _, r2 in df2.iterrows():  # ... paired with every row of df2
                rows.append({'title_df1': r1['title'],
                             'title_df2': r2['title'],
                             'score': r1[c] - r2[c]})
        res_df = pd.DataFrame(rows)
        print(res_df)
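
The nested loops above pair every row of df1 with every row of df2. As a possible alternative, the same pairing can be sketched without explicit loops using a cross join. This assumes pandas 1.2 or later, where `merge` accepts `how='cross'`. The `_df1`/`_df2` suffixes and the `scores` dictionary are illustrative names, not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({'title': ['apple', 'pear', 'banana'],
                    'col1': [56, 32, 23],
                    'col2': [12, 34, 90],
                    'col3': [87, 65, 23]})
df2 = pd.DataFrame({'title': ['carrot', 'pumpkin', 'sprouts'],
                    'col1': [16, 12, 93],
                    'col2': [12, 32, 70],
                    'col3': [81, 55, 21]})

cols = ['col1', 'col2', 'col3']

# Cross join: each row of df1 is combined with each row of df2.
# Columns present in both frames receive the given suffixes.
pairs = df1.merge(df2, how='cross', suffixes=('_df1', '_df2'))

# One result table per value column, stored under the column's name
# (the dict is an assumption made for this sketch).
scores = {}
for c in cols:
    table = pairs[['title_df1', 'title_df2']].copy()
    table['score'] = pairs[f'{c}_df1'] - pairs[f'{c}_df2']
    scores[c] = table

print(scores['col1'])
```

With 20 rows in each DataFrame this yields 400 rows per table. If a combined total like the score_total in the question is wanted, the three score columns can be added together, since all three tables share the same row order.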