I have 2 dataframes with different columns:
DF A - DF B - number | a | b | c |||| a | c | d | e | f 1 | 12 | 13 | 15 |||| 22 | 33 | 44 | 55 | 77
I would like to add the missing columns for the 2 dataframes – so each one will have each own columns + the other DFs columns (without column “number”). And the new columns will have initial number for our choice (let’s say 0).
So the final output:
DF A - number | a | b | c | d | e | f 1 | 12 | 13 | 15 | 0 | 0 | 0 DF B - a | b | c | d | e | f 22 | 0 | 33 | 44 | 55 | 77
What’s the best way to achieve this result? I’ve got messed up with getting the columns and trying to create new ones.
Thank!
Advertisement
Answer
First, you need to create a superset of all the columns which are present in both the dataframes. This you can do using the below code.
all_columns = list(set(A.columns.to_list() + B.columns.to_list()))
Then for each dataframes, you need to check which columns are missing that you can do using the below code.
col_missing_from_A = [col for col in all_columns if col not in A.columns] col_missing_from_B = [col for col in all_columns if col not in B.columns]
Then add the missing columns in both the dataframes
A[col_missing_from_A] = 0 A[col_missing_from_B] = 0
Hope this solves your query!