I have 2 dataframes with different columns:
DF A - DF B -
number | a | b | c |||| a | c | d | e | f
1 | 12 | 13 | 15 |||| 22 | 33 | 44 | 55 | 77
I would like to add the missing columns to both dataframes, so that each one has its own columns plus the other dataframe’s columns (except the column “number”). The new columns should be filled with an initial value of our choice (let’s say 0).
So the final output:
DF A -
number | a | b | c | d | e | f
1 | 12 | 13 | 15 | 0 | 0 | 0
DF B -
a | b | c | d | e | f
22 | 0 | 33 | 44 | 55 | 77
What’s the best way to achieve this result? I got mixed up getting the columns and trying to create the new ones.
Thanks!
Answer
First, build the superset of the columns that appear in either dataframe, leaving out "number" so that it is not added to B:
all_columns = list(set(A.columns.to_list() + B.columns.to_list()) - {"number"})
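Because a `set` does not preserve order, the columns collected this way can come out in an arbitrary order. A minimal sketch of an order-preserving variant, assuming the frames are named `A` and `B` as in this answer and rebuilding them from the question's sample values; `shared_columns` is an illustrative name:

```python
import pandas as pd

# Sample frames built from the values in the question.
A = pd.DataFrame({"number": [1], "a": [12], "b": [13], "c": [15]})
B = pd.DataFrame({"a": [22], "c": [33], "d": [44], "e": [55], "f": [77]})

# dict.fromkeys removes duplicates but keeps first-seen order, unlike set().
all_columns = list(dict.fromkeys(A.columns.to_list() + B.columns.to_list()))

# Leave the "number" column out of the shared list, as the question asks.
shared_columns = [col for col in all_columns if col != "number"]
print(shared_columns)
```

With this variant the later steps would use `shared_columns` in place of `all_columns`.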
Then, for each dataframe, work out which of those columns it is missing:
col_missing_from_A = [col for col in all_columns if col not in A.columns]
col_missing_from_B = [col for col in all_columns if col not in B.columns]
Then add the missing columns to each dataframe, filled with the initial value (0 here):
A[col_missing_from_A] = 0
B[col_missing_from_B] = 0
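As an alternative sketch, `DataFrame.reindex` with `fill_value` can add the missing columns and settle the column order in one step. The frames below are rebuilt from the question's sample values, and the names `fill`, `shared`, `A_full` and `B_full` are illustrative, not part of the original code:

```python
import pandas as pd

# Sample frames built from the values in the question.
A = pd.DataFrame({"number": [1], "a": [12], "b": [13], "c": [15]})
B = pd.DataFrame({"a": [22], "c": [33], "d": [44], "e": [55], "f": [77]})

fill = 0  # initial value for the newly added columns

# Union of the column labels (Index.union sorts them), minus the "number" column.
shared = A.columns.union(B.columns).drop("number")

# reindex returns a new frame: existing columns keep their values,
# columns that did not exist are created and filled with `fill`.
A_full = A.reindex(columns=["number", *shared], fill_value=fill)
B_full = B.reindex(columns=shared, fill_value=fill)

print(A_full)
print(B_full)
```

Unlike the in-place assignment above, this leaves `A` and `B` untouched and returns new dataframes, which may or may not be what you want.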
Hope this solves your query!