
Pandas: how to merge list columns that contain NaN?

I want to merge several columns whose cells hold list objects into a single column.

The catch is that duplicate elements need to be removed.

How can I get a column with the merged lists, like the one below?

Source:

  col_0   col_a   col_b   col_c
0    aa     [1]     NaN  [2, 3]
1    bb  [a, b]  [b, c]     [c]
2    cc     NaN     NaN     NaN

Expected:

  col_0   col_a   col_b   col_c merged_a_to_c
0    aa     [1]     NaN  [2, 3]     [1, 2, 3]
1    bb  [a, b]  [b, c]     [c]     [a, b, c]
2    cc     NaN     NaN     NaN           NaN

Answer

import numpy as np

def merge(df):
    merged_a_to_c = []
    for row in range(len(df)):
        merge_tmp = []  # unique elements for this row, in first-seen order
        for col in range(len(df.columns)):
            cell = df.iloc[row, col]
            # Only list cells contribute; NaN and strings are skipped.
            if isinstance(cell, list):
                for element in cell:
                    if element not in merge_tmp:
                        merge_tmp.append(element)

        # A row with no list in any column gets NaN instead of an empty list.
        merged_a_to_c.append(merge_tmp if merge_tmp else np.nan)

    df['merged_a_to_c'] = merged_a_to_c
    return df

Output:

  col_0   col_a   col_b   col_c merged_a_to_c
0    aa     [1]     NaN  [2, 3]     [1, 2, 3]
1    bb  [a, b]  [b, c]     [c]     [a, b, c]
2    cc     NaN     NaN     NaN           NaN


This works regardless of the DataFrame's size (number of columns or rows), since it checks every cell and only merges the ones that hold a list.


I edited the code because I hadn't realized at first that duplicates needed to be handled.
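
If you only want to merge specific columns, so that a list sitting in some unrelated column isn't picked up, something along these lines might work. This is a minimal sketch, not a drop-in replacement: the helper name merge_lists and the cols list are hypothetical names of mine, and the DataFrame is rebuilt from the sample data in the question.

```python
import numpy as np
import pandas as pd


def merge_lists(cells):
    """Merge the list cells of one row, dropping duplicates in first-seen order."""
    merged = []
    for cell in cells:
        if isinstance(cell, list):  # NaN and other non-list cells are skipped
            for element in cell:
                if element not in merged:
                    merged.append(element)
    # Return NaN rather than an empty list when the row had no lists at all.
    return merged if merged else np.nan


# Sample data from the question.
df = pd.DataFrame({
    'col_0': ['aa', 'bb', 'cc'],
    'col_a': [[1], ['a', 'b'], np.nan],
    'col_b': [np.nan, ['b', 'c'], np.nan],
    'col_c': [[2, 3], ['c'], np.nan],
})

cols = ['col_a', 'col_b', 'col_c']  # hypothetical: only these columns are merged

# itertuples yields each row of the selected columns as a plain tuple.
df['merged_a_to_c'] = [
    merge_lists(row) for row in df[cols].itertuples(index=False, name=None)
]
```

The `element not in merged` check is used instead of a set because it keeps the original order and also works for unhashable elements. It is slower on very long lists, though.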
