I would like to join two files by key but I would like some columns to be joined together for example:
File1:
    df1 = pd.DataFrame({'List' : ['P111', 'P999', 'P111;P999;P777', 'P555', 'P666;P111;P333'],
               'Color' : ['red', 'red', 'blue','yellow', 'red']})
        List        Color
0  P111              red
1  P999              red
2  P111;P999;P777    blue
3  P555              yellow
4  P666;P111;P333    red
File2:
    df2 = pd.DataFrame({'Cod' : ['P111', 'P222', 'P333', 'P444', 'P555', 'P666', 'P777'],
               'Animal' : ['DOG', 'CAT', 'BUG','SNAKE,DOG', 'CAT,BUG', 'DOG', 'SNAKE'],
               'Letter' : ['A,F', 'C', 'S,M', 'F,L', 'C,A','M,C', 'Z,L']})
    Cod     Animal     Letter
 0  P111    DOG         A,F
 1  P222    CAT         C
 2  P333    BUG         S,M
 3  P444    SNAKE,DOG   F,L
 4  P555    CAT,BUG     C,A
 5  P666    DOG         M,C
 6  P777    SNAKE       Z,L
I would like to merge with primary key List (from file1) and Cod (from file2), to get:
List Color Animal Letter 0 P111 red DOG,FROG A,F 1 P999 red - - 2 P111;P999;P777 blue DOG,FROG|-|SNAKE A,F|-|Z,L 3 P555 yellow CAT,BUG C,A 4 P666;P111;P333 red DOG|DOG,FROG|BUG M,C|A,F|S,M
I think we need something like a left join and an agragation but I don’t know how. In the final table I would like that in the values not found there was a – . While I would like a | to separate values in “aggregated” columns
Which is the best way?
Advertisement
Answer
Idea is use DataFrame.explode by splitted values of List, then use left join and aggregate first values with join for expected ouput:
df = (df1.assign(Cod = df1['List'].str.split(';'))
         .explode('Cod')
         .reset_index()
         .merge(df2, how='left', on='Cod')
         .fillna('-')
         .groupby('index')
         .agg(List=('List','first'),
              Color=('Color','first'),
              Animal=('Animal','|'.join),
              Letter=('Letter','|'.join))
         .rename_axis(None))
print (df)
             List   Color       Animal       Letter
0            P111     red          DOG          A,F
1            P999     red            -            -
2  P111;P999;P777    blue  DOG|-|SNAKE    A,F|-|Z,L
3            P555  yellow      CAT,BUG          C,A
4  P666;P111;P333     red  DOG|DOG|BUG  M,C|A,F|S,M