How to combine string from one column to another column at same index in pandas DataFrame?

I am working on an NLP project. My input is:

index  name  lst 
0      a     c    
0            d    
0            e    
1            f    
1      b     g   

I need output like this:

index  name  lst combine  
0      a     c    a c 
0            d    a d  
0            e    a e  
1            f    b f  
1      b     g    b g 

How can I achieve this?

Answer

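You can use groupby + transform('max') to replace the empty cells with the letter of their group. This works because, in string comparison, any letter sorts after the empty string, so the maximum per group is the non-empty name. The rest is a simple element-wise string concatenation of the two columns: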

df['combine'] = df.groupby('index')['name'].transform('max') + ' ' + df['lst']

Used input:

import pandas as pd

df = pd.DataFrame({'index': [0,0,0,1,1],
                   'name': ['a','','','','b'],
                   'lst': list('cdefg'),
                  })

NB. I considered "index" to be a column here. If it is actually the DataFrame's index, use df.index in the groupby instead.
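
For the case where "index" is the DataFrame's index rather than a column, a minimal sketch could look like the following. The name df_idx and the way the frame is built are assumptions made for illustration, not part of the question:

```python
import pandas as pd

# Hypothetical variant of the input: the group labels are the DataFrame index
df_idx = pd.DataFrame(
    {"name": ["a", "", "", "", "b"], "lst": list("cdefg")},
    index=[0, 0, 0, 1, 1],
)

# Group on the index values; 'max' picks the non-empty name in each group
filled = df_idx.groupby(df_idx.index)["name"].transform("max")

# Element-wise concatenation, same as in the column-based version
df_idx["combine"] = filled + " " + df_idx["lst"]
```

Because transform returns a Series with the same index as the input, the result lines up row by row with the "lst" column.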

Output:

   index name lst combine
0      0    a   c     a c
1      0        d     a d
2      0        e     a e
3      1        f     b f
4      1    b   g     b g
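
The 'max' approach depends on how strings sort. If you would rather not rely on that, one possible alternative (a sketch, assuming each group has exactly one non-empty name) is to turn the empty strings into missing values and take the first non-missing name per group:

```python
import pandas as pd

df = pd.DataFrame({"index": [0, 0, 0, 1, 1],
                   "name": ["a", "", "", "", "b"],
                   "lst": list("cdefg")})

# Replace empty strings with NaN so they count as missing
name_or_nan = df["name"].mask(df["name"].eq(""))

# 'first' skips missing values, so each row gets its group's name
filled = name_or_nan.groupby(df["index"]).transform("first")

df["combine"] = filled + " " + df["lst"]
```

On the sample input this gives the same "combine" column as the 'max' version.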
User contributions licensed under: CC BY-SA