I am working on an NLP project. My input is:
   index name lst
       0    a   c
       0        d
       0        e
       1        f
       1    b   g
I need output like this:
   index name lst combine
       0    a   c     a c
       0        d     a d
       0        e     a e
       1        f     b f
       1    b   g     b g
How can I achieve this?
Answer
You can use groupby + transform('max') to replace the empty cells with the letter of their group. An empty string sorts before any letter, so the per-group maximum of name is the non-empty value. The rest is a simple string concatenation of the two columns:
df['combine'] = df.groupby('index')['name'].transform('max') + ' ' + df['lst']
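To see why 'max' fills the gaps, here is a minimal sketch of the transform step on its own. The variable names (names, groups, filled) are only illustrative, and it assumes each group holds a single non-empty name:

```python
import pandas as pd

# Illustrative data: one non-empty name per group, the rest empty strings
names = pd.Series(['a', '', '', '', 'b'])
groups = [0, 0, 0, 1, 1]

# '' compares lower than any letter, so the group maximum is the letter
assert '' < 'a'

# transform broadcasts the per-group maximum back to every row of the group
filled = names.groupby(groups).transform('max')
print(filled.tolist())  # ['a', 'a', 'a', 'b', 'b']
```

If a group could contain several different non-empty names, 'max' would keep only the alphabetically last one, so this trick is best suited to data shaped like the example.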
Used input:
import pandas as pd

df = pd.DataFrame({'index': [0,0,0,1,1],
                   'name': ['a','','','','b'],
                   'lst': list('cdefg'),
                  })
NB. I considered “index” to be a column here; if it is the DataFrame index, you should use df.index in the groupby.
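As a rough sketch of that variant, assuming the group labels live in the DataFrame index rather than in a column (the frame below is rebuilt that way purely for illustration):

```python
import pandas as pd

# Same data, but the group labels are the index instead of an 'index' column
df = pd.DataFrame({'name': ['a', '', '', '', 'b'],
                   'lst': list('cdefg')},
                  index=[0, 0, 0, 1, 1])

# Group on the index values, then concatenate exactly as before
df['combine'] = df.groupby(df.index)['name'].transform('max') + ' ' + df['lst']
print(df)
```

Passing level=0 to groupby should be an equivalent way to group on the index.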
Output:
   index name lst combine
0      0    a   c     a c
1      0        d     a d
2      0        e     a e
3      1        f     b f
4      1    b   g     b g