pandas: manage duplicated sentences on different columns

I have a dataframe as follows:

I want to prepend the value from the first column to a sentence if that sentence is repeated anywhere else in the next three columns. My desired output would be:

col1 | col2 | col3 | col4
1_a | 1_aJoe waited for the train. | the weather is nice | the house looks amazing
2_a | The train was late. | the weather is cold | his profession is unknown
3_a | Mary and Samantha took the bus. | i like going out | it is a beautiful day
4_a | I looked for Mary and Samantha at the bus station | 4_aJoe waited for the train. | we just moved to this house
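
For reference, the input can be rebuilt from the desired output by stripping the col1 prefixes. The sketch below does that and then flags which sentences are repeated. The frame construction and the names `sentence_cols`, `stacked` and `is_repeated` are assumptions made for illustration, not code from the question.

```python
import pandas as pd

# Input frame inferred from the desired output (prefixes removed).
# This reconstruction is an assumption, not the asker's original code.
df = pd.DataFrame({
    "col1": ["1_a", "2_a", "3_a", "4_a"],
    "col2": [
        "Joe waited for the train.",
        "The train was late.",
        "Mary and Samantha took the bus.",
        "I looked for Mary and Samantha at the bus station",
    ],
    "col3": [
        "the weather is nice",
        "the weather is cold",
        "i like going out",
        "Joe waited for the train.",
    ],
    "col4": [
        "the house looks amazing",
        "his profession is unknown",
        "it is a beautiful day",
        "we just moved to this house",
    ],
})

sentence_cols = ["col2", "col3", "col4"]

# Stack the three sentence columns into one long Series indexed by
# (row, column), flag every sentence that occurs more than once,
# then reshape the flags back to the original row/column layout.
stacked = df[sentence_cols].stack()
is_repeated = stacked.duplicated(keep=False).unstack()

print(is_repeated)
```

Only the two cells holding "Joe waited for the train." are flagged, which matches the two prefixed cells in the desired output.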

This is what I have done so far:

The problem is that append adds the sentences to the end of the list, which messes up the original row order of the dataframe (the order required in the desired output).

Answer

You can actually do some fancy numpy broadcasting here.
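
One way the broadcasting idea might look is sketched here. It is not necessarily the answerer's exact code. The input frame is inferred from the desired output in the question, and the names `values`, `flat`, `matches`, `repeated`, `prefixed` and `result` are hypothetical.

```python
import numpy as np
import pandas as pd

# Input frame inferred from the question's desired output (an assumption).
df = pd.DataFrame({
    "col1": ["1_a", "2_a", "3_a", "4_a"],
    "col2": [
        "Joe waited for the train.",
        "The train was late.",
        "Mary and Samantha took the bus.",
        "I looked for Mary and Samantha at the bus station",
    ],
    "col3": [
        "the weather is nice",
        "the weather is cold",
        "i like going out",
        "Joe waited for the train.",
    ],
    "col4": [
        "the house looks amazing",
        "his profession is unknown",
        "it is a beautiful day",
        "we just moved to this house",
    ],
})

sentence_cols = ["col2", "col3", "col4"]
values = df[sentence_cols].to_numpy(dtype=object)   # shape (4, 3)
flat = values.ravel()                               # shape (12,)

# Broadcasting: compare every sentence against every sentence, giving a
# (12, 12) boolean matrix. Each sentence matches itself once, so a row
# sum above 1 means the sentence also appears in some other cell.
matches = flat[:, None] == flat[None, :]
repeated = (matches.sum(axis=1) > 1).reshape(values.shape)

# Broadcast col1 (shape (4, 1)) across the three sentence columns to
# build the prefixed variant of every cell, then pick it only where
# the sentence is repeated. Row order is never changed.
prefixed = df["col1"].to_numpy(dtype=object)[:, None] + values
result = df.copy()
result[sentence_cols] = np.where(repeated, prefixed, values)

print(result)
```

Because the mask has the same shape as the sentence columns, cells are updated in place rather than appended, so the original row order is preserved. The pairwise comparison is quadratic in the number of cells, which is fine for small frames but may be costly for large ones.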

Output:

User contributions licensed under: CC BY-SA