Problems Removing Duplicated Words from Pandas Row

I am working on an NLP assignment and having some problems removing duplicated strings from a pandas column.

The data I am using is tagged, so some rows are repeated because the same comment can have multiple tags. To get one row per comment, I grouped the data by ID and Comment and aggregated the tags, like so:

JavaScript
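The original snippet is not shown here. As a rough sketch of the grouping described above, assuming the tag column is called `Tags` and the tags are joined into one comma-separated string (the column name, the separator, and the sample rows are all made up for illustration), it might look something like this:

```python
import pandas as pd

# Hypothetical sample: the same comment appears once per tag.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2],
    "Comment": ["great phone", "great phone", "great phone", "slow delivery"],
    "Tags": ["positive", "product", "positive", "shipping"],
})

# Collapse to one row per (ID, Comment) and join that group's tags
# into a single string. Repeated tags are NOT removed at this point.
grouped = (
    df.groupby(["ID", "Comment"], as_index=False)
      .agg({"Tags": ", ".join})
)

print(grouped)
```

With this sample, the row for ID 1 ends up with "positive" twice in its `Tags` string, which is the kind of duplication the question is about.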

After grouping the data, the tags column contained the same tag two or more times in a single row. I have tried to remove the duplicated tags so that each row keeps only its unique tags, but have not been successful. First, I tried

JavaScript

but it did not remove the duplicated tags. So I wrote a simple function to get the unique tags, but that did not work either. The function is below:

JavaScript
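The asker's function is not shown here. For comparison, here is a minimal sketch of a per-cell helper, assuming each cell holds one comma-separated string of tags (the name `unique_tags` and the separator are hypothetical). It splits the string, drops repeats while keeping first-seen order, and joins the result back together:

```python
def unique_tags(tags: str) -> str:
    """Return the comma-separated tags with repeats removed, order preserved."""
    # Split on commas and trim surrounding whitespace from each tag.
    parts = [tag.strip() for tag in tags.split(",")]
    # dict.fromkeys keeps the first occurrence of each key, in insertion order.
    # Empty pieces (e.g. from a trailing comma) are skipped.
    return ", ".join(dict.fromkeys(part for part in parts if part))


print(unique_tags("positive, product, positive"))
```

A helper like this would be applied cell by cell, for example `df["Tags"] = df["Tags"].apply(unique_tags)`. Row-level methods such as `Series.drop_duplicates` compare whole cell values, so they leave repeats inside a single string untouched.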

Sample data is below:

JavaScript

Answer

Is this what you want?

JavaScript
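The answer's code is not shown here. One possible approach, offered only as a sketch and not necessarily what the answer proposed, is to drop the repeated (ID, Comment, tag) rows before grouping, so the joined string never contains duplicates in the first place. The column name `Tags` and the sample rows are assumptions:

```python
import pandas as pd

# Hypothetical sample: the same comment appears once per tag,
# and one tag ("positive") is repeated for ID 1.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2],
    "Comment": ["great phone", "great phone", "great phone", "slow delivery"],
    "Tags": ["positive", "product", "positive", "shipping"],
})

deduped = (
    # Remove rows that repeat the same tag for the same comment.
    df.drop_duplicates(subset=["ID", "Comment", "Tags"])
      # Then collapse to one row per comment and join the remaining tags.
      .groupby(["ID", "Comment"], as_index=False)
      .agg({"Tags": ", ".join})
)

print(deduped)
```

Deduplicating before the join avoids having to parse the tag string afterwards, but it only works if the un-aggregated data is still available.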
User contributions licensed under: CC BY-SA