Dropping duplicate rows ignoring case (lowercase or uppercase)

I have a DataFrame with one column (Col). I'm trying to remove duplicate records regardless of letter case, for example:

    import pandas as pd

    df = pd.DataFrame({'Col': ['Appliance Identification', 'Natural Language', 'Social networks',
                               'natural language', 'Personal robot', 'Social Networks', 'Natural language']})

output:

Col
0   Appliance Identification
1   Natural Language
2   Social networks
3   natural language
4   Personal robot
5   Social Networks
6   Natural language

Expected Output:

Col
0   Appliance Identification
1   Social networks
2   Personal robot
3   Natural language

How can I drop these duplicates case-insensitively?

Answer

You could use:

    df.groupby(df['Col'].str.lower(), as_index=False, sort=False).first()

output:

                        Col
0  Appliance Identification
1          Natural Language
2           Social networks
3            Personal robot
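
As an alternative sketch (not part of the answer above), a boolean mask built from `Series.str.lower()` and `Series.duplicated()` should give the same rows without grouping. The names `mask` and `result` are only illustrative, and the example assumes the column holds plain strings.

```python
import pandas as pd

df = pd.DataFrame({'Col': ['Appliance Identification', 'Natural Language', 'Social networks',
                           'natural language', 'Personal robot', 'Social Networks', 'Natural language']})

# Compare on a lowercased copy; the original casing in df is left untouched.
# duplicated(keep='first') marks every repeat after the first occurrence as True.
mask = df['Col'].str.lower().duplicated(keep='first')

# Keep the rows that are not repeats and renumber the index from 0.
result = df[~mask].reset_index(drop=True)
print(result)
```

Passing `keep='last'` instead keeps the last spelling of each duplicate rather than the first.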
User contributions licensed under: CC BY-SA