I’d like to find duplicated rows in a Pandas dataframe. When I use df.duplicated()
it returns the following error:
TypeError: unhashable type: ‘list’
To resolve this error, I tried the following:
df2 = df[df.applymap(lambda x: x[0] if isinstance(x, list) else x).duplicated()]
However, I receive a new but similar error: “TypeError: unhashable type: ‘dict'”
Does anyone know how I can use applymap lambda
with two conditions? (the conditions are if isinstance(x, list)
OR if isinstance(x, dict)
)?
UPDATE: Here is the sample of the data (first few rows of the df):
Thank you!
Advertisement
Answer
With the apply
you create a column with dictionary in it, the issue is that the method duplicated
use hashing to compare values and in Python dictionaries (and lists) are not hashable.
But strings are hashable so you could add .astype(str)
:
df2 = df.applymap(lambda x: x[0] if isinstance(x, list) else x).astype(str).duplicated()