Find a substring in cells across multiple columns in a Pandas dataframe

Question

I have a large DataFrame with 50+ columns, which I'm simplifying here below.

I'm trying to find:

a) whether there are any instances of '--->' in any of the cells across the DataFrame
b) if so, where (optional)

So far I've tried 2 approaches. One only works for whole strings, not substrings. With the other I get: (I believe this may only work for

Accepted Answer

You can use .applymap() to test each individual value in a DataFrame.

>>> df
      Name  Age Balance    Country      Currency
0  Samurai   34   777.0  usa--->jp    usd--->yen
1     Jack   31   555.5        usa           usd
2     Mojo   16   488.1        n/a           n/a
3     Jojo   32  119.11  uk--->usa  pound--->usd

>>> df.applymap(lambda x: isinstance(x, str) and '--->' in x)
    Name    Age  Balance  Country  Currency
0  False  False    False     True      True
1  False  False    False    False     False
2  False  False    False    False     False
3  False  False    False     True      True

To use the .str accessor you can:

>>> df.select_dtypes(object).apply(lambda col: col.str.contains('--->'))
    Name  Balance  Country  Currency
0  False    False     True      True
1  False    False    False     False
2  False    False    False     False
3  False    False     True      True

The output differs a little: note the Age column is not there.
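The boolean mask above can be reduced to answer both parts of the question: whether '--->' occurs anywhere, and in which cells. What follows is a minimal sketch rather than part of the original answer. It rebuilds the sample frame with Balance assumed to hold strings, and the names contains_arrow, elementwise, found_any, hits and cols_with_hits are illustrative. DataFrame.applymap was renamed to DataFrame.map in pandas 2.1, so the sketch uses whichever one is available.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the frame in the answer (Balance assumed to be strings).
df = pd.DataFrame({
    "Name": ["Samurai", "Jack", "Mojo", "Jojo"],
    "Age": [34, 31, 16, 32],
    "Balance": ["777.0", "555.5", "488.1", "119.11"],
    "Country": ["usa--->jp", "usa", "n/a", "uk--->usa"],
    "Currency": ["usd--->yen", "usd", "n/a", "pound--->usd"],
})


def contains_arrow(value):
    """Return True only for string cells that contain the substring '--->'."""
    return isinstance(value, str) and "--->" in value


# DataFrame.map replaces applymap from pandas 2.1; fall back on older versions.
elementwise = df.map if hasattr(df, "map") else df.applymap
mask = elementwise(contains_arrow)

# a) Is the substring present anywhere in the frame?
found_any = bool(mask.to_numpy().any())

# b) Where? Collect (row label, column label) pairs for every matching cell.
row_pos, col_pos = np.nonzero(mask.to_numpy())
hits = [(mask.index[r], mask.columns[c]) for r, c in zip(row_pos, col_pos)]

# Columns that contain at least one match.
cols_with_hits = mask.columns[mask.any().to_numpy()].tolist()

print(found_any)
print(hits)
print(cols_with_hits)
```

Testing isinstance(value, str) first means numeric cells such as Age are simply reported as False instead of raising, which is why this variant keeps every column in the result.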

Answer