
Read data from Excel and search it in df, TypeError: 'in <string>' requires string as left operand, not float

I have read a lot about this error, but I couldn't find a solution that works for me.

I have an Excel file with 3 columns in which I store keywords. I want to read these keywords and search for them in a pandas DataFrame. The code below gives me an error:

    # Error 
    if Keywords_EKN[y] in df.iloc[x, 12]:
    TypeError: 'in <string>' requires string as left operand, not float

The Code:

    df_Dienstleister = pd.read_excel('Dienstleister.xlsx', header=None)
    Keywords_Dritte = df_Dienstleister.values.T[0].tolist()
    Keywords_EDT = df_Dienstleister.values.T[1].tolist()
    Keywords_EKN = df_Dienstleister.values.T[2].tolist()

    # Search for Keywords in df and replace some new data
    # There is another Excel in df
    for x in range(0, rows-1):
        for y in range(0, number_of_Keywords_EKN):
            if Keywords_EKN[y] in df.iloc[x, 12]:
                df.iloc[x, 13] = "EKN"
        for z in range(0, number_of_Keywords_EDT):
            if Keywords_EDT[z] in df.iloc[x, 12]:
                df.iloc[x, 13] = "EDT"
        for w in range(0, number_of_Keywords_Dritte):
            if Keywords_Dritte[w] in df.iloc[x, 12]:
                df.iloc[x, 13] = "Dritte"

But when I read just one column from Excel and hard-code the other keywords in the code, it works fine (I have more keywords in EKN and EDT; this is just to show my problem):

    Keywords_Dritte = df_Dienstleister.values.T[0].tolist()
    Keywords_EKN = ['EKN']
    Keywords_EDT = ['EDT']

The output of print(Keywords_EKN[y]) is:

    EKN
    nan

I don't know what the problem is. Thanks for any help.


Answer

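Your Keywords_EKN list contains np.nan, which is a float. This happens when the Excel columns have different lengths: pandas fills the empty cells of the shorter columns with NaN. The same error occurs for any other non-string value on the left-hand side of in when the right-hand side is a string. You can reproduce the error with code like this: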

    import numpy as np
    import pandas as pd

    kw = ['EKN', np.nan]  # or 2, 2.3, ... any non-string value
    df = pd.DataFrame({'vals': ["EKN", "KNE", "xs"]})

    # The first keyword matches; the second one (np.nan) raises the TypeError
    for y in range(0, len(kw)):
        if kw[y] in df.iloc[0, 0]:
            print('found')

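This raises the error because, when the right-hand operand is a string, in requires the left-hand operand kw[y] to be a string as well, but here it is a float. The simplest fix is to convert the keyword to a string:

    if str(kw[y]) in df.iloc[0, 0]:

or, in your case:

    if str(Keywords_EKN[y]) in df.iloc[x, 12]: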

Alternatively, replace or drop the NaN values in the DataFrame right after reading it, as Timus suggested in a comment.
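Dropping the NaN values could look roughly like the sketch below. Note that str(np.nan) is the text 'nan', so the str() conversion makes the loop search for the substring "nan", which can match unrelated text. Removing the empty cells when the keyword lists are built avoids that. The small DataFrame here is a made-up stand-in for the contents of Dienstleister.xlsx, and column_keywords is a hypothetical helper name, not something from the question.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for pd.read_excel('Dienstleister.xlsx', header=None).
# Columns of unequal length are padded with NaN, as in the question.
df_Dienstleister = pd.DataFrame({
    0: ["Dritte", "Extern"],
    1: ["EDT", np.nan],
    2: ["EKN", np.nan],
})

def column_keywords(frame, col):
    """Return the non-empty cells of one column as a list of strings."""
    # dropna() removes the NaN padding; astype(str) guards against
    # numeric cells that would trigger the same TypeError.
    return frame[col].dropna().astype(str).tolist()

Keywords_Dritte = column_keywords(df_Dienstleister, 0)
Keywords_EDT = column_keywords(df_Dienstleister, 1)
Keywords_EKN = column_keywords(df_Dienstleister, 2)

print(Keywords_EKN)  # ['EKN']
```

With lists built this way, the original check if Keywords_EKN[y] in df.iloc[x, 12] no longer sees a float on the left-hand side, assuming the searched cell itself is a string.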

User contributions licensed under: CC BY-SA