Drop DataFrame rows whose file path does not exist (os.path.exists)


I am trying to drop the rows whose file path doesn’t exist:

data_docs = pd.read_csv('Documents_data.csv')
data_docs.drop(data_docs[os.path.exists(str(data_docs['file path']))].index, inplace=True)

Error:

KeyError: False

Answer

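As written, os.path.exists is called once on the string representation of the whole column, not element by element. It returns a single False, and data_docs[False] then looks for a column named False, which raises KeyError: False. One way to check each path individually is to use apply: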

exists = data_docs["file path"].apply(os.path.exists)
data_docs = data_docs[exists]

If you print exists, you will see a boolean Series showing which paths exist and which do not. Alternatively, if you prefer to use drop as in the question, invert the mask first:

exists = ~exists
data_docs.drop(data_docs[exists].index, inplace=True)

With exists inverted, the mask is True for the paths that do not exist, so drop removes exactly those rows.
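To see both variants side by side, here is a minimal self-contained sketch. The temporary directory and the file names report.txt and missing.txt are made up for illustration; only the "file path" column name and the data_docs variable come from the question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical setup: one real temporary file and one path that does not exist.
with tempfile.TemporaryDirectory() as tmp_dir:
    real_path = os.path.join(tmp_dir, "report.txt")
    with open(real_path, "w") as handle:
        handle.write("placeholder")
    missing_path = os.path.join(tmp_dir, "missing.txt")

    data_docs = pd.DataFrame({"file path": [real_path, missing_path]})

    # What the question's code evaluates: a single bool for the whole column,
    # because str() turns the Series into one multi-line string.
    whole_column = os.path.exists(str(data_docs["file path"]))

    # Element-wise check: one bool per row.
    exists = data_docs["file path"].apply(os.path.exists)

    # Variant 1: keep the rows whose path exists.
    kept = data_docs[exists]

    # Variant 2: drop the rows whose path does not exist (same result).
    dropped_variant = data_docs.drop(data_docs[~exists].index)

print(kept)
```

Both variants leave the same rows, so the choice between them is a matter of taste. Boolean indexing avoids the intermediate index lookup and the inplace=True mutation.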



Source: stackoverflow