Remove empty dataframe from list and drop corresponding name in second list

I have two lists. The first is a list of strings called names, generated from the names of the corresponding csv files.

names = ['ID1','ID2','ID3'] 

I have loaded the csv files into individual pandas dataframes and then done some preprocessing which leaves me with a list of lists, where each element is the data of each dataframe:

dfs = [['car','fast','blue'],[],['red','bike','slow']]

As you can see, a dataframe can end up empty after preprocessing, which leads to an empty list in dfs.

I would like to remove the empty element from this list and get its index. So far I have tried the following, but k comes back empty when I print it.

k = [i for i,x in enumerate(dfs) if not x]

I need this index so that I can then remove the corresponding element from the list names.

The end result would look like this:

names = ['ID1','ID3'] 
dfs = [['car','fast','blue'],['red','bike','slow']]

This way I can then save each individual dataframe as a csv file:

for df, name in zip(dfs, names):
    df.to_csv(name + '_.csv', index=False)

EDIT: I made a mistake: the empty element in the list of lists dfs should be [] rather than [''].

Answer

You can use the built-in any() function:

k = [i for i, x in enumerate(dfs) if not any(x)]

Your original

k = [i for i, x in enumerate(dfs) if not x]

doesn’t work because any non-empty list is truthy, regardless of what it contains. A list holding only an empty string, [''], is still non-empty, so not x is False for it and its index is never collected.

any() takes an iterable and returns True if at least one of its elements is truthy. If no element is truthy, or the iterable is empty, it returns False. An empty string, '', is falsy, so any(['']) is False.
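
To make the difference concrete, here is a small sketch comparing the two checks. The list samples is made up for illustration; it holds one filled list, one list containing only an empty string, and one empty list.

```python
# Hypothetical sample data: a filled list, a list with only '', and an empty list
samples = [['car', 'fast', 'blue'], [''], []]

for x in samples:
    # bool(x) is True for every non-empty list; any(x) needs a truthy element
    print(x, bool(x), any(x))

# "not x" only catches the truly empty list
k_not = [i for i, x in enumerate(samples) if not x]
# "not any(x)" also catches lists whose elements are all falsy
k_any = [i for i, x in enumerate(samples) if not any(x)]

print(k_not)  # [2]
print(k_any)  # [1, 2]
```

Keep in mind that not any(x) would also flag a list such as [0, ''], since every element in it is falsy.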

EDIT: The question was edited, so here is my updated answer:

You can try creating new lists:

names = ['ID1','ID2','ID3'] 
dfs = [['car','fast','blue'],[],['red','bike','slow']]

new_names = list()
new_dfs = list()

for i, x in enumerate(dfs):
    if x:
        new_names.append(names[i])
        new_dfs.append(x)

print(new_names)
print(new_dfs)

Output:

['ID1', 'ID3']
[['car', 'fast', 'blue'], ['red', 'bike', 'slow']]
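
The same filtering can probably be written more compactly by pairing the two lists with zip(). This is only a sketch of an alternative, not a replacement for the loop above; the name kept is introduced here for illustration.

```python
names = ['ID1', 'ID2', 'ID3']
dfs = [['car', 'fast', 'blue'], [], ['red', 'bike', 'slow']]

# Pair each name with its data and keep only the pairs whose data is non-empty
kept = [(name, data) for name, data in zip(names, dfs) if data]

# Split the surviving pairs back into two parallel lists
new_names = [name for name, _ in kept]
new_dfs = [data for _, data in kept]

print(new_names)
print(new_dfs)
```

If the elements of dfs are actual pandas DataFrames rather than plain lists, the test if data raises a ValueError because the truth value of a DataFrame is ambiguous. In that case the condition would need to be if not data.empty instead.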

If it doesn’t work, try adding a print(x) to the loop to see what is going on:

names = ['ID1','ID2','ID3'] 
dfs = [['car','fast','blue'],[],['red','bike','slow']]

new_names = list()
new_dfs = list()

for i, x in enumerate(dfs):
    print(x)
    if x:
        new_names.append(names[i])
        new_dfs.append(x)
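
If you would rather keep the index-based approach from the question and modify the lists in place, a sketch along these lines should work. It assumes both lists have the same length and deletes from the back so that earlier indices stay valid.

```python
names = ['ID1', 'ID2', 'ID3']
dfs = [['car', 'fast', 'blue'], [], ['red', 'bike', 'slow']]

# Indices of the empty entries in dfs
k = [i for i, x in enumerate(dfs) if not x]

# Delete from the highest index down so the remaining indices do not shift
for i in reversed(k):
    del names[i]
    del dfs[i]

print(k)      # [1]
print(names)  # ['ID1', 'ID3']
print(dfs)    # [['car', 'fast', 'blue'], ['red', 'bike', 'slow']]
```
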
User contributions licensed under: CC BY-SA