I’m trying to split my data by different labels, like this:
dfa = df_a[((df_a['label'] == 0) | (df_a['label'] == 15) | (df_a['label'] == 16))]
This works fine for a handful of values. However, I want to do this for many values, for example:
to_train = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 17, 18, 19, 20)  # this can change
dfb = [i for i in to_train if df_b['label'] == i]  # ValueError
This spits out an error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I’ve read the other questions about this error, but I am already using bitwise operators, and from what I understand they don’t address filtering on many conditions.
How do I split the dataframe based on what’s in the tuple/list/etc?
Answer
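Use isin to build the boolean mask from the whole tuple at once, and index the same dataframe the mask was built from:
to_train = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 17, 18, 19, 20)
dfb = df_b[df_b['label'].isin(to_train)]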
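Here is a minimal runnable sketch of the isin approach. The contents of df_b and the value column are made up for illustration; only the label column and to_train come from the question. The list comprehension in the question fails because df_b['label'] == i is a whole Series of booleans, and a plain Python if cannot reduce that to a single True or False. isin returns one boolean per row instead, which is the form that indexing expects.

```python
import pandas as pd

# Hypothetical sample data; only the 'label' column name comes from the question.
df_b = pd.DataFrame({
    "label": [0, 1, 2, 15, 16, 17, 20, 3],
    "value": list("abcdefgh"),
})

to_train = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 17, 18, 19, 20)

# One boolean per row: True where the label is in to_train.
mask = df_b["label"].isin(to_train)

dfb = df_b[mask]    # rows whose label is in to_train
rest = df_b[~mask]  # all remaining rows (here labels 0, 15 and 16)

print(dfb["label"].tolist())
print(rest["label"].tolist())
```

Negating the mask with ~ gives the other half of the split, so if the two label groups are meant to be complementary you may not need to spell out the 0/15/16 condition by hand.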