Skip to content
Advertisement

Pandas: add column name to a list, if the column contains a specific set of value

I wish to create a new list which contains column names of those columns which have atleast one of the following values.

  1. Most of the time
  2. Quite Often
  3. Less than often
  4. Never

Sample :

df ={‘A’:[‘name1’, ‘name2’, ‘name3’, ‘name4’],

 'B':['Most of the Time', 'Never', 'Quite Often', 'Less than 
        Often'],

 'C':['abc', 'DEF', 'ghi', 'Jkl'],
 'D':['1', '2', '3', '4'],
 'E':['Most of the Time', 'Never', 'Quite Often', 'Less than 
        Often'],
 'F':['Most of the Time', 'Never', 'Quite Often', 'Less than 
        Often']       
           }

h_ls = [‘B’,’E’,’F’]

I tried the following code

df.columns

h_ls=[]

    

for x in df.columns:
    xi = str(x)
    for i in df[x]:
        if i.startswith("Most of the Time") or i.startswith("Quite 
            Often") or i.startswith("Less that Often") or 
            i.startswith("Never"):
            
            h_ls.append(xi)
            break
        else:
            continue

I get an error that says ‘Timestamp’ object has no attribute ‘startswith’

It gets stuck on the first column where the condition is false.

Can anyone tell me where I am committing the mistake or if there exists a better solution? I have dropped the timestamp column from the data frame but still it’s popping up.

Advertisement

Answer

Here is another way:

df.columns[df.isin(accepted_strings).any()].tolist()

Output:

['B', 'E', 'F']
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement