Subset dataframe based on integer in column name

Question

I have a dataframe that has names such as these for its columns:

In Python, how can I return a list of only the dataframe columns where the number after the first underscore is greater than 20?

Accepted Answer

We can use a list comprehension with basic string splitting logic:

    column_names = ["c_12_2_heart", "c_29_4_lung", "c_21_21_stomach", "c_2_25_bladder", "c_40_1_kidney"]
    output = [x for x in column_names if int(x.split("_")[1].split("_")[0]) > 20]
    print(output)  # ['c_29_4_lung', 'c_21_21_stomach', 'c_40_1_kidney']
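If some columns might not follow the pattern (no underscore, or a non-numeric second field), the one-liner would raise an error. Here is a minimal sketch of a more defensive variant. The helper name `columns_above` and its `threshold` parameter are hypothetical, introduced only for illustration, and the two extra names `id` and `patient_name` are made up to show the skipping behavior.

```python
def columns_above(names, threshold=20):
    """Return names whose integer after the first underscore exceeds threshold.

    Hypothetical helper for illustration; names that do not match the
    pattern are skipped instead of raising an error.
    """
    selected = []
    for name in names:
        parts = name.split("_")
        try:
            # parts[1] is the field right after the first underscore.
            value = int(parts[1])
        except (IndexError, ValueError):
            # No second field, or the second field is not an integer.
            continue
        if value > threshold:
            selected.append(name)
    return selected


column_names = [
    "c_12_2_heart", "c_29_4_lung", "c_21_21_stomach",
    "c_2_25_bladder", "c_40_1_kidney",
    "id", "patient_name",  # made-up names that do not fit the pattern
]
print(columns_above(column_names))
# ['c_29_4_lung', 'c_21_21_stomach', 'c_40_1_kidney']
```

Assuming a pandas dataframe `df` with string column names, the same helper should work on `df.columns` directly, and `df[columns_above(df.columns)]` would then give the subset dataframe the title asks for.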

Answer