
Find string in data frame and store new values in a new column

I am creating a script that takes a CSV file whose column layout and column names are unknown. However, I know that only one of the columns contains values in which the strings ‘rs’ and ‘del’ appear.

I need to create an extra column (called ‘Type’) and store ‘dbsnp’ in the rows where ‘rs’ was found and ‘deletion’ in the rows where ‘del’ was found. If neither string is found, that row's Type should be left empty.

As an example, I provide this df:

JavaScript

I have been trying things like this:

JavaScript

But one of my problems is that I know neither the name of the column nor its position.

Desired output:

JavaScript

The following for loop solves the problem of the unknown column, but now I need to solve the issue of identifying my string in the value.

How can I use str.contains("rs") in the if condition?

JavaScript

Answer

You can do it without the loop. Here’s one approach: use applymap to search all the columns.

JavaScript

Based on the data in the table, rsAdela contains both rs and del. Since I apply rs first and del second, that row ends up labelled deletion. You can swap the order to decide whether such a row keeps dbsnp or deletion.
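To isolate the ordering effect described here, the following is a minimal sketch in plain Python (no pandas needed). The helper name `classify` and its `rules` parameter are hypothetical and not part of the original answer; the only value taken from the question is `rsAdela`.

```python
def classify(value, rules=(("rs", "dbsnp"), ("del", "deletion"))):
    """Return the label of the LAST rule whose substring occurs in value.

    `classify` and `rules` are illustrative names, not from the original answer.
    """
    label = ""  # stays empty when no substring matches
    for needle, name in rules:
        if needle in str(value):  # str() so numeric cells can be checked too
            label = name  # a later match overwrites an earlier one
    return label


# rs is checked first and del second, so del wins for a value containing both.
print(classify("rsAdela"))

# Swapping the rule order makes rs win instead.
print(classify("rsAdela", rules=(("del", "deletion"), ("rs", "dbsnp"))))
```

With the default order the first call gives `deletion`; with the swapped order the second call gives `dbsnp`. The match is case-sensitive, so a value such as `RS1` would be left empty by this helper.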

The answer's code processes all the columns, irrespective of dtype.

The output for the data in the question is:

JavaScript
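For readers who want something runnable, here is a hedged sketch of the same idea: search every column for each substring, then fill `Type` in the order rs, then del. It is not the answer's original code. It swaps `applymap` for a per-column `str.contains`, partly because `applymap` is deprecated in recent pandas releases (2.1 and later, as far as I know) in favour of `DataFrame.map`. The column names, the helper `row_contains`, and all values except `rsAdela` are made up for illustration.

```python
import pandas as pd

# Hypothetical data: column names and most values are invented for this sketch.
df = pd.DataFrame({
    "col_a": [1, 2, 3, 4],
    "col_b": ["rs12345", "abc", "del_7", "rsAdela"],
    "col_c": [0.5, 1.5, 2.5, 3.5],
})


def row_contains(frame, needle):
    """Boolean Series: True where any cell in the row contains `needle`."""
    as_text = frame.astype(str)  # treat every column as text, whatever its dtype
    hits = as_text.apply(lambda col: col.str.contains(needle, regex=False))
    return hits.any(axis=1)


# Compute both masks BEFORE adding "Type"; otherwise the new column's own
# "deletion" values would match "del" on a later search.
has_rs = row_contains(df, "rs")
has_del = row_contains(df, "del")

df["Type"] = ""                       # rows with no match stay empty
df.loc[has_rs, "Type"] = "dbsnp"      # applied first
df.loc[has_del, "Type"] = "deletion"  # applied second, so it wins on overlap

print(df)
```

Because the deletion assignment runs last, the `rsAdela` row ends up as `deletion`; swapping the last two assignments would keep it as `dbsnp`.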
User contributions licensed under: CC BY-SA