New column based on values from other columns AND respecting pre-established rules

I’m looking for an algorithm to create a new column based on values from other columns AND respecting pre-established rules. Here’s an example:

Artificial data:


The goal is to create a new_column based on the values of col_1, col_2, and col_3. The rules are:

  • If the value ‘Yes’ is present in any of the columns, the value of new_column will be ‘Yes’;
  • If the value ‘Yes’ is not present in any of the columns, but the value ‘No’ is present, then the value of new_column will be ‘No’;
  • If both ‘Yes’ and ‘No’ are absent, then the value of new_column will be ‘Unknown’.
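The three rules above can be sketched as a plain Python function that takes the values of one row, however many columns there are. The function name `combine` and the sample rows are made up for illustration; they are not from the original data.

```python
def combine(values):
    """Apply the three rules to the values of a single row."""
    values = list(values)
    if "Yes" in values:   # rule 1: any 'Yes' wins
        return "Yes"
    if "No" in values:    # rule 2: no 'Yes', but at least one 'No'
        return "No"
    return "Unknown"      # rule 3: neither 'Yes' nor 'No' is present


# Hypothetical rows standing in for (col_1, col_2, col_3).
rows = [
    ["Yes", "No", "Unknown"],
    ["No", "Unknown", "No"],
    ["Unknown", "Unknown", "Unknown"],
]
results = [combine(row) for row in rows]
print(results)  # ['Yes', 'No', 'Unknown']
```

Because the function only checks membership, the number of columns does not matter.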

I managed to implement this either with case_when(), spelling out every possible combination, or with a sequence of nested ifelse() calls. But neither solution scales to N variables.

Current solution:


I’m looking for an algorithm that does this faster and can be extended to N variables.

After searching Stack Overflow, I couldn’t find a solution to my problem (I know there are several posts about creating a new column based on values from other columns, but none covered this case). Perhaps my search strategy was not the best. If anyone finds a relevant post, please provide the link.

I used R in the code, but the current solution also works in Python using np.where. Solutions in R or Python are welcome.
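For reference, a sequential np.where version for three columns might look roughly like the sketch below. This is an assumption about the general shape of such a solution, not the original code, and the sample arrays are invented.

```python
import numpy as np

# Hypothetical sample data; the names follow the columns in the question.
col_1 = np.array(["Yes", "No", "Unknown", "Unknown"])
col_2 = np.array(["No", "No", "Unknown", "Yes"])
col_3 = np.array(["Unknown", "Unknown", "Unknown", "No"])

# Every column has to be named explicitly in each condition,
# so the expression grows with each variable that is added.
new_column = np.where(
    (col_1 == "Yes") | (col_2 == "Yes") | (col_3 == "Yes"),
    "Yes",
    np.where(
        (col_1 == "No") | (col_2 == "No") | (col_3 == "No"),
        "No",
        "Unknown",
    ),
)
print(new_column.tolist())  # ['Yes', 'No', 'Unknown', 'Yes']
```

The hard-coded column names in both conditions are what keeps this form from extending to N variables.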

Answer

A solution using Python:

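One way to do this in Python is sketched below with pandas and np.select. The sample DataFrame is invented, and this is not necessarily the code this answer originally contained; it is only meant to show an approach where the column list can grow without touching the logic.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({
    "col_1": ["Yes", "No", "Unknown", "Unknown"],
    "col_2": ["No", "No", "Unknown", "Yes"],
    "col_3": ["Unknown", "Unknown", "Unknown", "No"],
})

cols = ["col_1", "col_2", "col_3"]  # extend this list to cover N columns

# Row-wise flags: does any of the selected columns hold 'Yes' / 'No'?
has_yes = df[cols].eq("Yes").any(axis=1)
has_no = df[cols].eq("No").any(axis=1)

# np.select uses the first condition that is true for each row,
# so 'Yes' takes priority over 'No'; rows matching neither get the default.
df["new_column"] = np.select([has_yes, has_no], ["Yes", "No"], default="Unknown")
print(df)
```

For the four sample rows, new_column comes out as Yes, No, Unknown, Yes. Adding more variables only means adding names to `cols`.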

Output:

User contributions licensed under: CC BY-SA