New column based on values from other columns AND respecting pre-established rules

Question

I’m looking for an algorithm to create a new column based on values from other columns AND respecting pre-established rules. Here’s an example (artificial data). The goal is to create a new_column based on the values of col_1, col_2, and col_3. For that, the rules are: If the value ‘Yes’…

Accepted Answer

A solution using Python:

    import pandas as pd

    df = pd.DataFrame({
        'col_1': ['No', 'Yes', 'Yes', 'Yes', 'Yes', 'Yes', 'No', 'No', 'No', 'Unknown'],
        'col_2': ['Yes', 'Yes', 'Unknown', 'Yes', 'Unknown', 'No', 'Unknown', 'No', 'Unknown', 'Unknown'],
        'col_3': ['Unknown', 'Yes', 'Yes', 'Unknown', 'Unknown', 'No', 'No', 'Unknown', 'Unknown', 'Unknown']
    })

    # For each row: 'Yes' if any column is 'Yes', else 'No' if any column is 'No', else 'Unknown'.
    df['col_4'] = [
        ('Yes' if 'Yes' in x else ('No' if 'No' in x else 'Unknown'))
        for x in zip(df['col_1'], df['col_2'], df['col_3'])
    ]

    print(df)

Output:

         col_1    col_2    col_3    col_4
    0       No      Yes  Unknown      Yes
    1      Yes      Yes      Yes      Yes
    2      Yes  Unknown      Yes      Yes
    3      Yes      Yes  Unknown      Yes
    4      Yes  Unknown  Unknown      Yes
    5      Yes       No       No      Yes
    6       No  Unknown       No       No
    7       No       No  Unknown       No
    8       No  Unknown  Unknown       No
    9  Unknown  Unknown  Unknown  Unknown
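Since the question asks for something that extends to N variables, here is a minimal sketch of the same idea written as a priority lookup. It assumes the rules are the ones the code above implements ('Yes' wins over 'No', which wins over 'Unknown'); the rule list in the question is cut off, so treat that ordering as an assumption. The names resolve_row and resolve_columns are made up for this illustration and are not part of pandas.

```python
def resolve_row(values, priority=("Yes", "No"), default="Unknown"):
    """Return the first label in `priority` found among `values`, else `default`.

    Assumption: the rules reduce to a strict priority order of labels,
    as in the accepted answer ('Yes' > 'No' > 'Unknown').
    """
    present = set(values)
    for label in priority:
        if label in present:
            return label
    return default


def resolve_columns(columns, priority=("Yes", "No"), default="Unknown"):
    """Apply resolve_row row by row across any number of equal-length columns."""
    return [resolve_row(row, priority, default) for row in zip(*columns)]


# Small made-up sample; each list plays the role of one column.
col_1 = ["No", "Yes", "No", "Unknown"]
col_2 = ["Yes", "No", "Unknown", "Unknown"]
col_3 = ["Unknown", "No", "No", "Unknown"]

result = resolve_columns([col_1, col_2, col_3])
print(result)
```

With a DataFrame, the same helper could be called as df['new_column'] = resolve_columns([df[c] for c in cols]), where cols is whatever list of column names you need. Adding a column or a new label then only means changing cols or priority, not enumerating combinations.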

artificial data

The goal is to create a new_column based on the values of col_1, col_2, and col_3. For that, the rules are:

I managed to implement this either with case_when(), describing every possible combination, or with a sequence of ifelse() calls. But neither solution scales to N variables.

I’m looking for an algorithm that does this faster and can be extended to N variables.

Answer