I have the below dataframe:
Col1 Col2 Col3 Col4 Col5 Col6 Col7 1 A T 1 AG NBL NH 2 A T 1 NAG BL NH 3 A M 2 NAG NBL HL 4 NS M 1 NAG BL NH 5 NS T 1 NAG NBL HL 6 A M 2 NAG NBL HL
I want to create a new column based on this conditions:
IF Col5 = 'AG' AND Col2 = 'A' AND Col3 = 'M' AND Col4 = 1 Then *(In the NewColumn)* 'A1' IF Col5 = 'AG' AND Col2 = 'A' AND Col3 = 'M' AND Col4 = 0 Then *(In the NewColumn)* 'A2' IF Col6 = 'BL' AND Col2 = 'A' AND Col3 = 'M' AND Col4 = 1 Then *(In the NewColumn)* 'B1' IF Col6 = 'BL' AND Col2 = 'A' AND Col3 = 'M' AND Col4 = 0 Then *(In the NewColumn)* 'B2' IF Col7 = 'HL' AND Col2 = 'A' AND Col3 = 'M' AND Col4 = 1 Then *(In the NewColumn)* 'H1' IF Col6 = 'HL' AND Col2 = 'A' AND Col3 = 'M' AND Col4 = 0 Then *(In the NewColumn)* 'H2'
I have many different conditions in the original df but I’m trying to create a way just to add this conditions. I’m working with phyton in jupyter
Advertisement
Answer
Since the logic here is pretty complicated, I would suggest putting your conditions inside a function, and using DataFrame.apply() to call that function for every row in your dataset. Here’s how I would translate your example into Pandas:
import pandas as pd df = pd.read_csv("test.csv") def classify(row): Col2 = row["Col2"] Col3 = row["Col3"] Col4 = row["Col4"] Col5 = row["Col5"] Col6 = row["Col6"] Col7 = row["Col7"] if Col5 == 'AG' and Col2 == 'A' and Col3 == 'M' and Col4 == 1: return 'A1' if Col5 == 'AG' and Col2 == 'A' and Col3 == 'M' and Col4 == 0: return 'A2' if Col6 == 'BL' and Col2 == 'A' and Col3 == 'M' and Col4 == 1: return 'B1' if Col6 == 'BL' and Col2 == 'A' and Col3 == 'M' and Col4 == 0: return 'B2' if Col7 == 'HL' and Col2 == 'A' and Col3 == 'M' and Col4 == 1: return 'H1' if Col6 == 'HL' and Col2 == 'A' and Col3 == 'M' and Col4 == 0: return 'H2' # No match return None df["NewCol"] = df.apply(classify, axis=1) print(df)
Note: I tried this function on your dataset, and I get the following result, which is probably not what you want:
Col1 Col2 Col3 Col4 Col5 Col6 Col7 NewCol 0 1 A T 1 AG NBL NH None 1 2 A T 1 NAG BL NH None 2 3 A M 2 NAG NBL HL None 3 4 NS M 1 NAG BL NH None 4 5 NS T 1 NAG NBL HL None 5 6 A M 2 NAG NBL HL None
Inspecting the individual rows, they seem to be correct – none of them follow any of your rules. I’d suggest double checking that your rules are correct.