I have a dataframe:
   col0  col1  col2  col3  col4
0     1     3     6     6     0
1     0     2     8     7     3
2     0     0     4     3     4
3     4     2     2     0     4
The logic is: if col1 is not zero, return col1. If col1 is zero, return col2 (if it is non-zero). If col2 is also zero, return col3. We don’t need to do anything for col4.
My code is below, but it only returns col1:
def test(df):
    if df['col1'].iloc[0] > 0:
        return df['col1']
    elif df['col1'].iloc[0] == 0 & df['col2'].iloc[0] > 0:
        return df['col2']
    elif df['col2'].iloc[0] == 0 & df['col3'].iloc[0] > 0:
        return df['col3']
    else:
        return 0

test(new)
I tried .any() and .all(), but they don't work either. Also, is there any way to make this piece of code more efficient?
Answer
A variation on @ALollz's idea, since lookup is deprecated as of pandas 1.2.0:
import numpy as np

# position of the first non-zero value in each row
indices = np.argmax(df.ne(0).values, axis=1)
print(df.values[np.arange(len(df)), indices])
Output
[1 2 4 4]
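For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example frame from the question. The names `mask`, `first_nonzero_pos` and `first_nonzero` are illustrative and not part of the original answer. It relies on `np.argmax` returning the position of the first `True` in each row of a boolean array.

```python
import numpy as np
import pandas as pd

# Example frame copied from the question
df = pd.DataFrame({
    "col0": [1, 0, 0, 4],
    "col1": [3, 2, 0, 2],
    "col2": [6, 8, 4, 2],
    "col3": [6, 7, 3, 0],
    "col4": [0, 3, 4, 4],
})

# True wherever a cell is non-zero
mask = df.ne(0).to_numpy()

# argmax on a boolean array gives the first True position per row
first_nonzero_pos = np.argmax(mask, axis=1)

# Pick one value per row using (row, column) integer indexing
first_nonzero = df.to_numpy()[np.arange(len(df)), first_nonzero_pos]
print(first_nonzero)  # [1 2 4 4]
```

Note that a row with no non-zero value gets position 0 from `argmax`, so the first column's value (which is 0 in that case) is returned.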
UPDATE
To exclude the last column, and return 0 when every remaining column is zero, do this instead:
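# non-zero mask over every column except the last
nonzero = df.ne(0).iloc[:, :-1]
indices = np.argmax(nonzero.values, axis=1)
# axis must be passed by keyword in recent pandas versions
result = np.where(nonzero.any(axis=1), df.values[np.arange(len(df)), indices], 0)
print(result)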
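As a rough, self-contained sketch of the same idea, the snippet below wraps it in a helper and adds one extra row whose only non-zero value is in the last column, to show the fallback to 0. The helper name `first_nonzero_excluding_last` and the extra row are assumptions made for illustration. They do not come from the original question or answer.

```python
import numpy as np
import pandas as pd


def first_nonzero_excluding_last(df):
    """Return the first non-zero value per row, ignoring the last column.

    Rows with no non-zero value in the considered columns yield 0.
    (Hypothetical helper, written for illustration only.)
    """
    nonzero = df.ne(0).iloc[:, :-1]                # mask without the last column
    pos = np.argmax(nonzero.to_numpy(), axis=1)    # first True per row
    picked = df.to_numpy()[np.arange(len(df)), pos]
    return np.where(nonzero.any(axis=1), picked, 0)


# Question's frame plus one made-up row that is zero except in col4
df = pd.DataFrame({
    "col0": [1, 0, 0, 4, 0],
    "col1": [3, 2, 0, 2, 0],
    "col2": [6, 8, 4, 2, 0],
    "col3": [6, 7, 3, 0, 0],
    "col4": [0, 3, 4, 4, 5],
})

result = first_nonzero_excluding_last(df)
print(result)  # [1 2 4 4 0]
```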