Return the last non-zero value in a pandas DataFrame

I have a DataFrame:

   col0  col1  col2  col3  col4
0     1     3     6     6     0
1     0     2     8     7     3
2     0     0     4     3     4
3     4     2     2     0     4

The logic is: if col1 is not zero, return col1. If col1 is zero, return col2 (if it is non-zero). If col2 is zero, return col3. We don't need to do anything with col4.

My code is below, but it only returns col1:

def test(df):
        if df['col1'].iloc[0] > 0:
            return df['col1']
        elif df['col1'].iloc[0] == 0 & df['col2'].iloc[0] > 0:
            return df['col2']
        elif df['col2'].iloc[0]  == 0 & df['col3'].iloc[0]  > 0:
            return df['col3']
        else:
            return 0
test(new)

I tried .any() and .all(), but they don't work either. Also, is there any way to make this code more efficient?

Answer
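
As a side note on why the function in the question seems to return only col1: it inspects just the first row via .iloc[0], and in the frame shown that row has col1 = 3, so the first branch fires and the whole column comes back. The later branches also have a precedence problem, because `&` binds more tightly than `==` and `>`. A minimal sketch with plain integers (the values are illustrative, not taken from a specific row):

```python
col1, col2 = 0, 8

# `&` is evaluated before the comparisons, so the line below is parsed as the
# chained comparison  col1 == (0 & col2) > 0,  i.e. (col1 == 0) and (0 > 0).
unparenthesized = col1 == 0 & col2 > 0

# Explicit parentheses (or `and` for scalars) give the intended meaning.
parenthesized = (col1 == 0) and (col2 > 0)

print(unparenthesized, parenthesized)  # False True
```

Even with the parentheses fixed, a scalar if/elif chain only handles one row at a time, which is why a vectorized approach is used here.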

A variation on @ALollz's idea, since DataFrame.lookup is deprecated as of pandas 1.2.0:

import numpy as np
# column position of the first non-zero entry in each row
indices = np.argmax(df.ne(0).values, axis=1)
print(df.values[np.arange(len(df)), indices])

Output

[1 2 4 4]
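
For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the frame from the question and prints the intermediate positions. The variable names mask, first_pos and first_nonzero are my own, not from the original snippet. The trick relies on argmax returning the position of the first True in each row of a boolean array.

```python
import numpy as np
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame(
    {
        "col0": [1, 0, 0, 4],
        "col1": [3, 2, 0, 2],
        "col2": [6, 8, 4, 2],
        "col3": [6, 7, 3, 0],
        "col4": [0, 3, 4, 4],
    }
)

mask = df.ne(0).to_numpy()        # True where the value is non-zero
first_pos = mask.argmax(axis=1)   # position of the first True in each row
# Pair each row number with its column position to pick one value per row.
first_nonzero = df.to_numpy()[np.arange(len(df)), first_pos]

print(first_pos.tolist())      # [0, 1, 2, 0]
print(first_nonzero.tolist())  # [1, 2, 4, 4]
```

One thing to keep in mind: for a row with no non-zero entry, argmax returns position 0, so the value picked is whatever sits in the first column of that row.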

UPDATE

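To exclude the last column and return 0 when every remaining column is zero, use this instead:

mask = df.ne(0).iloc[:, :-1]  # non-zero flags, last column excluded
indices = np.argmax(mask.values, axis=1)  # position of the first non-zero per row
result = np.where(mask.any(axis=1), df.values[np.arange(len(df)), indices], 0)
print(result)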
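
The same idea can be wrapped in a small function. This is only a sketch: the name first_nonzero_excluding_last and the fill parameter are hypothetical additions of mine, and the extra fifth row is made up to exercise the fallback.

```python
import numpy as np
import pandas as pd


def first_nonzero_excluding_last(frame: pd.DataFrame, fill=0) -> np.ndarray:
    """Hypothetical helper: first non-zero value per row, ignoring the last column."""
    values = frame.iloc[:, :-1].to_numpy()
    mask = values != 0
    first_pos = mask.argmax(axis=1)
    picked = values[np.arange(len(values)), first_pos]
    # Rows with no non-zero entry in the considered columns get `fill`.
    return np.where(mask.any(axis=1), picked, fill)


# Frame from the question plus one made-up row whose first four columns are zero.
df = pd.DataFrame(
    [
        [1, 3, 6, 6, 0],
        [0, 2, 8, 7, 3],
        [0, 0, 4, 3, 4],
        [4, 2, 2, 0, 4],
        [0, 0, 0, 0, 9],
    ],
    columns=["col0", "col1", "col2", "col3", "col4"],
)

result = first_nonzero_excluding_last(df)
print(result.tolist())  # [1, 2, 4, 4, 0]
```

Passing a different fill, such as -1, makes the all-zero rows easy to tell apart from rows that genuinely picked a value.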
User contributions licensed under: CC BY-SA