Creating a column with conditions over multiple rows

I have the following DataFrame:

pd.DataFrame(['a','a','a','a','b','b','b','c','c','a'])

I need to create a new column that reflects where the value in the existing column changes.

The expected result is:

Letter Number
a 1
a 0
a 0
a 0
b 1
b 0
b 0
c 1
c 0
a 1

Every time the letter changes, I need to put a 1; otherwise a 0.

Answer

shift

I’m assuming that df is what the OP provided:

import pandas as pd

df = pd.DataFrame(['a','a','a','a','b','b','b','c','c','a'])

Then assign the first column to a Series named letter and compare it with a shifted copy of itself:

letter = df.iloc[:, 0]

pd.DataFrame({
    'Letter': letter,
    'Number': letter.shift().ne(letter).astype(int)
})

  Letter  Number
0      a       1
1      a       0
2      a       0
3      a       0
4      b       1
5      b       0
6      b       0
7      c       1
8      c       0
9      a       1
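
To make it clearer why this works, here is a minimal sketch that splits the one-liner into its intermediate steps. The variable names previous, changed, number and result are only illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'a'])
letter = df.iloc[:, 0]

# shift() moves every value down one row, so each row now holds the
# previous row's letter; the first row has no predecessor and becomes missing
previous = letter.shift()

# ne() is an element-wise "not equal" comparison; a missing value is
# never equal to a letter, so the first row is flagged as a change too
changed = previous.ne(letter)

# Convert the True/False flags to 1/0
number = changed.astype(int)

result = pd.DataFrame({'Letter': letter, 'Number': number})
print(result)
```

Because the first row is compared against a missing value, it always gets a 1, which matches the expected output in the question.
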
User contributions licensed under: CC BY-SA