How to write a for-loop/if-statement for a dataframe (integer) column

I have a dataframe with a column of integers that represent birth years. Each row contains 20xx or 19xx, but some rows contain only the xx part.

What I want to do is add 19 in front of the numbers that have only two digits if the integer is bigger than 22, and add 20 in front of those that are smaller than or equal to 22 (counting from 0).

This is what I wrote:

for x in DF.loc[DF["Year"] >= 2022]:
  x + 1900
  if:
    x >= 22 
  else:
    x + 2000

You can also change the code completely; I would just like you to explain what exactly your code does.

Thanks to everybody who takes the time to answer this.

Answer

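Instead of iterating through the rows, use where to change the whole column at once. y.where(cond, other) keeps the values of y where cond is True and takes the value from other everywhere else: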

y = df["Year"] # just to save typing
df["Year"] = y.where(y > 99, (y + 1900).where(y > 22, y + 2000))
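To see what the nested where does step by step, here is a small self-contained sketch. The sample values and the intermediate name two_digit_fixed are made up for illustration and are not the asker's data:

```python
import pandas as pd

# Hypothetical sample data: a mix of four-digit and two-digit birth years
df = pd.DataFrame({"Year": [1987, 2004, 95, 23, 22, 5, 0]})

y = df["Year"]
# Inner where: for two-digit values, use 19xx when y > 22, otherwise 20xx
two_digit_fixed = (y + 1900).where(y > 22, y + 2000)
# Outer where: keep values that already have four digits (y > 99),
# otherwise take the corrected two-digit value
df["Year"] = y.where(y > 99, two_digit_fixed)

print(df["Year"].tolist())
# [1987, 2004, 1995, 1923, 2022, 2005, 2000]
```

The inner expression is computed for every row, including the four-digit ones, but the outer where discards those results wherever y > 99.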

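or boolean indexing (this chained form can raise a warning or, depending on the pandas version, fail to modify df, so the loc form below is safer):

df["Year"][df["Year"].between(0, 22)] += 2000
df["Year"][df["Year"].between(23, 99)] += 1900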

or loc:

df.loc[df["Year"].between(0, 22), "Year"] += 2000
df.loc[df["Year"].between(23, 99), "Year"] += 1900
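
A runnable sketch of the loc approach, using the cut-off stated in the question (two-digit values up to and including 22 become 20xx, larger ones become 19xx). The sample values and the mask names is_20xx and is_19xx are assumptions for illustration. Both masks are computed before any assignment, so the second update cannot be affected by the first:

```python
import pandas as pd

# Hypothetical sample data: a mix of four-digit and two-digit birth years
df = pd.DataFrame({"Year": [1987, 2004, 95, 23, 22, 5, 0]})

# between() includes both endpoints by default
is_20xx = df["Year"].between(0, 22)
is_19xx = df["Year"].between(23, 99)

# Update only the selected rows of the "Year" column in place
df.loc[is_20xx, "Year"] += 2000
df.loc[is_19xx, "Year"] += 1900

print(df["Year"].tolist())
# [1987, 2004, 1995, 1923, 2022, 2005, 2000]
```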
User contributions licensed under: CC BY-SA