Pandas fillna based on a condition

I’m still new to pandas, but I have a dataframe in the following format:

and I’m trying to fill all NaN fields in the ‘d_header’ column using the following conditions:

  • the ‘d_header’ column should be set only for rows belonging to the same group
  • the group should be determined by the ‘d_prefix’ value of the row immediately after a non-NaN ‘d_header’ row

So in the following example:

  • row 0: ‘d_header’ == ‘##### MOROCCO #####’
  • row 1: check ‘d_prefix’ and set ‘d_header’ to ‘##### MOROCCO #####’ for all following rows until ‘d_prefix’ changes (then set the value to NaN) or a new ‘d_header’ is found (then start over)

but I’m not having any luck with this approach. Would there be a better way to achieve the same result?

Answer

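  • d_prefix is almost the grouping key you need: the header rows have no prefix, so bfill it first and then use the result in groupby()
  • within each group, the problem is reduced to a simple ffill of d_header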
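
The two steps above can be sketched roughly as follows. The sample frame is an assumption: it is rebuilt from the result table in this answer, keeping only the relevant columns and leaving d_header set on the header rows (0, 3 and 5) only. The name group_key is also just illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame (assumption): reconstructed from the result table,
# with d_header present only on the header rows 0, 3 and 5.
df = pd.DataFrame(
    {
        "d_title": [np.nan, "title1", "title2", np.nan, "title3", np.nan,
                    "title4", "title5", "title6", "title7", "title8", "title9"],
        "d_prefix": [np.nan, "AR", "AR", np.nan, "AR", np.nan,
                     "AR", "AR", "IT", "PL", "UK", "UK"],
        "d_header": ["##### MOROCCO #####", np.nan, np.nan,
                     "##### MOROCCO 2 #####", np.nan,
                     "##### ALGERIA #####", np.nan, np.nan,
                     np.nan, np.nan, np.nan, np.nan],
    }
)

# Step 1: header rows have no d_prefix, so back-fill it. Each header row
# then carries the prefix of the row immediately after it.
group_key = df["d_prefix"].bfill()

# Step 2: forward-fill d_header within each prefix group only, so a header
# is never propagated into rows that have a different prefix.
df["d_header"] = df.groupby(group_key)["d_header"].ffill()

print(df)
```

One caveat with this sketch: groupby() groups by value, not by contiguous run, so if the same prefix showed up again further down without a header of its own, those rows would inherit the last header seen for that prefix. If that can happen in the real data, a key built from consecutive runs of the back-filled prefix would probably be the safer choice.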
    d_title  d_prefix  d_header               d_country  d_subtitles  d_season  d_episode
0   nan      nan       ##### MOROCCO #####    Morocco    nan          nan       nan
1   title1   AR        ##### MOROCCO #####    nan        nan          nan       nan
2   title2   AR        ##### MOROCCO #####    nan        nan          nan       nan
3   nan      nan       ##### MOROCCO 2 #####  Morocco    nan          nan       nan
4   title3   AR        ##### MOROCCO 2 #####  nan        nan          nan       nan
5   nan      nan       ##### ALGERIA #####    Algeria    nan          nan       nan
6   title4   AR        ##### ALGERIA #####    nan        nan          nan       nan
7   title5   AR        ##### ALGERIA #####    nan        nan          nan       nan
8   title6   IT        nan                    nan        nan          nan       nan
9   title7   PL        nan                    nan        nan          1         1
10  title8   UK        nan                    nan        nan          nan       nan
11  title9   UK        nan                    nan        nan          nan       nan
User contributions licensed under: CC BY-SA