I’m trying to aggregate a pandas DataFrame the way an Excel pivot table would. I have one quantitative variable called “Count”. I would like rows with the same qualitative variables to be combined and their “Count” values summed.
However, when I am trying to do this with the below code, I see that I am somehow losing data. Any idea why this might be happening and how I can fix it?
I expect the number of rows to decrease but the total sum of the “Count” column shouldn’t change.
Answer
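Since you have NaNs in your dataframe, the rows where a grouping column is NaN are dropped by groupby by default, so the “Count” values on those rows never reach the sum. To keep them, either pass dropna=False to groupby (available in pandas 1.1 and later) or fill the missing keys with a placeholder value before grouping.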
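Here is a minimal sketch of the behaviour and two possible fixes. The original code was not shown, so the DataFrame, the column names "Category" and "Region", and the placeholder string "missing" are all made up for illustration; only "Count" comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two qualitative columns with some NaNs, plus "Count".
df = pd.DataFrame({
    "Category": ["a", "a", "b", np.nan],
    "Region": ["x", "x", np.nan, "y"],
    "Count": [1, 2, 3, 4],
})
keys = ["Category", "Region"]  # assumed grouping columns

# Default groupby: rows with NaN in any grouping column are dropped.
default_sum = df.groupby(keys)["Count"].sum()
total_default = default_sum.sum()

# Option 1: keep NaN keys as their own groups (pandas >= 1.1).
kept = df.groupby(keys, dropna=False)["Count"].sum()
total_kept = kept.sum()

# Option 2: replace missing keys with a placeholder before grouping.
filled_df = df.copy()
filled_df[keys] = filled_df[keys].fillna("missing")
filled = filled_df.groupby(keys)["Count"].sum()
total_filled = filled.sum()

print(df["Count"].sum(), total_default, total_kept, total_filled)
```

With this sample data the default groupby total is smaller than the original "Count" total, while both fixes preserve it. Which option fits better depends on whether you want the missing keys to show up as NaN or as a visible label in the result.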
