I am trying to calculate the % of men and women in a dataframe column named “gender”.
The “gender” column has dtype object and takes three values: “Man”, “Woman”, and “nan” (NA).
I did this:
total = len(df['gender'])
men = len(df[df['gender'] == "Man"])
women = len(df[df['gender'] == "Woman"])
pct_men = round(men / total * 100, 1)
pct_women = round(women / total * 100, 1)
print(f'{pct_men}%')
print(f'{pct_women}%')
But it prints 0.0% for both.
When I check ‘total’, it returns 10123033, but both ‘men’ and ‘women’ are zero. Thanks.
Answer
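Check whether your dataframe actually contains “man”, “men”, or “Men” instead of “Man” (and likewise for “Woman”). The == comparison is case-sensitive and matches the whole string, so any variant spelling gives a count of zero.
Try this: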
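One way to see which labels the column really contains is to print the distinct values with repr, which exposes different capitalization or stray whitespace. The following is only a minimal sketch: the sample data is made up for illustration, and stripping plus capitalizing is an assumption about what the real column might need.

```python
import pandas as pd

# Hypothetical sample data with inconsistent case and stray whitespace
df = pd.DataFrame({'gender': ['man', 'Woman ', ' Man', None, 'WOMAN']})

# repr() makes leading/trailing spaces and case differences visible
raw_labels = [repr(v) for v in df['gender'].unique()]
print(raw_labels)

# Normalize the labels: remove surrounding whitespace, unify capitalization
cleaned = df['gender'].str.strip().str.capitalize()

# Count each label, keeping missing values as their own row
counts = cleaned.value_counts(dropna=False)
print(counts)
```

If the counts look right after cleaning, the original comparison against "Man" and "Woman" can be run on the cleaned column instead of the raw one.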
import pandas as pd

df = pd.DataFrame({'gender': {0: 'Man', 1: 'Woman', 2: 'Man'}})
total = len(df['gender'])
men = len(df[df['gender'] == "Man"])
women = len(df[df['gender'] == "Woman"])
pct_men = round(men / total * 100, 1)
pct_women = round(women / total * 100, 1)
print(f'{pct_men}%')
print(f'{pct_women}%')
Output:
66.7%
33.3%
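As an alternative to counting with len, pandas can compute the shares directly with value_counts(normalize=True). This is a sketch on made-up data, not the asker's dataframe. Note that the denominator differs depending on whether missing values are kept, so the two results below answer slightly different questions.

```python
import pandas as pd

# Hypothetical sample data including one missing value
df = pd.DataFrame({'gender': ['Man', 'Woman', 'Man', None]})

# Share of each label among all rows, missing values included
# (same denominator as len(df['gender']) in the code above)
pct_all = (df['gender'].value_counts(normalize=True, dropna=False) * 100).round(1)
print(pct_all)

# Share of each label among rows with a known gender only
pct_known = (df['gender'].value_counts(normalize=True) * 100).round(1)
print(pct_known)
```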