Skip to content
Advertisement

How to use Excel’s SUMIF function in Pandas

I have a difficulty in calculating “total_sum.” If someone didn’t apply to subject, I expressed N/A. When total_sum is calculated, total_sum refer to Standard field and N/A is excluded. I’m not good at Python, So I don’t know how to calculate “total_sum”

            ENG   MATH ART COM  subject_sum  total_sum   
 Standard    10   10   5    5       
   A         10   N/A  N/A  1      11           15
   B         N/A  5    3    5      13           20
   C         3    3    2    2      10           30

Advertisement

Answer

Suppose this dataframe is the same as yours (with index of strings)

dataframe = {'index' :['Standard', 'A', 'B', 'C'],
             'ENG' : [10, 10, np.nan, 3],
             'MATH' : [10, np.nan, 5, 3],
             'ART' : [5, np.nan, 3, 2],
             'COM' : [5, 1, 5, 2]}
df = pd.DataFrame(dataframe).set_index('index').rename_axis(None)
df['subject_sum'] = df.sum(axis=1)
df


            ENG     MATH    ART   COM   subject_sum
Standard    10.0    10.0    5.0   5     30.0
A           10.0    NaN     NaN   1     11.0
B           NaN     5.0     3.0   5     13.0
C           3.0     3.0     2.0   2     10.0

then you can apply .dot() of every .notna() subject values to the values of Standard

standard = df.loc['Standard', ['ENG', 'MATH', 'ART', 'COM']]
df['total_sum'] = df[['ENG', 'MATH', 'ART', 'COM']].notna().dot(standard)
df

result :

            ENG     MATH    ART     COM     subject_sum     total_sum
Standard    10.0    10.0    5.0     5       30.0            30.0
A           10.0    NaN     NaN     1       11.0            15.0
B           NaN     5.0     3.0     5       13.0            20.0
C           3.0     3.0     2.0     2       10.0            30.0
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement