I have two DataFrames, as below:
    df_name:
       Student_ID  Name         DOB
    0           1  Raju  1993-02-02
    1           2  Indu  1987-01-04
    2           3  Laya  2000-06-24
    df_marks:
        Student_ID  Subject  Int1/40  Int2/40
    0            1      Eng       10       35
    1            1      Tam       30       38
    2            1      Mat       20       30
    3            1      Sci       15       20
    4            2      Eng       35       25
    5            2      Tam       25       15
    6            2      Mat       22       30
    7            2      Sci       29       23
    8            3      Eng       18       17
    9            3      Tam       19       16
    10           3      Mat       27       26
The task is to create the DataFrame below, where the new column is the sum of df_marks['Int1/40'] and df_marks['Int2/40'] over the rows where df_name['Student_ID'] == df_marks['Student_ID']:
       Student_id  Name         DOB  Tam/50
    0           1  Raju  1993-02-02     NaN
    1           2  Indu  1987-01-04     NaN
    2           3  Laya  2000-06-24     NaN
I tried:
    df_out['Tam/50'] = df_marks[['Int1/40','Int2/40']].sum(axis=1).where(df_marks['Subject']==df_out['Student_id'])
But it gives this error:
    ValueError: Can only compare identically-labeled Series objects
Do we have any simple way to do this?
Regards, Deepak Dash
Answer
Use DataFrame.join with an aggregated sum to add the new column to df_name:
    # Row-wise total of the two internal marks (helper column on df_marks)
    df_marks['Tam/50'] = df_marks[['Int1/40','Int2/40']].sum(axis=1)
    # Sum the helper column per student, then join the result on Student_ID
    df_name = df_name.join(df_marks.groupby('Student_ID')['Tam/50'].sum(), on='Student_ID')
    print(df_name)

       Student_ID  Name         DOB  Tam/50
    0           1  Raju  1993-02-02     198
    1           2  Indu  1987-01-04     204
    2           3  Laya  2000-06-24     123
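To try the join approach end to end, the two frames from the question can be rebuilt by hand. This is only a sketch under a few assumptions: the values are copied from the printed tables, DOB is stored as plain strings (the question does not show the dtype), and `totals` and `df_out` are names chosen here for illustration.

```python
import pandas as pd

# Rebuild the question's frames; DOB as strings is an assumption.
df_name = pd.DataFrame({
    "Student_ID": [1, 2, 3],
    "Name": ["Raju", "Indu", "Laya"],
    "DOB": ["1993-02-02", "1987-01-04", "2000-06-24"],
})

df_marks = pd.DataFrame({
    "Student_ID": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
    "Subject": ["Eng", "Tam", "Mat", "Sci", "Eng", "Tam", "Mat", "Sci",
                "Eng", "Tam", "Mat"],
    "Int1/40": [10, 30, 20, 15, 35, 25, 22, 29, 18, 19, 27],
    "Int2/40": [35, 38, 30, 20, 25, 15, 30, 23, 17, 16, 26],
})

# Row-wise total of both internals, then one total per student.
df_marks["Tam/50"] = df_marks[["Int1/40", "Int2/40"]].sum(axis=1)
totals = df_marks.groupby("Student_ID")["Tam/50"].sum()

# join(on=...) matches df_name['Student_ID'] against the index of `totals`.
df_out = df_name.join(totals, on="Student_ID")
print(df_out)
```

Because `totals` is indexed by Student_ID, the join matches on key values rather than on row position, so the two frames do not need the same length.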
Or a solution without the helper column:
    # Row-wise totals, summed per student, named after the new column
    s = (df_marks[['Int1/40','Int2/40']].sum(axis=1)
                .groupby(df_marks['Student_ID'])
                .sum()
                .rename('Tam/50'))

    df_name = df_name.join(s, on='Student_ID')
    print(df_name)

       Student_ID  Name         DOB  Tam/50
    0           1  Raju  1993-02-02     198
    1           2  Indu  1987-01-04     204
    2           3  Laya  2000-06-24     123
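As for the error in the question: `==` between two Series requires both to carry identical index labels, and df_marks (11 rows) and df_out (3 rows) do not. The sketch below reproduces the same message on a hypothetical miniature; the Series `subjects` and `student_ids` are made up for illustration and are not the original data.

```python
import pandas as pd

# Hypothetical stand-ins: two Series whose indexes differ in length.
subjects = pd.Series(["Eng", "Tam", "Mat", "Sci"])  # index 0..3
student_ids = pd.Series([1, 2, 3])                  # index 0..2

try:
    # The == operator refuses to compare Series with different labels.
    subjects == student_ids
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The attempt also compares the Subject column with student IDs, which would likely never match even if the indexes lined up. Joining on Student_ID, as in the solutions above, sidesteps both problems.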