I have two data frames. I want to add the columns from the second data frame to the first one by matching the column information in data frame 1.
The first data frame: dd
JavaScript
x
16
16
1
BCS libcs
2
0 AAA-1 1
3
1 ABA-1 1
4
2 ACD-1 1
5
3 AAT-1 1
6
4 AAR-1 1
7
8
188453 TCC-61 61
9
188454 TTG-61 61
10
188455 CTG-61 61
11
188456 ATG-61 61
12
188457 TTT-61 61
13
14
print(dd.shape)
15
(188458, 2)
16
The second dataframe: df
JavaScript
1
16
16
1
sample_id DState
2
0 H20_AGS AG
3
1 H21_AGS AG
4
2 H22_BGS BG
5
3 H23_AGS AG
6
4 H24_CGS CG
7
..
8
56 H90_DGS DG
9
57 H91_DGS DG
10
58 H92_BGS BG
11
59 H93_BGS BG
12
60 H94_CGS CG
13
14
print(df.shape)
15
(61, 2)
16
There are 61 unique items in column libcs
in dataframe1 and 61 rows in the second data frame. I want to merge based on the matching column information.
Expected Output:
JavaScript
1
13
13
1
BCS libcs sample_id DState
2
0 AAA-1 1 H20_AGS AG
3
1 ABA-1 1 H20_AGS AG
4
2 ACD-1 1 H20_AGS AG
5
3 AAT-1 1 H20_AGS AG
6
4 AAR-1 1 H20_AGS AG
7
8
188453 TCC-61 61 H94_CGS CG
9
188454 TTG-61 61 H94_CGS CG
10
188455 CTG-61 61 H94_CGS CG
11
188456 ATG-61 61 H94_CGS CG
12
188457 TTT-61 61 H94_CGS CG
13
I tried the below code but it seems like it is not working
JavaScript
1
5
1
dd.libcs.value_counts()
2
dd.index = dd.groupby(level=0).cumcount()
3
C = dd.join(df).reset_index(drop=True)
4
print (C)
5
Advertisement
Answer
Use merge
JavaScript
1
14
14
1
>>> dd.merge(df, left_on='libcs', right_on=(df.index+1).astype(str))
2
3
BCS libcs sample_id DState
4
0 AAA-1 1 H20_AGS AG
5
1 ABA-1 1 H20_AGS AG
6
2 ACD-1 1 H20_AGS AG
7
3 AAT-1 1 H20_AGS AG
8
4 AAR-1 1 H20_AGS AG
9
5 TCC-61 61 H94_CGS CG
10
6 TTG-61 61 H94_CGS CG
11
7 CTG-61 61 H94_CGS CG
12
8 ATG-61 61 H94_CGS CG
13
9 TTT-61 61 H94_CGS CG
14