Skip to content
Advertisement

Python merging csv files problem coping and dupplicated cells-parts

I have the following two CSV files:

file1.csv

val   calc_1
20    1
20    2
20    3
20    4
10    5
10    6
10    7
5     8
5     9

and the second file2.csv

val   calc_2
20    12
20    22
20    32
20    42
10    52
10    62
10    72
5     82
5     92

and i would like to have them merged like this:

val   calc_1   calc_2
20    1        12
20    2        22
20    3        32
20    4        42
10    5        52
10    6        62
10    7        72
5     8        82
5     9        92

my code that I wrote is:

    a = pd.read_csv('file1.csv')
    b = pd.read_csv('file2.csv')
    merged = a.merge(b, on='val')
    merged.to_csv("merged.csv", index=False)

but when I do this I get for each calc_1 point the whole dataset copied like this:

val   calc_1   calc_2
20    1        12
20    1        22
20    1        32
20    1        42
10    1        52
10    1        62
10    1        72
5     1        82
5     1        92

20    2        12
20    2        22
20    2        32
20    2        42
10    2        52
10    2        62
10    2        72
5     2        82
5     2        92
...

what am I doing wrong?

Advertisement

Answer

Here is a solution you can try out, using pd.concat

import pandas as pd

pd.concat([df1.set_index('val'), df2.set_index('val')], axis=1).reset_index()

   val  calc_1  calc_2
0   20       1      12
1   20       2      22
2   20       3      32
3   20       4      42
4   10       5      52
5   10       6      62
6   10       7      72
7    5       8      82
8    5       9      92
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement