
Python CSVkit compare CSV files

I have two CSV files that look like this:

CSV1

reference  |  name  |  house
----------------------------
2348A      |  john  |  37
5648R      |  bill  |  3
RT48       |  kate  |  88
76A        |  harry |  433

CSV2

reference
---------
2348A
76A

Using Python and CSVkit, I am trying to create an output CSV containing only the rows of CSV1 whose reference also appears in CSV2. Does anybody have an example they can point me to?


Answer

I would recommend using pandas for this. Here is how simple it is, assuming your two CSV files look like this:

CSV1

reference,name,house
2348A,john,37
5648R,bill,3
RT48,kate,88
76A,harry ,433

CSV2

reference
2348A
76A

Code

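import pandas as pd

# Load both files (adjust the paths to wherever your CSVs live)
df1 = pd.read_csv(r'd:\temp\data1.csv')
df2 = pd.read_csv(r'd:\temp\data2.csv')

# Inner join on 'reference' keeps only the CSV1 rows whose reference is also in CSV2
df3 = pd.merge(df1, df2, on='reference', how='inner')
df3.to_csv('output.csv')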

output.csv

,reference,name,house
0,2348A,john,37
1,76A,harry ,433
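
Note that the output above carries a leading index column (0, 1). If you would rather not have it, a variation along these lines might help. This is only a sketch: it builds the sample data in memory with io.StringIO instead of reading real files, and it swaps the merge for a boolean filter with isin, which is my own substitution rather than part of the code above. The names csv1, csv2, matches and result_csv are made up for illustration.

```python
import io

import pandas as pd

# Sample data copied from the question; in practice these would be file paths.
csv1 = io.StringIO(
    "reference,name,house\n"
    "2348A,john,37\n"
    "5648R,bill,3\n"
    "RT48,kate,88\n"
    "76A,harry,433\n"
)
csv2 = io.StringIO("reference\n2348A\n76A\n")

df1 = pd.read_csv(csv1)
df2 = pd.read_csv(csv2)

# Keep only the rows of df1 whose reference also appears in df2.
# Unlike merge, isin never duplicates rows when df2 repeats a reference.
matches = df1[df1["reference"].isin(df2["reference"])]

# index=False leaves out the leading 0, 1, ... index column.
# With no path given, to_csv returns the CSV text as a string.
result_csv = matches.to_csv(index=False)
print(result_csv)
```

To write a file instead of a string, pass a path, e.g. matches.to_csv('output.csv', index=False). The isin filter is probably the safer choice when CSV2 may contain repeated references or extra columns you do not want in the result. The merge is the better fit when you do want columns from both files.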
User contributions licensed under: CC BY-SA