I’ve got a script that updates 5-10 columns’ worth of data, but sometimes the starting CSV is identical to the final CSV, so instead of writing an identical CSV file I want it to do nothing…
How can I compare two dataframes to check if they’re the same or not?
csvdata = pandas.read_csv('csvfile.csv')
csvdata_old = csvdata

# ... do stuff with csvdata dataframe

if csvdata_old != csvdata:
    csvdata.to_csv('csvfile.csv', index=False)
Any ideas?
Answer
You also need to be careful to create a copy of the DataFrame. Otherwise csvdata_old is just another name for the same object as csvdata, so any in-place change to csvdata shows up in csvdata_old too:
csvdata_old = csvdata.copy()
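To see why the copy matters, here is a minimal sketch. The small DataFrame is a made-up stand-in for the CSV contents, and the names alias and snapshot are only for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the data read from the CSV file.
csvdata = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

alias = csvdata            # plain assignment: a second name for the same object
snapshot = csvdata.copy()  # independent copy (deep by default)

csvdata.loc[0, "a"] = 99   # modify the original in place

# The alias sees the change, the snapshot does not.
alias_value = alias.loc[0, "a"]
snapshot_value = snapshot.loc[0, "a"]
```

With plain assignment there is nothing left to compare against, since both names always show the same data.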
To check whether they are equal, you can use assert_frame_equal as in this answer:
from pandas.testing import assert_frame_equal  # pandas.util.testing in older pandas versions
assert_frame_equal(csvdata, csvdata_old)
You can wrap this in a function with something like:
def frames_equal(df1, df2):
    try:
        assert_frame_equal(df1, df2)
        return True
    except Exception:  # apparently AssertionError doesn't catch everything
        return False
There was discussion of a better way…
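Whatever that discussion settled on, one built-in option in pandas is the DataFrame.equals method, which returns a plain True or False. Below is a rough sketch of the whole "only write if changed" flow using it. The helper name write_if_changed, the temporary file, and the sample data are all assumptions made up for the example, not something from the question:

```python
import os
import tempfile

import pandas as pd


def write_if_changed(new, old, path):
    """Write `new` to `path` only if it differs from `old` (hypothetical helper)."""
    if new.equals(old):
        return False  # nothing changed, leave the file alone
    new.to_csv(path, index=False)
    return True


with tempfile.TemporaryDirectory() as tmp:
    # Create a small sample CSV so the sketch is self-contained.
    path = os.path.join(tmp, "csvfile.csv")
    pd.DataFrame({"a": [1, 2], "b": [3, 4]}).to_csv(path, index=False)

    csvdata = pd.read_csv(path)
    csvdata_old = csvdata.copy()  # snapshot before any changes

    # No edits yet, so nothing should be written.
    wrote_unchanged = write_if_changed(csvdata, csvdata_old, path)

    csvdata.loc[0, "a"] = 99  # stand-in for "do stuff with csvdata"
    wrote_changed = write_if_changed(csvdata, csvdata_old, path)
    first_value = pd.read_csv(path).loc[0, "a"]
```

Note that equals treats NaNs in the same position as equal but expects matching dtypes, so a column that went from int to float would count as a difference even if the numbers look the same.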