
CSV comparison with Python MultiIndex

I need to compare two CSV files and write the rows that changed, stayed the same, or were deleted to a third CSV file. The first CSV file looks like this:

JavaScript

The second CSV file:

JavaScript

In the end, this is the result I want to get:

JavaScript

If a new country is added to a siteid, it gets the status "new". A location can have multiple siteids. I want to catch when a new country is added for a specific location and siteid together, using both as the condition rather than just one of them. In the dataset some siteids are NA, which is why I added location here; in those cases the status has to be determined from the location.

Here is my code, but it is not working as I wanted. If you can help me, that would be really great :)

JavaScript
JavaScript
JavaScript


Answer

I'm not yet convinced this can be done exclusively with pandas operators. You do have several problems in your code. `set_index` returns a new data frame; it does not modify the original in place. So you need to assign the result:

JavaScript
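To illustrate the `set_index` point, here is a minimal sketch. The frame and its column names (`location`, `siteid`, `country`, `price`) are made up for illustration, since the asker's real data is not shown here.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions, not the asker's real file.
df1 = pd.DataFrame({
    "location": ["A", "A", "B"],
    "siteid": ["s1", "s1", "s2"],
    "country": ["DE", "FR", "US"],
    "price": [10, 20, 30],
})

keys = ["location", "siteid", "country"]

# Calling set_index and discarding the result leaves df1 unchanged.
df1.set_index(keys)
unchanged_columns = list(df1.columns)

# Assign the result to keep the indexed frame.
df1_indexed = df1.set_index(keys)
indexed_columns = list(df1_indexed.columns)
index_names = list(df1_indexed.index.names)
```

After the assignment, only `price` remains as a regular column and the three key columns form the MultiIndex.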

Once you do that, you don't have to call `set_index` on df3. You really want to add the "status" value to df3, not df3a; after the grouping, df3a no longer looks like what you need. I'm not sure the grouping is really the answer; I'm afraid you're going to have to iterate over the rows that are in both and compare the "price" value to df1. You can find out which rows those are with

JavaScript

but after that, I think you’ll need to iterate.
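One possible way to do that comparison is sketched below. It is not the asker's code: the data and column names are hypothetical, and it swaps in an outer `merge` with `indicator=True` to find which key combinations exist in each file, followed by a row-wise function that compares prices. Missing siteid values are not handled here.

```python
import pandas as pd

keys = ["location", "siteid", "country"]

# Hypothetical old and new files; the real data is not shown in the question.
old = pd.DataFrame({
    "location": ["A", "A", "B"],
    "siteid": ["s1", "s1", "s2"],
    "country": ["DE", "FR", "US"],
    "price": [10, 20, 30],
})
new = pd.DataFrame({
    "location": ["A", "A", "B"],
    "siteid": ["s1", "s1", "s2"],
    "country": ["DE", "IT", "US"],
    "price": [10, 25, 35],
})

# The outer merge keeps every key combination; _merge records where it came from.
merged = old.merge(new, on=keys, how="outer",
                   suffixes=("_old", "_new"), indicator=True)

def row_status(row):
    """Classify one merged row; prices are compared only when both sides exist."""
    if row["_merge"] == "left_only":
        return "deleted"
    if row["_merge"] == "right_only":
        return "new"
    return "same" if row["price_old"] == row["price_new"] else "changed"

# apply with axis=1 walks the rows one by one, i.e. the iteration step.
merged["status"] = merged.apply(row_status, axis=1)
result = merged.drop(columns="_merge")

# Map each (location, siteid, country) key to its status for easy inspection.
status_by_key = dict(zip(result[keys].itertuples(index=False, name=None),
                         result["status"]))
```

`result.to_csv(...)` would then write the third file. Because the merge keys include both `location` and `siteid`, a country counts as new only for that specific pair.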

User contributions licensed under: CC BY-SA